XML Basics

XML is not a programming language: it does not create applications, and you can write it in any text editor. XML is also not a database that stores data. XML is a structured document format that carries data and metadata. Metadata is information about the data.

XML uses markup tags to self-describe the data and metadata. An opening tag and a closing tag enclose the text information. The pair of tags together with the enclosed text forms an XML element.

For example,

<dog>Fraser</dog>

The XML tag is self-describing: Fraser is the name of the dog.
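
A program can read a self-describing element directly. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The element name dog is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: the tag name "dog" is assumed for illustration.
element = ET.fromstring("<dog>Fraser</dog>")

tag_name = element.tag   # the markup that describes the data
content = element.text   # the data itself

print(tag_name, content)  # prints: dog Fraser
```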

Example XML document

<?xml version="1.0"?>
<Disk_space></Disk_space>

Output in Browser

Save the above document as file.xml and open the file in a browser. The XML is displayed as shown in the following figure.

Figure 1 – Output XML Document

Difference between HTML and XML

XML and HTML differ in many ways even though both are markup languages. The differences between XML and HTML are listed below.

  1. Purpose: HTML presents information using structure (<h1>, <p>) and appearance (<font>, <b>). XML stores and shares data, and it also describes the content using metadata.
  2. Tags: HTML comes with predetermined tags. In XML you can create your own tags.
  3. Strictness: HTML is lenient about syntax, and only XHTML follows strict rules. An XML document must be well-formed, and to be valid it must also be validated against a DTD or Schema.
  4. Audience: HTML is meant mainly for display to humans. XML is used by both humans and machines, and applications can exchange data using the XML document format.
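
The well-formedness requirement is something a parser can check. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical, and the sketch checks only well-formedness, not validity against a DTD or Schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<Car>Toyota</Car>"))   # matching tags: True
print(is_well_formed("<Car>Toyota</car>"))   # tag names are case-sensitive: False
print(is_well_formed("<p>one<br>two</p>"))   # unclosed tag, tolerated in HTML: False
```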

XML Document Structure

An XML document such as the example we saw earlier can contain the following sections.

  1. XML Declaration
  2. Document type declaration (<!DOCTYPE>)
  3. Element data
  4. Attribute data
  5. XML content

Now, we will describe each one of them briefly.

XML Declaration

Many document formats look similar to XML. How do we identify which one is XML? The XML declaration describes the type of document we are using. The information is contained within the <?xml … ?> tag. For example, <?xml version="1.0"?> means we are using an XML document and its version is "1.0". For more information, see the previous lesson about the XML declaration.

The XML declaration contains the following components.

Component of XML declaration – Meaning
<?xml – Marks the beginning of the XML declaration.
version="xxx" – States the version of XML used in the document.
encoding="xxx" – States the character encoding used in the XML document.
standalone="xxx" – Can be "yes" or "no". The value "yes" means the document does not depend on external markup declarations. The value "no" means it may depend on them, for example on an external DTD file.
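
To see these components from a program's point of view, here is a minimal sketch in Python using the standard xml.parsers.expat module, which reports the declaration through its XmlDeclHandler callback. The helper name read_declaration and the sample document are assumptions made for illustration.

```python
import xml.parsers.expat

def read_declaration(xml_bytes):
    """Return the version, encoding and standalone values of the XML declaration."""
    found = {}

    def on_declaration(version, encoding, standalone):
        found["version"] = version
        found["encoding"] = encoding
        # expat reports 1 for "yes", 0 for "no", and -1 when standalone is omitted
        found["standalone"] = standalone

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_bytes, True)
    return found

# Hypothetical sample document
document = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Disk_space></Disk_space>'
info = read_declaration(document)
print(info)
```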

XML Document Type Declaration

An XML document is well-formed when it is syntactically correct. A well-formed XML document can include a document type declaration (DOCTYPE), although it does not require one.

A well-formed and valid XML document must include a document type declaration (DOCTYPE) with two pieces of information.

1. Root element name

2. Path to an external DTD file, or an internal DTD

DTD stands for document type definition. It defines constraints on an XML document and is used to confirm the validity of the document. The main purpose of a DTD is to let an XML parser validate the document instance against a document model, which is called validity checking. This is completely optional; we will learn about DTD later.

Figure 1 – XML Document Type Declaration Structure

The general syntax for document type declaration is given below.

<!DOCTYPE name_of_root_element SYSTEM "uri_of_the_dtd_file" [ internal DTD declarations ]>

<!DOCTYPE – the declaration begins with the <!DOCTYPE string, with no space after the exclamation mark.

name_of_root_element – next comes the name of the root element of the XML document.

uri_of_the_dtd_file – whether you are using a local DTD file or an external DTD file located on the internet, you should give the URI of the file.

Internal DTD declarations – the internal DTD declarations are enclosed between opening and closing square brackets ([ ]). They are either DTD declarations or entity declarations. The internal declarations can add new declarations to those in the external DTD file or override declarations made in that file.

For example

<!DOCTYPE student SYSTEM "student.dtd">

Here student.dtd is a non-public DTD file, which is what the SYSTEM keyword indicates.


An internal DTD is written inside the declaration itself:

<!DOCTYPE name_of_root_element [ ]>

You can put your DTD declarations inside the square brackets.

For example

<!DOCTYPE student [
<!-- your DTD declarations -->
]>
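
To show how a parser sees the two forms of the document type declaration, here is a minimal sketch in Python using the standard xml.parsers.expat module. Note that expat does not validate the document; the sketch only reads the declaration. The helper name read_doctype and the <!ELEMENT> line in the internal DTD are assumptions made for illustration.

```python
import xml.parsers.expat

def read_doctype(xml_text):
    """Return the root name, DTD location and internal-subset flag of a DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["root"] = name
        found["system_id"] = system_id   # None when no external DTD is named
        found["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found

# External DTD file, as in the student.dtd example
external = read_doctype('<!DOCTYPE student SYSTEM "student.dtd"><student></student>')

# Internal DTD between square brackets (hypothetical declaration)
internal = read_doctype('<!DOCTYPE student [<!ELEMENT student (#PCDATA)>]><student></student>')

print(external)
print(internal)
```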


Element Data

An element holds the content of the XML document. A matching pair of opening and closing tags, together with the content between them, is called an element. Some elements are written as a single self-closing tag, which means there is no pair of tags.

Elements make up the main body of the XML document, and they take different forms. Some elements contain other elements (nested elements) and text, while others contain only text information.

Figure 2 – XML Element with content

For example,

<Car> Toyota </Car> is an element with Toyota as the text content. Tag names are case-sensitive, so the opening and closing tags must match exactly. Elements can also be nested.

<Car>
<Model> ET900</Model>

<Price> </Price>
</Car>

Now Car has sub-elements. There could be more levels of nesting like this, and there is no restriction on depth. Note that Price is empty; XML makes no distinction between empty and non-empty elements.
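
The nesting can also be walked from a program. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module, assuming Model and Price are nested inside a Car element as described above.

```python
import xml.etree.ElementTree as ET

car = ET.fromstring("<Car><Model> ET900</Model><Price> </Price></Car>")

# Iterating over an element yields its direct sub-elements.
child_tags = [child.tag for child in car]

# Text content keeps its surrounding whitespace, so strip it before use.
model = car.find("Model").text.strip()

# Price holds only a space; "or" guards against None for a truly empty element.
price = (car.find("Price").text or "").strip()

print(child_tags, repr(model), repr(price))
```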


You can attach additional information to an element inside its opening tag. This information is called an attribute.

For example,

<price currency="USD"></price>
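
Reading the attribute back from a program could look like this. It is a minimal sketch in Python using the standard xml.etree.ElementTree module; the attribute name unit is hypothetical and is used only to show what happens when an attribute is absent.

```python
import xml.etree.ElementTree as ET

price = ET.fromstring('<price currency="USD"></price>')

currency = price.get("currency")     # value of a single attribute
all_attributes = dict(price.attrib)  # every attribute as a dictionary
missing = price.get("unit")          # hypothetical attribute that is absent

print(currency, all_attributes, missing)  # prints: USD {'currency': 'USD'} None
```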

XML Contents

An XML document can contain any type of content as long as it is valid according to the XML metadata information. There is no fixed limit on the amount of content; a document could hold hundreds of megabytes of information.
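
For very large documents, a program does not have to load everything into memory at once. The following is a minimal sketch in Python using the iterparse function of the standard xml.etree.ElementTree module, which processes the document piece by piece. The helper name count_elements and the generated Cars document are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

def count_elements(stream, tag):
    """Count elements named tag while reading the XML stream incrementally."""
    count = 0
    for event, element in ET.iterparse(stream, events=("end",)):
        if element.tag == tag:
            count += 1
        # Drop the element's text and children once it has been processed.
        element.clear()
    return count

# Hypothetical document generated in memory; a real one would be a file on disk.
data = io.BytesIO(b"<Cars>" + b"<Car>Toyota</Car>" * 1000 + b"</Cars>")
total = count_elements(data, "Car")
print(total)  # prints: 1000
```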

