Programmer's Introduction to XML
Written by Ian Elliot   
Thursday, 28 December 2017

Where Next?

Now you know how XML works to describe data so that it can be exchanged, transferred and displayed in a completely universal way.

Of course, we have only scratched the surface. Things get more interesting, and more detailed, when you move into any one of the specific XML application areas, where you have to deal not only with the ideas of XML itself but also with the additional specifications that have been built on top of it. For a programmer using existing standards, the best next step is to learn about the DOM API in whatever language you use the most.
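As a taste of what working with the DOM looks like, here is a minimal sketch using Python's standard xml.dom.minidom module. The sample document and the helper name child_text are invented for illustration; other languages expose the same DOM ideas under slightly different names.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
SAMPLE = "<book><title>XML Basics</title><author>A. Writer</author></book>"

def child_text(xml_text, tag_name):
    """Return the text inside the first element called tag_name."""
    dom = minidom.parseString(xml_text)
    element = dom.getElementsByTagName(tag_name)[0]
    # In the DOM the data is held in a text node that is a child of the element.
    return element.firstChild.data

print(child_text(SAMPLE, "title"))   # XML Basics
```

The point to notice is that the DOM presents the document as a tree of nodes, so the data between two tags is itself a node that you navigate to.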


A Short XML Glossary


An XML document contains pairs of opening and closing tags surrounding text that you can regard as the data. An opening tag is simply a name in angle brackets, such as <start>, and a closing tag is identical but with a forward slash added after the opening bracket. So the closing tag to <start> is </start>. It is also useful to know that you can’t include spaces within a tag name.

Tags always occur in pairs unless you don’t need to include data between the tags, in which case you can close the tag by adding a forward slash just before its closing angle bracket. For example, <start/> is an opening tag and its own closing tag.
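A quick way to convince yourself that the two forms are equivalent is to parse them. This sketch uses Python's standard xml.etree.ElementTree module; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

paired = ET.fromstring("<start>some data</start>")
empty_pair = ET.fromstring("<start></start>")
self_closing = ET.fromstring("<start/>")

# All three parse to an element named "start"; only the first carries data.
print(paired.tag, repr(paired.text))              # start 'some data'
print(empty_pair.tag, repr(empty_pair.text))      # start None
print(self_closing.tag, repr(self_closing.text))  # start None
```

As far as the parser is concerned, <start></start> and <start/> describe the same empty element.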

XML Documents

A well-formed XML document starts with a single opening tag and ends with the matching closing tag. All of the other tags within the document have to be nested between the opening and closing “top level” tags.
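Any XML parser enforces these rules for you. The following sketch, again using Python's xml.etree.ElementTree, wraps the parser in a hypothetical helper called is_well_formed and tries it on a few invented fragments.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One top-level element with everything nested inside it.
print(is_well_formed("<library><book/><book/></library>"))   # True
# Two top-level elements, so there is no single enclosing pair of tags.
print(is_well_formed("<book/><book/>"))                      # False
# Tags that overlap instead of nesting.
print(is_well_formed("<library><book></library></book>"))    # False
```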


Tags can have “attributes” within them to record information about the type of data between the tags or to modify the interpretation of the tags. Attributes always take the form name="value", with the value in quotes, and you can invent attributes just as freely as you can invent XML tags.
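Here is a small sketch of how attributes appear to a program, using Python's xml.etree.ElementTree. The <price> element, its attribute names and the helper read_attributes are all invented for the example.

```python
import xml.etree.ElementTree as ET

def read_attributes(xml_text):
    """Parse a single element and return its attributes as a plain dict."""
    return dict(ET.fromstring(xml_text).attrib)

attributes = read_attributes('<price currency="GBP" taxed="yes">9.99</price>')
print(attributes)                          # {'currency': 'GBP', 'taxed': 'yes'}
print(attributes.get("currency"))          # GBP
print(attributes.get("discount", "none"))  # none

# Attribute values must be quoted; the parser rejects an unquoted value.
try:
    read_attributes("<price currency=GBP>9.99</price>")
except ET.ParseError as error:
    print("rejected:", error)
```

Notice that attribute values always arrive as strings; it is up to the program to decide what they mean.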

Schema and DTD

A Schema, and the older technology a DTD, is a document that describes the grammar of an XML document. If you provide a schema and a document to an XML-aware application then in most cases the application will be able to work out if the XML document is correct or contains errors, even though it might not “know” anything more about the way you are using XML.

Name Spaces

One of the most mysterious parts of XML is the concept of a “name space”. The big problem with all systems that allow users to invent their own identifiers is that we tend to invent the same identifiers over and over again. For example, in many XML documents we are likely to invent a <name> tag but not all <name> tags are going to mean the same thing.

To avoid name clashes you can use a namespace declaration in the form of the xmlns attribute. To set a namespace for the entire document you might use something like:

 <book xmlns="http://www.mywebsite">

Following this the namespace

"http://www.mywebsite"

applies to everything contained within <book></book>, unless of course an inner tag declares its own namespace.

Notice that the namespace is a URL. The only reason for this is that you are supposed to possess a unique URL so no one else will use it. There is no sense in which the URL has to correspond to a relevant web page (although it can); it’s just a tricky way of getting a unique identifier.

With a namespace applied you can think of every name between the tags that it applies to as being prefixed with the namespace

e.g. http://www.mywebsite:name

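If you want to make explicit the namespace an identifier belongs to then you can write it as a qualified name, namespace_prefix:name. The namespace_prefix is a short name that you bind to the namespace URL with an attribute of the form xmlns:namespace_prefix, and because the URL behind it is supposed to be unique, all the names now used in the document are unique.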
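To see how a parser treats namespaced names, here is a sketch using Python's xml.etree.ElementTree. It uses the book namespace from above, written as http://www.mywebsite, and adds a second, purely hypothetical namespace http://www.example.com/people bound to the prefix p. The element content is also invented.

```python
import xml.etree.ElementTree as ET

# Two different <name> elements kept apart by their namespaces.
DOC = (
    '<book xmlns="http://www.mywebsite" '
    'xmlns:p="http://www.example.com/people">'
    "<name>An XML Book</name>"
    "<p:name>A. Writer</p:name>"
    "</book>"
)

root = ET.fromstring(DOC)

# ElementTree reports every name as {namespace}name, prefix or no prefix.
print(root.tag)                        # {http://www.mywebsite}book
print([child.tag for child in root])

# When searching you supply your own prefix-to-namespace mapping.
people = {"p": "http://www.example.com/people"}
print(root.find("p:name", people).text)   # A. Writer
```

The prefix itself is only shorthand. What makes the two <name> tags different is the namespace URL that each one resolves to.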



Related Articles

XML in C#

Linq and XML







Last Updated ( Thursday, 28 December 2017 )