Programmer's Introduction to XML
Written by Ian Elliot   
Thursday, 28 December 2017

Where Next?

Now you know how XML works to describe data so that it can be exchanged, transferred and displayed in a completely universal way.

Of course, we have only scratched the surface, and things get more interesting, but also more detailed, when you move into any one of the specific XML application areas. You have to deal not only with the ideas of XML but also with the additional specifications that have been built on top of it. For a programmer using existing standards, the best next step is to learn about the DOM API in whatever language you use the most.
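As a taste of what working with the DOM API looks like, here is a minimal sketch using Python's standard xml.dom.minidom module. The <library> and <book> tags and their contents are invented for illustration; other languages expose much the same calls under similar names.

```python
from xml.dom import minidom

# Hypothetical document, invented purely for illustration.
XML_TEXT = "<library><book>XML Basics</book><book>More XML</book></library>"

# Parse the text into a DOM tree held in memory.
dom = minidom.parseString(XML_TEXT)

# getElementsByTagName returns every matching element in document order.
# Each <book> element has a single text node as its first child.
titles = [node.firstChild.data for node in dom.getElementsByTagName("book")]

print(titles)
```

The point to notice is that the program never scans the angle brackets itself; it asks the tree for elements by name and reads the text nodes beneath them.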


A Short XML Glossary


An XML document contains pairs of opening and closing tags surrounding text that you can regard as the data. An opening tag is simply a name in angle brackets, such as <start>, and a closing tag is identical except for a forward slash placed just after the opening bracket. So the closing tag to <start> is </start>. It is also useful to know that you can’t include spaces within a tag name.

Tags always occur in pairs unless you don’t need to include data between the tags, in which case you can indicate the closing of a tag by adding a forward slash just before its closing angle bracket. For example, <start/> is an opening tag and its own closing tag.
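The equivalence between an empty pair of tags and the self-closing form can be checked with a parser. The following is a small sketch using Python's standard xml.etree.ElementTree module; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A normal pair of tags with data in between.
paired = ET.fromstring("<start>some data</start>")

# An empty pair and the self-closing shorthand.
empty_pair = ET.fromstring("<start></start>")
self_closing = ET.fromstring("<start/>")

print(paired.tag, "->", paired.text)

# Once parsed, the two empty forms are indistinguishable:
# same tag name, no text, no child elements.
is_equivalent = (
    empty_pair.tag == self_closing.tag
    and (empty_pair.text or "") == (self_closing.text or "")
    and len(empty_pair) == len(self_closing) == 0
)
print(is_equivalent)
```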

XML Documents

A well-formed XML document starts with a single opening tag and ends with the matching closing tag. All of the other tags within the document have to be nested between these opening and closing “top level” tags.
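A parser will refuse a document that breaks these rules. The sketch below, again using Python's xml.etree.ElementTree, wraps the parser in a hypothetical helper called is_well_formed and tries it on three invented documents.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One top-level tag with everything else nested inside it.
single_root = "<book><title>XML</title></book>"

# Two top-level tags: rejected, there must be exactly one.
two_roots = "<book></book><book></book>"

# Tags closed in the wrong order: rejected, nesting must be strict.
bad_nesting = "<book><title>XML</book></title>"

for sample in (single_root, two_roots, bad_nesting):
    print(is_well_formed(sample), sample)
```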


Tags can have “attributes” within them to record information about the type of data between the tags or to modify the interpretation of the tags. Attributes are written inside the opening tag and always take the form name="value", with the value in quotes, and you can invent attributes just as freely as you can invent XML tags.
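From a program's point of view, attributes behave like a small dictionary attached to each tag. Here is a brief sketch with Python's xml.etree.ElementTree; the format and pages attributes are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag with two made-up attributes.
book = ET.fromstring('<book format="paperback" pages="320">XML Basics</book>')

# All attributes are available as a dictionary of strings.
print(book.attrib)

# get() looks up a single attribute and returns None if it is absent.
pages = book.get("pages")
isbn = book.get("isbn")
print(pages, isbn)
```

Note that attribute values always come back as text, so "320" has to be converted explicitly if you want a number.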

Schema and DTD

A Schema, or a DTD, which is the older technology, is a document that describes the grammar of an XML document. If you provide a schema and a document to an XML-aware application then in most cases the application will be able to work out if the XML document is correct or contains errors, even though it might not “know” anything more about the way you are using XML.

Name Spaces

One of the most mysterious parts of XML is the concept of a “name space”. The big problem with all systems that allow users to invent their own identifiers is that we tend to invent the same identifiers over and over again. For example, in many XML documents we are likely to invent a <name> tag but not all <name> tags are going to mean the same thing.

To avoid name clashes you can use a namespace declaration in the form of the xmlns attribute. To set a namespace for the entire document you might use something like:

 <book xmlns="http://www.mywebsite">

Following this the namespace

"http://www.mywebsite"

applies to everything contained within <book></book>, unless of course an inner tag declares its own namespace.
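You can see this scoping at work by asking a parser for the full name of each tag. The sketch below uses Python's xml.etree.ElementTree, which reports names in the form {namespace}name. The <title> and <note> tags and the second namespace, http://www.example.org/notes, are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# The outer tag sets a default namespace; the inner <note> overrides it.
XML_TEXT = """<book xmlns="http://www.mywebsite">
  <title>XML Basics</title>
  <note xmlns="http://www.example.org/notes">draft</note>
</book>"""

root = ET.fromstring(XML_TEXT)

# iter() walks the tree in document order, starting with the root itself.
tags = [element.tag for element in root.iter()]

for tag in tags:
    print(tag)
# {http://www.mywebsite}book
# {http://www.mywebsite}title
# {http://www.example.org/notes}note
```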

Notice that the namespace is a URL. The only reason for this is that you are supposed to possess a unique URL so no-one else will use it. There is no sense in which the URL has to correspond to a relevant web page (although it can), it’s just a tricky way of getting a unique identifier.

With a namespace applied you can think of every name between the tags that it applies to as being prefixed with the namespace

e.g. http://www.mywebsite:name

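If you want to make explicit the namespace an identifier belongs to then you can actually write it as a qualified name, namespace_prefix:name, where the prefix is bound to a namespace by a declaration of the form xmlns:namespace_prefix="http://www.mywebsite". The namespace that the prefix stands for is supposed to be unique, so all the names now used in the document are unique.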
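Qualified names let two different <name> tags live side by side in one document. The following sketch uses Python's xml.etree.ElementTree; the prefixes bk and p, the second namespace http://www.example.org/people, and the tag contents are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two prefixes, each bound to its own namespace on the top-level tag.
XML_TEXT = """<bk:book xmlns:bk="http://www.mywebsite"
         xmlns:p="http://www.example.org/people">
  <bk:name>XML Basics</bk:name>
  <p:name>A. Author</p:name>
</bk:book>"""

root = ET.fromstring(XML_TEXT)

# The prefixes used in a query are mapped to namespaces by this dictionary.
# They do not have to match the prefixes in the document; only the
# namespace each one stands for matters.
ns = {"bk": "http://www.mywebsite", "p": "http://www.example.org/people"}

book_name = root.find("bk:name", ns).text
person_name = root.find("p:name", ns).text

print(book_name)
print(person_name)
```

Both tags are called name, but because each belongs to a different namespace the parser never confuses them.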



Related Articles

XML in C#

Linq and XML







Last Updated ( Thursday, 28 December 2017 )