Introducing XML

You don’t have to be a programmer, or even particularly technical, to have encountered XML. It turns up in connection with new versions of most applications, which are now usually said to feature “improved XML functionality”. XML is also used alongside HTML on the Web, it is a common format for database files and queries and for many other types of file, and it is at the heart of many of the newer programming facilities such as web services and reusable components. In short, it’s everywhere!

Given that XML seems to be everywhere you might think that it is going to be complicated and a quick look at an XML file might well confirm this view. But the basic idea of XML is shockingly simple.

The fact that it leads on to things that look complicated is just an indication that it is a simple but powerful idea. It seems like a good idea to spend some time mastering it.

Tags, nothing but tags

Extensible Markup Language, to give XML its full title, is a way of indicating where different parts of a document start and end. For example, if you were keeping a list of your favourite books, you might well use something like:

 Title: Life of Pi
Author: Yann Martel
Publisher: Canongate

You are using the convention that a colon separates a “field name” that describes and identifies the data from the actual data. XML uses a different, but just as obvious, convention to do the same thing. It uses field names enclosed in “angle brackets”, or tags, and in this case the convention is that the actual data is between an opening and closing tag.

For example, the same book data in XML would be something like:

<Title> Life of Pi </Title>
<Author> Yann Martel </Author>
<Publisher> Canongate </Publisher>

You can see from this that for each opening tag there is a closing tag of the same name but starting with /. The beauty of this system is that the layout of the document doesn’t make any difference and data can include line breaks without any problem. For example, to a program that ignores the white space around the data, this version of the document means exactly the same thing as the previous one:

<Title>
Life of Pi
</Title>
<Author>
Yann Martel
</Author>
<Publisher>
Canongate
</Publisher>

The whole point is that XML can represent the structure in the data without it having to be laid out in any particular way.
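One way to check this claim is to hand both layouts to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is a choice made for this illustration and not something the article prescribes. It parses just the Title element in each layout. The helper name title_text is hypothetical. Note that the parser reports the white space around the data, so the sketch trims it with strip(), which is the sort of thing an application is expected to do for itself.

```python
import xml.etree.ElementTree as ET

# The same element written with two different layouts.
one_line = "<Title> Life of Pi </Title>"
multi_line = """<Title>
Life of Pi
</Title>"""

def title_text(xml_text):
    """Parse one element and return its data with surrounding white space removed."""
    element = ET.fromstring(xml_text)
    return element.text.strip()

print(title_text(one_line))
print(title_text(multi_line))
```

Both calls give the same data, even though the two strings are laid out differently.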

For example, you could read the XML document out letter by letter over a phone connection and the person at the other end could write it down as one long string of text. It still means the same thing and the ability to “serialise” XML documents makes it possible to store them on disk or transfer them byte-by-byte, or even bit-by-bit, over a network without any special processing.
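The byte-by-byte idea can be tried out directly. The following is a minimal sketch, again assuming Python's standard xml.etree.ElementTree module: it turns a one-element document into a sequence of bytes and then feeds those bytes to a parser one at a time, much as they might arrive over a slow network link. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

document = "<Title>Life of Pi</Title>"
data = document.encode("utf-8")  # the document as a plain sequence of bytes

# Feed the parser a single byte at a time, as if the data were trickling in.
parser = ET.XMLParser()
for byte in data:
    parser.feed(bytes([byte]))
received = parser.close()  # the top-level element, once all the data has arrived

print(received.tag, "=", received.text)
```

The parser needs no special framing or length information; the tags themselves tell it where each piece of data starts and ends.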

Nesting

Things can be a little more interesting than the example above suggests because you can use tags within the content of other tags. In particular, to be strictly correct, the XML example given earlier needs an outer pair of tags that enclose everything. That is:

<Books>
  <Title>
    Life of Pi
  </Title>
  <Author>
    Yann Martel
  </Author>
  <Publisher>
    Canongate
  </Publisher>
</Books>

Notice that indenting has been used to show clearly that all of the other tags are within the <Books></Books> pair. By now you already know that layout is irrelevant to the meaning of an XML document, but it helps to make it readable. An XML document always starts with a tag that encloses everything, a so-called “top level” tag.
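The need for a single enclosing tag is something a parser will enforce. As a rough illustration, assuming Python's standard xml.etree.ElementTree module, the sketch below tries to parse the book data with and without the outer Books pair. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

without_root = "<Title>Life of Pi</Title><Author>Yann Martel</Author>"
with_root = "<Books>" + without_root + "</Books>"

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(without_root))  # rejected: more than one top-level tag
print(is_well_formed(with_root))     # accepted: everything is inside <Books>
```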

You can repeat the “tag within tag” idea as often as you like and this is one of the many things that makes XML powerful. It also makes XML look more complicated than it is.

For example, a more complete record could be shown as:

<Books>
  <Title>
    Life of Pi
  </Title>
  <Author>
    Yann Martel
  </Author>
  <Production>
    <Format>
      Paperback
    </Format>
    <Pages>
      348
    </Pages>
    <ISBN>
      184195392X
    </ISBN>
  </Production>
  <Publisher>
    Canongate
  </Publisher>
</Books>

Although this looks much more complicated, you can see that it is just the basic principle of using opening and closing tags to enclose data or other tags.
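Nesting also gives a program a simple way to reach any item: follow the tag names inwards from the top-level tag. The sketch below assumes Python's standard xml.etree.ElementTree module and a compact layout of the same record; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

record = """<Books>
  <Title>Life of Pi</Title>
  <Author>Yann Martel</Author>
  <Production>
    <Format>Paperback</Format>
    <Pages>348</Pages>
    <ISBN>184195392X</ISBN>
  </Production>
  <Publisher>Canongate</Publisher>
</Books>"""

root = ET.fromstring(record)

# A path of tag names separated by "/" follows the nesting inwards.
pages = int(root.find("Production/Pages").text)
title = root.find("Title").text

# Iterating over an element visits the tags directly inside it.
children = [child.tag for child in root]

print(title, "has", pages, "pages")
print(children)
```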

What might surprise you is that if you type this XML into a document and save it under a suitable name, pie.xml say, then you will be able to load it into Internet Explorer and view the structure of the data much more clearly.

Figure 1 shows exactly what it looks like when loaded into Internet Explorer.

 

Figure 1: Viewing XML

The power of XML

There isn’t much more to XML than tags that surround data and you might well be thinking that this isn’t enough to warrant the fuss.

However, you should be able to see that XML can be used to give structure to any data that you care to imagine. You are free to invent any tags that you need and use them to build structures that your data fits.
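To show how little is involved in inventing your own tags, here is a small sketch that builds a document from scratch. It assumes Python's standard xml.etree.ElementTree module, and the tag names ShoppingList, Item, Name and Quantity are made up purely for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags, invented for this illustration only.
shopping = ET.Element("ShoppingList")
item = ET.SubElement(shopping, "Item")
ET.SubElement(item, "Name").text = "Milk"
ET.SubElement(item, "Quantity").text = "2"

# Turn the structure into XML text.
xml_text = ET.tostring(shopping, encoding="unicode")
print(xml_text)
```

Nothing had to be declared in advance; the structure is whatever the tags say it is.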

The resulting XML file should be understandable by another human and, crucially, it should be processed correctly by any program that understands XML. For example, even though Internet Explorer had no prior knowledge of the structure of our book data record it managed to do a reasonable job of displaying it.
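The same kind of generic processing is easy to sketch in code. The function below, a hypothetical helper called outline, knows nothing about books or any other particular tags, yet it can list the structure of whatever XML it is given. It assumes Python's standard xml.etree.ElementTree module and is only a rough stand-in for what a browser's XML view does.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return indented lines describing an element and everything nested in it."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

record = "<Books><Title>Life of Pi</Title><Production><Pages>348</Pages></Production></Books>"
print("\n".join(outline(ET.fromstring(record))))
```

Because the function relies only on the rules of XML, it would work just as well on a document that uses a completely different set of tags.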

All of this is fairly impressive, but to see what the real potential is you have to imagine how things will be in the future. Imagine that all of the data in the world were recorded using XML. Now you can begin to see the reason for the excitement. XML makes it possible for data to be generated by one system and consumed by another. It allows the web to change from a presentation medium to a source of data.

All that we need to make this come true is to base everything that we do on XML and this is what most of the new facilities built into products such as Office, SQL Server and, of course, .NET are all about.

It is worth saying that the majority of XML documents are likely to have been produced automatically by an application and consumed automatically by another application. XML may be human readable but that doesn’t mean that humans are always the source or the intended destination for XML documents.

In most cases XML will lurk in the background, making some desirable facility work without you really being aware of why or how.

Copyright © 2012 i-programmer.info. All Rights Reserved.