Page 4 of 4
Now you know how XML works to describe data so that it can be exchanged, transferred and displayed in a completely universal way.
Of course we have just scratched the surface and things get even more interesting but detailed when we move into any one of the specific XML application areas. You not only have to deal with the ideas of XML but the additional specification that have been added. For a programmer using existing standards the best next step is to learn about the DOM API in what ever language you use the most.
A Short XML Glossary
An XML document contains pairs of opening and closing tags surrounding text that you can regard as the data. An opening tag is simply a word in angle brackets <start> and a closing tag identical but with the addition of a backslash. So the closing tag to <start> is </start>. It is also useful to know that you can’t include spaces within a tag.
Tags always occur in pairs unless you don’t need to include data between the tags, in which case you can indicate the closing of a tag by adding a backslash to the end. For example, <start/> is an opening tag and its own closing tag.
A valid XML document starts with a single tag and ends with a closing tag. All of the other tags within the document have to be nested between the opening and closing “top level” tags.
Tags can have “attributes” within them to record information about the type of data between the tags or to modify the interpretation of the tags. Attributes always take the form name=value and you can invent attributes just as freely as you can invent XML tags.
Schema and DTD
A Schema, and the older technology a DTD, is a document that describes the grammar of an XML document. If you provide a schema and a document to an XML-aware application then in most cases the application will be able to work out if the XML document is correct or contains errors, even though it might not “know” anything more about the way you are using XML.
One of the most mysterious parts of XML is the concept of a “name space”. The big problem with all systems that allow users to invent their own identifiers is that we tend to invent the same identifiers over and over again. For example, in many XML documents we are likely to invent a <name> tag but not all <name> tags are going to mean the same thing.
To avoid name clashes you can use a namespace declaration in the form of the xmlns attribute. To set a namespace for the entire document you might use something like:
Following this the namespace
applies to everything contained within <book></book>, unless of course an inner tag declares its own namespace.
Notice that the namespace is a URL. The only reason for this is that you are supposed to possess a unique URL so no-one else will use it. There is no sense in which the URL has to correspond to a relevant web page (although it can), it’s just a tricky way of getting a unique identifier.
With a namespace applied you can think of every name between the tags that it applies to as being prefixed with the namespace
e.g. http// http//www.mywebsite:name
If you want to make explicit the namespace an identifier belongs to then you can actually write it as a qualified name namespace_prefix:name. The namespace_prefix is supposed to be unique, so all the names now used in the document are unique.
XML in C#
Linq and XML