HTML - is it really going to hack it?

Written by Ian Elliot

Friday, 23 September 2011

We focus on the limitations of HTML when it comes to programming and discover that it isn't there yet - by quite a long way.

Microsoft's Windows 8 Metro puts HTML5 in direct competition with XAML, and it highlights the deficiencies in HTML and the whole web development environment. Put simply, you can now choose to develop Metro apps with either C# and XAML or JavaScript and HTML5. The ability to pick the tools to do the job focuses the mind on their differences - and HTML, whether 5 or any other number, doesn't look too good close up.

It is part of the legacy of HTML as a pure markup language that it isn't really suitable as a way of providing a GUI in conjunction with a programming language. Not even HTML5 can do the things that you can do with, say, XAML or MXML.

The reason is that "proper" programming environments started out with the programming language and added a code-based framework to create the UI. That is, when you want a button to appear you create a Button object, set its properties and add it to the display surface. Something like:

Button mybutton = new Button();
mybutton.Width = 100;
// etc.

The logical way to provide such a language with a declarative layout language is simply to create a declarative object instantiation language. For example, in XAML you might write something like:

<Button>
</Button>

to create an instance of the Button class. Notice that the creation of the Button object doesn't add anything to the code - it is just a shorthand for "create this button". Also notice that you don't have to accept the XML-like syntax of XAML - but it has big advantages.

Object instantiation languages also generally allow you to set properties of objects and even, to a limited extent, call methods to set properties. How you indicate the setting of properties is a matter of the syntax you want to invent, but XAML and others make use of XML attributes. For example:

<Button Width="100">
</Button>

creates the Button object and initializes its Width property to 100. That is, it is exactly the equivalent of the code given earlier. Attributes can also make the connection between the objects that are created and procedural code in other files. For example, you can specify the Name attribute as in:

<Button Name="mybutton" Width="100">
</Button>

Code written elsewhere can now use the identifier mybutton, safe in the knowledge that the instance with that name will be created by the instantiation language before the code ever gets to run. The IDE can also read the XAML and check that references in the code are valid.

The fact that XML is good at describing tree-like hierarchical structures is also put to use. After all, many properties have objects as their "value", so nesting object creation makes it possible to create complex objects with sub-objects and so on. You can elaborate this basic scheme with "small languages" that allow the programmer to do other "wiring up" jobs such as attaching resources, data binding and even event handling.

The point is that an object instantiation language like XAML doesn't do anything that "damages" or reduces the effectiveness of the code.

It simply moves a whole chunk of what looks like declarative code into a declarative language.
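To make the idea concrete, here is a minimal sketch of an object instantiation language written in JavaScript. It is not how XAML is implemented; the Button and Panel classes, the registry and the instantiate function are all hypothetical names invented for illustration, and a plain object literal stands in for the XML syntax. It shows attributes turning into property assignments, nesting turning into sub-objects, and a name attribute making an instance available to other code.

```javascript
// Hypothetical UI classes standing in for a real framework's controls.
class Button {
  constructor() {
    this.width = 0;
  }
}

class Panel {
  constructor() {
    this.children = [];
  }
}

// Hypothetical registry mapping type names in the description to classes.
const registry = { Button, Panel };

// Walk a declarative description and create the objects it names.
// `names` collects every instance that was given a name.
function instantiate(node, names = {}) {
  const Ctor = registry[node.type];
  if (!Ctor) {
    throw new Error(`Unknown type: ${node.type}`);
  }
  const obj = new Ctor();

  // Attributes become plain property assignments, like mybutton.width = 100.
  for (const [key, value] of Object.entries(node.props || {})) {
    if (key === "name") {
      names[value] = obj;
    } else {
      obj[key] = value;
    }
  }

  // Nested descriptions become sub-objects of a container.
  for (const child of node.children || []) {
    if (!Array.isArray(obj.children)) {
      throw new Error(`${node.type} cannot contain children`);
    }
    obj.children.push(instantiate(child, names).root);
  }
  return { root: obj, names };
}

// The declarative description: a Panel containing a named Button.
const description = {
  type: "Panel",
  children: [{ type: "Button", props: { name: "mybutton", width: 100 } }],
};

const { root, names } = instantiate(description);
console.log(names.mybutton instanceof Button); // true
console.log(names.mybutton.width); // 100
console.log(root.children[0] === names.mybutton); // true
```

The objects that come out are ordinary instances of ordinary classes, so the rest of the program can treat them exactly as if they had been created in code.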

Now consider HTML. It was invented with no thought of programming in mind. In fact, it just had markup tags like <H1> to signify a heading, and in this sense it was entirely declarative. The only sort of activity in an HTML page was the occasional hyperlink that simply turned the pages into a linked graph. Only later were seemingly more active tags such as <INPUT> created. This was just a little before JavaScript was created, and originally all form data was sent back to the server to be processed. Even today, the isolated button outside of a form tag doesn't really have first-class status as something that should appear in an HTML page on its own.

The point here is that the "button" tag:

 <INPUT Type="button"...>

wasn't really introduced as a way to create a button object for a programming language to work with.

HTML was not, and still isn't, an object-instantiation language.

The big difference is that HTML isn't integrated with a programming language. It isn't just read in and used to instantiate objects in the programming language's environment. It uses a layout engine to read the tags and perform operations on a display surface. Unlike a programming environment, there is no direct access to the display surface - it belongs to the layout engine. So the HTML is read in and rendered to the display surface by the layout engine and no objects are created or harmed in this process.

Then along came JavaScript and we needed a way to interact with the "page", i.e. the graphics on the display surface. The solution, a natural one as JavaScript is object oriented, was to derive an object hierarchy from the HTML tags used and the way that they are laid out. This is a complicated business because it doesn't just depend on the static tags but on the layout engine and the way it processes them. The object hierarchy is of course the DOM, Document Object Model, which was introduced along with JavaScript in 1996.

DOM objects are not JavaScript objects and this causes a problem when it comes to more advanced web programming, in particular when you start to try to extend the components available to build the UI. The problem is that as HTML isn't an object instantiation language you can't add new tags. Also, as the DOM is separate from the JavaScript object hierarchy, it can't be extended in a native way. That is, you can't add a JavaScript object to the DOM hierarchy. Finally, as only the layout engine has access to the display surface, you can't draw directly on it.

HTML5 hasn't done anything to address any of these objections. About the only thing it has done is to bring us a component - the Canvas element - that we can draw on. This can be used to create new components, as can CSS via customization of the <a> element. Even so, this isn't natural from a code point of view and it isn't easy.
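As a rough illustration of why a Canvas-based component is hard work, the following sketch draws a button-like rectangle and does its own hit testing. The drawButton and hitTest functions and the shape of the button object are assumptions made up for this example; the ctx argument is assumed to be a 2D drawing context such as the one returned by canvas.getContext("2d") in a browser.

```javascript
// Draw a button-like rectangle with a centered label.
// `ctx` is assumed to be a CanvasRenderingContext2D (or compatible object).
// `button` is a hypothetical plain object: { x, y, width, height, label }.
function drawButton(ctx, button) {
  ctx.fillStyle = "#dddddd";
  ctx.fillRect(button.x, button.y, button.width, button.height);
  ctx.strokeStyle = "#333333";
  ctx.strokeRect(button.x, button.y, button.width, button.height);
  ctx.fillStyle = "#000000";
  ctx.font = "14px sans-serif";
  ctx.textAlign = "center";
  ctx.textBaseline = "middle";
  ctx.fillText(
    button.label,
    button.x + button.width / 2,
    button.y + button.height / 2
  );
}

// The canvas only holds pixels, so deciding whether a click landed on
// the "button" has to be done by hand.
function hitTest(button, px, py) {
  return (
    px >= button.x &&
    px < button.x + button.width &&
    py >= button.y &&
    py < button.y + button.height
  );
}
```

In a page you would call drawButton once with the canvas context and then call hitTest from a click handler using the event's coordinates. Focus, keyboard input, disabled states and accessibility would all need to be built by hand as well, none of which is attempted here.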

There are moves by the W3C, and the WHATWG in particular, to create a system of Web Components. At the moment the only operational technology we have for this is Mozilla's XBL, which is specific to the Gecko layout engine. The Component Model group doesn't even seem to have decided which way to go in building something - to adopt XBL or to invent something new. Only time will tell if it has a good idea, or even if the idea will become anything in the real world.

The key difference between HTML and, say, XAML is that HTML cannot be extended to create anything new. For example, try adding a new layout manager to HTML. You can't. You can assemble new components as long as they are simply aggregates of what already exists within HTML, but you can't start from scratch and build something really new, and this is fundamentally more restrictive than an object-instantiation language.
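The kind of "new component" that is possible can be sketched in a few lines of JavaScript. The function name createLabeledInput and its parameters are hypothetical; doc is assumed to be the browser's document object, or anything offering the same createElement interface.

```javascript
// Build a "labeled input" purely by aggregating elements HTML already has.
// `doc` is assumed to behave like the browser's `document` object.
function createLabeledInput(doc, id, labelText) {
  const wrapper = doc.createElement("div");

  const label = doc.createElement("label");
  label.setAttribute("for", id);
  label.textContent = labelText;

  const input = doc.createElement("input");
  input.setAttribute("type", "text");
  input.setAttribute("id", id);

  wrapper.appendChild(label);
  wrapper.appendChild(input);
  return wrapper;
}
```

The result is still only a div containing a label and an input. Nothing in it is a new kind of element, and how it is laid out remains entirely up to the layout engine.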

In the world of HTML5, code is still a second-class citizen trying to get to the top of the pile.

