HTML is something you either love or you hate. And while there is no in-between, there is room for both. If you are technically minded, and have a tendency to take things seriously, love and hate collide between angle-brackets.

Who would use the less-than and greater-than signs as brackets, when computers already have three types of brackets?

We at Kill the World understand that History and Legacy are one. Let's get rid of it all and start over, shall we?

Egg-SGML: an interface between programmers and web developers

In the turmoil which followed the discovery that HTML was based on a standard which was not freely available, a few things seem to have gotten lost.

XML achieved two things: standardizing the underlying structure of HTML, and formalizing the document tree, something which SGML had brought about but was rather lax about.

However, the requirement of compatibility meant that web browsers continued, and still continue, to accept as input whatever they can make some sense of. The introduction of HTML5 was a declaration of war against the strictness of XML.

Egg-SGML contains a parser which is similar to a web browser's, but allows us to expand on the document tree. This means that a reasonably competent web developer can fully control the look of a dynamic website without resorting to program code. No constraints are placed on the HTML or CSS.

Let me draw you a picture

As you know, parsing an HTML file builds a tree structure in memory: the document tree. The reverse of that process is a depth-first tree search, or more precisely a depth-first traversal: visiting every node of a document tree in this order and writing out its tags outputs the document in HTML format. We now assume that the server is processing every HTML file in this fashion.
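
To make the reverse process concrete, here is a minimal sketch in Python (not the project's PHP code). The Node class and the serialize function are hypothetical names invented for this illustration, and text escaping is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the document tree (hypothetical structure)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node or str items


def serialize(node):
    """Depth-first walk that writes the tree back out as HTML."""
    if isinstance(node, str):
        return node  # text node; escaping is omitted in this sketch
    attrs = "".join(f' {key}="{value}"' for key, value in node.attrs.items())
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


doc = Node("html", children=[
    Node("body", children=[
        Node("p", {"class": "intro"}, ["Hello"]),
    ]),
])
print(serialize(doc))  # <html><body><p class="intro">Hello</p></body></html>
```

Each opening tag is written before the children are visited and each closing tag after them, which is why the output is always properly nested.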

The first thing that is likely to occur to someone who has built plain HTML websites is the addition of an <include> tag. But this is not enough: often one is not able to separate headers and footers from the document easily, and the included file must be a complete document tree (in other words we cannot have <body> in one file and </body> in another).

What we can easily do is provide tags which store a subtree for later use, which we call <record> and <play>. When the <record> tag is encountered, the server sets its entire subtree aside and skips over it, so that nothing is output at that point. <play> then adds this subtree back, and processes it as usual. If we use consistent identifiers in the files which include a common file, the common file will adapt wherever <play> occurs.
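
A rough Python sketch of how <record> and <play> might behave during the walk. It is an illustration under assumptions, not the Egg-SGML source: the id attribute used as the identifier, the Node class, and the render function are all hypothetical, and attributes on ordinary tags are dropped for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the document tree (hypothetical structure)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node or str items


def render(node, recorded):
    """Depth-first walk that honours <record> and <play>."""
    if isinstance(node, str):
        return node
    if node.tag == "record":
        # Store the subtree under its identifier and emit nothing now.
        recorded[node.attrs["id"]] = node.children
        return ""
    if node.tag == "play":
        # Add the stored subtree back and process it as usual.
        stored = recorded.get(node.attrs["id"], [])
        return "".join(render(child, recorded) for child in stored)
    inner = "".join(render(child, recorded) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


page = Node("html", children=[
    Node("record", {"id": "title"}, ["About us"]),
    Node("body", children=[
        Node("h1", children=[Node("play", {"id": "title"})]),
    ]),
])
html = render(page, {})
print(html)  # <html><body><h1>About us</h1></body></html>
```

The recorded subtree leaves no trace where it was declared and reappears wherever a <play> with the same identifier occurs.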

Time for the block diagram

We proceed from the description of a depth-first tree search routine. This routine in fact works on a stack, where each stack frame consists of a portion of a tree and a tag consumer. The tag consumer may add to this stack, as happens with <include> and <play>. The result of stacking trees is itself a tree, so we know our resulting document will be well-formed.
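
In place of the block diagram, a sketch of the stack-based routine in Python. Everything named here is an assumption made for illustration: the Frame and Node classes, the consume function, the id and src attributes, and the closing field, which this sketch adds to a frame so that end tags are written when a portion of the tree is exhausted. Included files are assumed to be parsed already.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node or str items


@dataclass
class Frame:
    nodes: Iterator      # the portion of a tree still to be visited
    consumer: Callable   # decides what to do with each tag in that portion
    closing: str = ""    # text emitted once the portion is exhausted


def consume(node, stack, out, state):
    """Hypothetical tag consumer for <record>, <play>, <include> and plain tags."""
    if node.tag == "record":
        state["recorded"][node.attrs["id"]] = node.children  # skip, but remember
    elif node.tag == "play":
        stored = state["recorded"].get(node.attrs["id"], [])
        stack.append(Frame(iter(stored), consume))           # stack the stored subtree
    elif node.tag == "include":
        tree = state["files"][node.attrs["src"]]             # an already parsed file
        stack.append(Frame(iter([tree]), consume))           # stack the included tree
    else:
        out.append(f"<{node.tag}>")
        stack.append(Frame(iter(node.children), consume, f"</{node.tag}>"))


def render(root, files):
    out, state = [], {"recorded": {}, "files": files}
    stack = [Frame(iter([root]), consume)]
    while stack:
        frame = stack[-1]
        node = next(frame.nodes, None)
        if node is None:             # portion exhausted: close it and pop
            out.append(frame.closing)
            stack.pop()
        elif isinstance(node, str):  # text node
            out.append(node)
        else:
            frame.consumer(node, stack, out, state)
    return "".join(out)


layout = Node("body", children=[
    Node("h1", children=[Node("play", {"id": "title"})]),
    Node("play", {"id": "content"}),
])
page = Node("html", children=[
    Node("record", {"id": "title"}, ["Home"]),
    Node("record", {"id": "content"}, [Node("p", children=["Welcome"])]),
    Node("include", {"src": "layout.html"}),
])
result = render(page, {"layout.html": layout})
print(result)  # <html><body><h1>Home</h1><p>Welcome</p></body></html>
```

Because a frame only ever holds whole subtrees, and an end tag is written exactly when its frame is popped, whatever the consumer stacks comes out properly nested.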

The project

The principles are compatible with any scripting language. The PHP implementation is available on GitHub and contains a few sample modules.

The demo website will show you how Egg-SGML can simplify static websites.

MMXXI II XIV