Lesson 2: Supplement

A supplement for non-computer scientists reading Website Architecture Lezione 2: XML + XHTML.

This gives some focal points for the main PDF; that is to say, this is a supplement, not a replacement.

DTD (shorter) summary

Every markup document (in the languages we've seen: HTML, XML, XHTML) should have a declaration at the top that says which language it is written in and which grammar applies: the tags to expect, the valid ways to nest them, and so on. You can almost always copy and paste this declaration from a known good source (see the zip file of examples for this lesson). The exception is when you design your own XML format and write a document describing its custom grammar, so that new XML documents can be validated against it. That is rare, especially for non-computer scientists.

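There are three variants of this declaration for HTML 4.01 (Strict, Transitional, and Frameset) and the same three for XHTML 1.0. You should generally use the "Strict" variant of each, unless your web page uses frames, in which case you need the Frameset variant. When you are copying and pasting, skim the cryptic-looking text and check that it says "Strict" (in the HTML 4.01 declaration the word appears only in the file name, strict.dtd).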
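As a rough illustration of what "skim the text and check that it's Strict" amounts to, here is a small Python sketch. The two declarations are the standard Strict ones published by the W3C; the helper name doctype_variant is made up for this example, and it only searches the text of the declaration. It does not validate the page.

```python
import re

# The two "Strict" declarations, as you would copy and paste them.
STRICT_DOCTYPES = {
    "HTML 4.01": '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                 '"http://www.w3.org/TR/html4/strict.dtd">',
    "XHTML 1.0": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
                 '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
}


def doctype_variant(document: str) -> str:
    """Hypothetical helper: report which variant a page declares.

    Returns "strict", "transitional", "frameset", "unknown",
    or "missing" when there is no declaration at all.
    """
    match = re.search(r"<!DOCTYPE[^>]*>", document, re.IGNORECASE)
    if match is None:
        return "missing"
    declaration = match.group(0).lower()
    # Case-insensitive search, because HTML 4.01 Strict only says
    # "strict" in the lowercase file name strict.dtd.
    for variant in ("frameset", "transitional", "strict"):
        if variant in declaration:
            return variant
    return "unknown"


page = STRICT_DOCTYPES["XHTML 1.0"] + "\n<html>...</html>"
print(doctype_variant(page))  # strict
```

The same check on a page that starts with the HTML 4.01 Transitional declaration (the one ending in loose.dtd) would report "transitional", which is the cue to swap in the Strict declaration instead.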

HTML grammar vs. XHTML grammar

Extensibility benefit of XHTML

XML documents can include other XML documents. Sometimes it is useful to take a seemingly unrelated XML format, like MathML, and embed it in a web page. This is fairly rare in websites but certainly not unheard of. This kind of design goal/constraint usually comes with others that are more complex, so it seems unlikely that you would be responsible for this type of job any time soon.
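To make the embedding idea concrete, here is a minimal sketch in Python. The page content and the function name count_elements_by_namespace are invented for illustration; the two namespace addresses are the standard ones for XHTML and MathML. The point is that one ordinary XML parser reads the whole page, and the namespace on each tag tells it which language that tag belongs to.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A tiny XHTML page with a MathML formula (A = pi r^2) embedded in it.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Embedded MathML</title></head>
  <body>
    <p>The area of a circle:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>A</mi><mo>=</mo><mi>&#960;</mi><msup><mi>r</mi><mn>2</mn></msup>
    </math>
  </body>
</html>"""


def count_elements_by_namespace(markup: str) -> Counter:
    """Count how many elements belong to each XML namespace."""
    counts = Counter()
    for element in ET.fromstring(markup).iter():
        # ElementTree stores tags as "{namespace}name".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}", 1)[0]
        else:
            namespace = ""
        counts[namespace] += 1
    return counts


counts = count_elements_by_namespace(PAGE)
print(counts[XHTML_NS], counts[MATHML_NS])  # 5 7
```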

Economy benefit of XHTML

Since XHTML mandates a more regular, standardized grammar, it is at least slightly easier for computers to read than HTML. This is an advantage when "embedded computers" (smaller, often purpose-specific computers, such as those in mobile devices) need to read (X)HTML documents. HTML has become a pretty messy language over the years, largely because it stays backwards-compatible with old versions of HTML, so the programs that read HTML tend to be fairly complicated. XHTML is a fresh start, and programs which read XHTML can be written more easily and conservatively. XHTML is already preferred in some media, such as eBooks.
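A short Python sketch shows the difference in practice. It uses the strict XML parser from Python's standard library; the function name is_well_formed and the two fragments are invented for this example. The strict parser needs no special knowledge of XHTML to read the first fragment, but it gives up on the second, which is written in the looser HTML style and would need a more forgiving (and more complicated) HTML parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML style: every element is closed, and empty elements end in "/>".
xhtml_fragment = "<p>First line<br />second line</p>"

# HTML style: <br> has no end tag and the closing </p> is left out,
# both of which HTML 4.01 permits.
html_fragment = "<p>First line<br>second line"

print(is_well_formed(xhtml_fragment))  # True
print(is_well_formed(html_fragment))   # False
```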

Drawbacks of XHTML?

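Not really, son. Can't be a slob, though: XHTML expects every tag to be closed and properly nested, tag names in lowercase, and attribute values in quotes.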

Future directions & HTML5

HTML5 is on the horizon, and we will see it later in the course. HTML 4.01 and XHTML 1.0 are over ten years old, and HTML5 incorporates some features from both of them, such as the embedding of other XML. For your purposes in real life, making typical websites, you should probably be using HTML5, so the HTML vs. XHTML debate matters less now than it would have three to five years ago; the answer is now pretty much to use HTML5. HTML5 is largely intuitive if you know HTML 4.01, so there is no problem in continuing on our current course.