
A Blue Perspective: Mapping DTDs: making validation usable

Mapping DTDs: making validation usable
21 May 2004

» DTD Mapper «

One of the qualms people have with validating pages is that they never know what they're validating against. The W3C releases documents called Document Type Definitions (DTDs) that strictly specify the grammar and form that HTML/XHTML documents must take in order to validate as HTML 4.01, XHTML 1.0, or XHTML 1.1. In the not too distant future, web sites will be able to supply their own DTDs, letting them create valid XML documents with a custom syntax all their own. (IBM already uses one.)

However, while DTDs are perfect for letting a computer know whether a document is valid, just try reading one with your own human eyes. I'd give you a cheque for $50.00 if you could look at the XHTML 1.0 Strict DTD and tell me in five minutes what elements are allowed to be nested inside a cite tag. (Cheques will not be honoured.)

As Andrei pointed out, the W3C don't help matters with their deeply hidden documentation and under-described recommendations. So, I've written a program that takes the legwork out of it for you. Using the DTD Mapper, all you have to do is paste in the URL of the DTD that your pages are meant to be validating against (that URL in the weird DOCTYPE line at the top of your source code), and you'll get a nice, collapsible list of all valid elements, with their attributes, possible child elements and possible parent elements.
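To give a feel for the kind of mapping described above, here is a minimal sketch in JavaScript. It is not how the DTD Mapper itself works (that is built on the perlSGML modules); the function names `expandEntities` and `mapChildren` and the tiny DTD fragment are made up for illustration. It assumes simple XML-style declarations and ignores comments, attribute lists and SGML features such as tag-omission markers.

```javascript
// Sketch only: handles simple XML-style declarations, not full SGML DTDs.

// Substitute parameter entity references such as %inline; with their values.
function expandEntities(dtd) {
  const entities = Object.create(null);
  const declaration = /<!ENTITY\s+%\s+([\w.:-]+)\s+"([^"]*)"\s*>/g;
  let match;
  while ((match = declaration.exec(dtd)) !== null) {
    entities[match[1]] = match[2];
  }
  // Drop the entity declarations themselves, then expand the references.
  let expanded = dtd.replace(declaration, "");
  // Entities can refer to other entities, so repeat for a bounded number of passes.
  for (let pass = 0; pass < 10 && /%[\w.:-]+;/.test(expanded); pass++) {
    expanded = expanded.replace(/%([\w.:-]+);/g, (whole, name) =>
      name in entities ? entities[name] : whole
    );
  }
  return expanded;
}

// Build an object mapping each declared element to its allowed child elements.
function mapChildren(dtd) {
  const children = {};
  const element = /<!ELEMENT\s+([\w.:-]+)\s+([^>]+)>/g;
  const expanded = expandEntities(dtd);
  let match;
  while ((match = element.exec(expanded)) !== null) {
    const [, name, model] = match;
    // Every name in the content model is a child, except keywords like #PCDATA.
    const tokens = model.match(/#?[\w.:-]+/g) || [];
    const names = tokens.filter(
      (token) => !token.startsWith("#") && token !== "EMPTY" && token !== "ANY"
    );
    children[name] = [...new Set(names)].sort();
  }
  return children;
}

// Hypothetical fragment for illustration; NOT the real XHTML content models.
const sampleDtd = `
<!ENTITY % inline "em | strong | a">
<!ELEMENT cite (#PCDATA | %inline;)*>
<!ELEMENT ul (li)+>
<!ELEMENT li (#PCDATA | %inline; | ul)*>
<!ELEMENT br EMPTY>
`;

console.log(mapChildren(sampleDtd));
```

For this made-up fragment, `cite` maps to `a`, `em` and `strong`, and `br` maps to an empty list. A real DTD needs a proper parser, which is why the mapper leans on existing parsing modules.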

Because there's quite a lot of repetition in attributes, nesting, etc., the HTML files that the mapper produces are rather large (> 500 kB), so the read-out takes a while to arrive. The expanding/collapsing JavaScript won't work until the page has finished loading, so wait for the whole thing to download.

Of course, automated tools like this won't be able to tell you everything about the specifications for a document – there's a large gap that can only be filled by human description – but it should be handy for some of your queries. Any suggestions for improvements, just post them here!

Tested in IE6, Firefox 0.8 (Win). Thanks to Earl Hood for the perlSGML parsing modules.

Comments

1/21. 21 May 2004 @ 03:07, Simon Jessey wrote:

Wow, Cameron! That is a really nifty piece of work. You are getting a reputation for clever bits of programming.

One possible enhancement would be to allow local files to be read, much like the markup validator does here: http://validator.w3.org/

2/21. 21 May 2004 @ 05:39, Joe Clark wrote:

Try pasting http://www.literarymoose.info/=/dtd/xhtml11e.dtd in. I can't get it to work.

It seems the second field-- the pull-down menu with XHTML DTDs-- is not needed when trying to validate a custom DTD.

3/21. 21 May 2004 @ 10:42, ACJ wrote:

Dude, genius. This is so going into the resources chapter of my static bookmarks.

4/21. 21 May 2004 @ 11:28, The Man in Blue wrote:

Joe: I think the parser I'm using has trouble with the XHTML 1.1 method of importing different modules; that's why I used the XHTML 1.1 Flat file in the drop-down menu.

5/21. 21 May 2004 @ 14:53, Justin French wrote:

Mate, amazing idea, works well in Firefox, but doesn't work in Safari... I get the collapsed items, but can't expand any of the tree nodes... JS is definitely on.

6/21. 21 May 2004 @ 18:26, Matt Pennell wrote:

Incredible work - although I don't fancy accessing the page on my home dial-up!

7/21. 22 May 2004 @ 00:24, Giacomo wrote:

Definitely helpful. Given the nice tree structure your script builds, how hard would it be to put up a quick search for "which tags can contain this tag"? At that point, you could ask the W3C for some money for building one of the most useful tools ever ;)
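The "which tags can contain this tag" search amounts to inverting a map of allowed children. A rough sketch, assuming the child map is already available as a plain object; the names `childMap`, `invertChildren` and `containersOf`, and the sample data, are hypothetical and not taken from the mapper:

```javascript
// Hypothetical child map, in the shape a DTD parser might produce.
const childMap = {
  ul: ["li"],
  ol: ["li"],
  li: ["a", "em", "ul", "ol"],
  p: ["a", "em"],
};

// Invert "element -> allowed children" into "element -> allowed parents".
function invertChildren(children) {
  const parents = new Map();
  for (const [parent, kids] of Object.entries(children)) {
    for (const kid of kids) {
      if (!parents.has(kid)) parents.set(kid, []);
      parents.get(kid).push(parent);
    }
  }
  return parents;
}

// Answer "which tags can contain this tag?" as a sorted list of parent names.
function containersOf(tag, children) {
  const found = invertChildren(children).get(tag) || [];
  return [...found].sort();
}

console.log(containersOf("li", childMap));
```

With the sample data, `li` can sit inside `ol` and `ul`, and a tag that never appears as a child comes back as an empty list.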

8/21. 22 May 2004 @ 01:29, Marco "Bazzmann" Trevisan wrote:

That's the killer application!

Good work. Given the complexity of the data that the mapper has to handle, and that humans have to use, I think a future improvement to the mapper's GUI could make it more usable than it is now. This is just my opinion and my two cents. :)

If you wish, I could try to think of something.

9/21. 23 May 2004 @ 01:45, Henrik Lied wrote:

Pretty nice, dude. I keep getting more and more amazed.

10/21. 26 May 2004 @ 00:06, Unearthed Ruminator wrote:

I agree with Giacomo: if you could search and/or filter this list, it would be even more amazing. Great work.

11/21. 27 May 2004 @ 21:37, Chris wrote:

Regarding the search, isn't that what the Parents tab does?

This is a very handy utility indeed :)

12/21. 27 May 2004 @ 22:13, Chris (again) wrote:

Something strange I just noticed though - the ul element is showing as having no children on XHTML Transitional 1.0.

However if you view the li it shows a parent of ul :)

13/21. 29 May 2004 @ 22:19, Hallvord R. M. Steen wrote:

Looks very nice but the script runs into problems with nextSibling being a whitespace node in Opera. This may be the problem someone mentioned with Safari as well.

14/21. 30 May 2004 @ 01:51, Anonymous wrote:

Hmmmm ... what's a whitespace node?

15/21. 30 May 2004 @ 02:25, Hallvord R. M. Steen wrote:

Spaces, linebreaks and TABs between tags may be parsed as individual nodes in the DOM (I'm not sure what determines such parsing).

Cameron's JavaScript seems to fail in some browsers because the script changes the className of the node you click on and the next node, expecting both to be elements. In some browsers the next node is in fact just linebreaks and tabs, and changing its className does exactly nothing. It is a small thing to fix: just check whether the nodeType of the node is 1 (an element) and keep looking if it is 3 (a text node).
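The fix described here could look roughly like the sketch below. The function names `nextElement` and `toggle` and the class names "expanded" and "collapsed" are assumptions for illustration, not the mapper's actual script. It skips any sibling that is not an element, which covers whitespace text nodes and also comments.

```javascript
const ELEMENT_NODE = 1; // nodeType of an element; text nodes are 3

// Walk forward through the siblings until an element node turns up.
function nextElement(node) {
  let sibling = node.nextSibling;
  while (sibling && sibling.nodeType !== ELEMENT_NODE) {
    sibling = sibling.nextSibling;
  }
  return sibling || null;
}

// Hypothetical click handler body: flip the clicked node and the element after it.
function toggle(heading) {
  const body = nextElement(heading);
  if (!body) return; // nothing to expand or collapse
  const nextState = body.className === "collapsed" ? "expanded" : "collapsed";
  body.className = nextState;
  heading.className = nextState;
}
```

Current browsers also expose `nextElementSibling` on elements, which does the same skipping without a loop.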

16/21. 29 July 2004 @ 08:04, Kyrre wrote:

I can't seem to get it to work... With every DTD I try, I get an error message. Here's the one for XHTML 1.0 Transitional from the drop-down list:

ERROR: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd could not be parsed as a DTD.

17/21. 29 July 2004 @ 11:05, The Man in Blue wrote:

Yeah, I think one of the Perl packages went missing when I moved servers. I'll have to take a look.

18/21. 10 August 2004 @ 02:22, The Man in Blue wrote:

OK, it's back up now!

19/21. 10 August 2004 @ 02:42, The Man in Blue wrote:

And as an added extra bonus, the collapse/expand JavaScript now works in Opera :o]

20/21. 13 December 2004 @ 06:10, Tom Levine wrote:

What about the Pull Model, which provides non-cached, read-only access to the XML file, and the XmlReader class? Using XmlValidatingReader, a class derived from XmlReader, you can parse with a DTD.

21/21. 31 January 2005 @ 03:57, Michael wrote:

Great stuff!

The attribute listing is great, but I'm not getting parents or children for any element in any DTD.

Just me?
