In October of 2006, I published a brief article about Marcel Salathé’s interesting Java Applet to generate node graphs of web page structure. In that article, I stated:
I’d love to be able to produce graphs where I chose the color coding pattern for particular tags. I could set all non-semantic tags to be bright red, to easily spot the condition of a site in that respect. I could focus my attentions on inline versus block elements, or I could differentiate between different levels of headings.
More recently, I received comments on that post from a visitor who thought my idea to change this was a good one, so, at long last, I’ve gotten around to doing it.
Semantic HTML Graphs
And here’s an example of output:

The graph pictured here is for Metrolinx, the Greater Toronto Transit Authority, which is also Joe Clark’s failed redesign of the year. It makes for a pretty interesting case study. I know the output is small, but bear with me.
In this graph, you can clearly see long strings of orange nodes, which indicate nested table elements. You can also see significant clusters of bright red nodes, indicating deprecated tags. Altogether, the site is a maze of primarily long wavelength colors. In general, in the color scheme I’ve set up, greater densities of long wavelength colors (red, orange, pink) show a dependence on tables for layout and on presentational elements. Short wavelength colors (blue, green) indicate more semantically meaningful structures.
I made a number of small changes to the script which I think add value. First, I added the ability to change the root node you’re mapping. I don’t know that this is incredibly valuable, but it does provide an interesting alternate piece of information. The node switching is limited: it will only use the first node it finds of the element type you specify.
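To give a rough idea of what "first node of that type" means, here is a minimal sketch. It is not the applet's code: the class and method names are hypothetical, and it assumes the page is well-formed XHTML so that the XML parser bundled with the JDK can read it (real-world HTML usually needs a more forgiving parser).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not taken from the applet's source.
final class RootPicker {
    // Returns the first element with the given tag name in document order,
    // or null when the page contains no such element.
    static Element firstElement(String xhtml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            // getElementsByTagName lists matches in document order,
            // so item(0) is the first occurrence on the page.
            NodeList matches = doc.getElementsByTagName(tagName);
            return matches.getLength() > 0 ? (Element) matches.item(0) : null;
        } catch (Exception e) {
            // Assumption: anything that fails to parse is treated as bad input.
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }
}
```

Calling `RootPicker.firstElement(page, "div")` would hand back only the first div on the page; any later div is ignored, which is the limitation described above. Tag names are matched case-sensitively in this sketch.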
The second change is to provide a variety of color schemes. The default is pretty complicated, although I drew the line well before attempting to provide a subtly different shade for every single element. I hope that the colors provided at least give you an idea of what you’re looking at, however. The alternate color schemes (two, at the moment) are much simpler: one which simply differentiates between allowed and deprecated elements and another which highlights all inline elements (a, dfn, samp, etc.).
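The two simpler schemes boil down to a lookup from tag name to color. The sketch below shows one way that could be written; the class name, method names, and hex colors are my own illustrative assumptions rather than what the applet uses, and the inline list is deliberately partial.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only: names and colors are assumptions, not the applet's.
final class TagColors {
    // Elements marked as deprecated in HTML 4.01.
    static final Set<String> DEPRECATED = new HashSet<>(Arrays.asList(
        "applet", "basefont", "center", "dir", "font",
        "isindex", "menu", "s", "strike", "u"));

    // A partial list of inline elements, enough for the sketch.
    static final Set<String> INLINE = new HashSet<>(Arrays.asList(
        "a", "abbr", "cite", "code", "dfn", "em", "kbd",
        "q", "samp", "span", "strong", "var"));

    // Scheme 1: deprecated elements in red, everything else in blue.
    static String deprecationColor(String tag) {
        return DEPRECATED.contains(normalize(tag)) ? "#ff0000" : "#3366cc";
    }

    // Scheme 2: inline elements in green, everything else in grey.
    static String inlineColor(String tag) {
        return INLINE.contains(normalize(tag)) ? "#33aa33" : "#999999";
    }

    // Tag names are compared in lower case so "FONT" and "font" match.
    private static String normalize(String tag) {
        return tag.toLowerCase(Locale.ROOT);
    }
}
```

Keeping each scheme as its own small function means adding a third scheme is a matter of adding another set and another method, without touching the graph-drawing code.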
Now, I’ve never programmed in Java before, and although the changes I made to Sala’s source code are relatively slight, it’s highly probable that there are bugs, and I’ve certainly not managed to remove any bugs from the original code.
The last thing I need to mention concerns the accessibility of this applet. It’s just not accessible. In fact, I know little about how to make Java accessible in the first place; but even so, the entire concept of this applet is highly dependent on color. There can be no question that if you are color-blind or otherwise sight impaired this will be a problem. Additionally, there is absolutely no means for a screen reader to interpret the graph. I do hope to change this at a later date, and author a text-based output which will provide a separate, accessible interface to the information, but that just hasn’t happened yet.
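For what it's worth, one possible shape for that text-based output is an indented outline of the element tree. This is only a sketch of the idea, not something the applet does: the class and method names are hypothetical, and it again assumes well-formed XHTML so the JDK's XML parser can be used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;

// Hypothetical sketch of a text alternative to the color graph.
final class TextOutline {
    // Parses well-formed XHTML and returns one indented line per element.
    static String outlineOf(String xhtml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            StringBuilder out = new StringBuilder();
            append(root, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    // Writes the element name, indented two spaces per nesting level,
    // then recurses into its children.
    private static void append(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text nodes, comments, and so on
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            append(child, depth + 1, out);
        }
    }
}
```

An outline like this carries the same nesting information as the graph without relying on color, and plain text is something a screen reader can read line by line.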
Also worth looking at:
- Validation Graphs - a stand-alone Java application also based on the HTML graph script which spiders pages and checks them for validity.
- Web2DNA - same basic idea, different implementation.
The justification that a web site is accessible because it “follows standards” contains a serious fallacy: the assumption that standards support accessibility.
One root of current standard accessibility practice is conformance to the HTML or XHTML standards set by the World Wide Web Consortium (W3C). This is a fine practice, and certainly should be maintained. Using correct syntax and following a standardized method of communicating information is always a solid best practice. However, this should absolutely not be taken to mean that following these standards is the same as applying the principles of web accessibility.
Web standards only provide accessibility to the degree that they have been designed to do so, and the guiding principle behind standards development (excluding accessibility-specific standards, of course) has not generally been to support accessibility. Web standards have been designed purely to establish a set, correct method of using the underlying code, whether presentational (CSS), structural (XHTML) or behavioral (ECMAScript).
In many (most) cases, web standards do not in any way require best practices; they merely require conformance. Take HTML, for example. Web standards would permit the usage of table elements for layout, because they do not define semantic usage for the table element. Web standards also permit a variety of presentational elements, such as font, strike, or u. It all depends on what standard you have chosen to follow.
HTML5, most recently, is considering such contrarian steps as removing the requirement that images carry an alt attribute. This would make it possible for a valid HTML5 web site to radically fail basic accessibility guidelines. On the other hand, it may reduce the likelihood that some so-called “accessible” web sites will be littered with alt="this is a spacer graphic".
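The distinction is easy to show in code: a check for missing alt attributes is a separate test from validation, and a page could pass one while failing the other. The sketch below is only an illustration, with hypothetical class and method names, and it assumes well-formed XHTML input so the JDK's XML parser can read it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical accessibility check, independent of markup validity.
final class AltAudit {
    // Counts img elements that have no alt attribute at all.
    // An empty alt="" is counted as present in this sketch.
    static int imagesMissingAlt(String xhtml) {
        try {
            NodeList images = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("img");
            int missing = 0;
            for (int i = 0; i < images.getLength(); i++) {
                Element img = (Element) images.item(i);
                if (!img.hasAttribute("alt")) {
                    missing++;
                }
            }
            return missing;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }
}
```

Note that a check like this says nothing about whether the alt text that is present is any good; alt="this is a spacer graphic" would pass it without complaint.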
Does this necessarily mean that the standard is wrong or right? No, not as such. Different standards support different needs, so it is important to keep the purpose of each standard distinct. Conforming HTML is just that: conforming HTML. It means nothing more.
Nonetheless, as an accessibility advocate, I feel that it’s important to support accessibility issues within the development of new standards. Taking the alt attribute issue in HTML5, for example, the lack of any perceived benefit to not requiring the attribute suggests to me that the better path would be to continue to require it. There are numerous examples of important accessibility aspects in HTML5 which are not yet included.
There seems to be a strong element of specious judgement: elements which are not supported by current user-agents are considered not to be needed. This seems a ridiculous expectation: after all, if unsupported elements aren’t needed, then why develop a new specification at all? What we’ve got must work just fine!
Practically speaking, user-agent support and developer use should both be only marginal issues when trying to decide what elements are most needed in a specification. The fact that an element is unused on either end is not a judgement on the value of that element; it is merely a judgement on the awareness of the element, on the clarity of the existing specification, or on the complexity of the implementation.
Nobody (or almost nobody) uses the q inline element. Does this mean that the element isn’t valuable, and should be discarded? No. It means that Internet Explorer should add appropriate support for it. The same is true for accessibility issues. The standards should support them to their best abilities: if an element or attribute could hypothetically add to the accessibility of a site, then the fact that it is little used or poorly supported should be entirely irrelevant. Support should follow the standards; not the other way around.
At the root of things, my stance is that I am unwilling to support a standard which specifically excludes features which are needed in order to appropriately provide best-practice accessibility. HTML5 is still a long way from being done, and even further from being implemented (if it ever is), but the removal of such attributes as headers from table markup, the inclusion of defined non-semantic elements such as b, and the “WYSIWYG exemption” on the font element strike me as decisions badly in need of reconsideration.
Filed under Semantics, Web standards by Joe Dolson