Thanks to the power of internet criticism, the code discussed in this blog post has since been fixed! Sometimes just making a complaint is all it takes to get something fixed. I was highly critical of the code authors for this low-quality code; but they truly did care, and made changes. Thank you.
Authoring forms is an important part of keeping the web fully accessible: not just providing access to information, but allowing users to fully interact with the web in all its glory. Interactivity is what makes the web powerful and persuasive.
As such, I can’t help but be frustrated when I run across basic form construction which is simply well below the standards I’ve come to expect. A form like this one, for example, is incredibly irritating to my sense of what the web should be:
<form action="/store/add_event_to_cart/53" autocomplete="off" method="post"> <table>
<tr>
<td nowrap><span class="required">First Name:</span></td>
<td><input id="attendee_first_name" name="attendee[first_name]" size="40" type="text" value="" /></td>
</tr>
<tr>
<td nowrap><span class="required">Last Name:</span></td>
<td><input id="attendee_last_name" name="attendee[last_name]" size="40" type="text" value="" /></td>
</tr>
...[numerous similar fields deleted to avoid boring the hell out of you]...
<tr>
<td colspan="2"> </td>
</tr>
<tr>
<td colspan="2">
<input name="attendee[sponsor_email]" type="hidden" value="0" /><input checked="checked" id="attendee_sponsor_email" name="attendee[sponsor_email]" style="vertical-align: top;" type="checkbox" value="1" />
<p style="display: inline-block; width: 360px;">Please sign me up to get occasional information from select sponsors, partners, and other fun people.</p>
</td>
</tr>
<tr>
<td colspan="2">Discount code (if applicable): <input autocomplete="off" id="attendee_discount_code" name="attendee[discount_code]" size="10" type="text" /></td>
</tr>
</table>
<input name="commit" type="submit" value="Add Attendee" />
</form>
But in this case, it’s not just the nature of the form itself. There’s a lot wrong here. The use of a table for layout is a big problem, but even if you accept the table as logical (and there is a particular logic which would accept tables for forms), the lack of a summary or headings in that table and the use of empty table cells to provide spacing are big problems. Then we look at the form itself: not a label element in sight; instead we have plain text using a span and class to indicate if a field is required. There’s no coded indication that a field is required; it’s a purely visual indicator.
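As a rough sketch of what the markup could look like instead, here is one way to rebuild the first field and the checkbox with real label elements and no layout table. The field names and ids are copied from the form above. The required and aria-required attributes (from HTML 5 and WAI-ARIA respectively) and the "(required)" text are my own assumptions about how a coded indication might be provided; this is not the fix the site actually shipped.

```html
<form action="/store/add_event_to_cart/53" method="post">
  <!-- Each label is tied to its control through for/id,
       so assistive technology can announce the field name. -->
  <p>
    <label for="attendee_first_name">First Name (required)</label>
    <input id="attendee_first_name" name="attendee[first_name]" size="40"
           type="text" value="" required="required" aria-required="true" />
  </p>

  <!-- The checkbox text becomes a label, which also makes it clickable. -->
  <p>
    <input name="attendee[sponsor_email]" type="hidden" value="0" />
    <input checked="checked" id="attendee_sponsor_email"
           name="attendee[sponsor_email]" type="checkbox" value="1" />
    <label for="attendee_sponsor_email">Please sign me up to get occasional
      information from select sponsors, partners, and other fun people.</label>
  </p>

  <p><input name="commit" type="submit" value="Add Attendee" /></p>
</form>
```

Spacing and alignment would then be handled in the stylesheet rather than with empty table cells.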
My sense of accessibility hurts.
And do you want to know where this code came from? Here it is.
Here are a few good articles on high quality form construction – but don’t bother reading them. After all, they didn’t.
This is something that pisses me off; but you can find it everywhere. Large organizations responsible for web publishing don’t always maintain the standards they talk about. Is it just talk, then? Does the fact that An Event Apart does what A List Apart condemns mean following standards and implementing accessibility doesn’t mean anything?
Thankfully, no. It does mean that web sites aren’t perfect; and the people doing the labor are frequently not the people who know best how it should be done. But it is a problem — as much as we can evangelize best practices, it doesn’t mean that they’ll be used.
There’s a lot of pressure in the web industry to produce fast results. Sometimes this means people take shortcuts; sometimes it means hiring people who may not be as fully trained or qualified as you really wish you had; and sometimes it means things just go wrong.
But I’m left with a definite feeling of frustration to find that a leading web standards event like An Event Apart should exhibit this kind of HTML on their web site.
How can this be avoided?
Ooh, that’s a tough one. Work processes, new employees, insufficient testing — all of these can allow inferior code onto a site. As a freelance designer, it’s positively rare that I have sole control over new content or template changes after the initial launch. As a member of a team, I can only imagine that it’s even more difficult — anybody with sufficient permissions to commit a change can change the overall level of competency exhibited on the site.
Application of a tool like Marco Battilani’s Big Red Angry Text technique can help, but it’s a little scary to put into a published site if you know that the editing won’t always be done by knowledgeable people. It may demonstrate mistakes, but can sometimes serve to do nothing more than piss off or frustrate your client or staff. It depends on the control and education you’ve been able to impose.
- Educate. Teach the people who will be doing work on the site as much as you can – the what and the why.
- Review the site. Review the work that’s been done; a 30 second glance at the code is likely to result in fixing at least some errors, and will hopefully prevent future errors of the same type.
- Provide tools for self-checking. Not a first choice, since all automated tools are flawed by their very nature, but they can still be of use.
It’s not always practical; but if following these steps is at all an option, it’s really worthwhile.
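To make the self-checking idea concrete, here is a minimal sketch of a diagnostic stylesheet in the spirit of the Big Red Angry Text technique mentioned above. The selectors are my own illustrative choices, not Battilani’s published rules, and the attribute and :not() selectors assume a browser with reasonable CSS support.

```html
<style type="text/css">
  /* Diagnostic styles only: loudly flag markup that should not be there. */

  /* Deprecated presentational elements */
  font, center, u, strike {
    color: #f00 !important;
    outline: 3px solid #f00;
  }

  /* Images with no alt attribute at all */
  img:not([alt]) {
    outline: 3px solid #f00;
  }

  /* Tables with presentational attributes, a common sign of layout tables */
  table[border], table[cellpadding], table[cellspacing] {
    outline: 3px dashed #f00;
  }
</style>
```

Loading a block like this only in a staging or preview environment avoids the risk of angry red boxes reaching visitors.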
- Part 1 (Contracts, Site Requirements, Information Architecture)
- Part 2 (Hosting and Security)
- Part 3 (Navigation, Scent)
- Part 4 (Semantics, Structure vs. Design, Universal design)
- Part 5 (Interaction, Errors, and Administration)
So, we’re finally getting to the meat of best practice web development. This is what people are usually thinking of when they ask about best practices in web design or web programming: actually building the web site itself.
Web design best practices encompass a wide range of needs — everything from the visual look of the design and use of well-chosen markup to the implementation of alternate styles for mobile devices or print shows up in this area. Covering it in one article is, perhaps, ambitious. Fortunately, I’ve written on parts of this subject frequently in the past, so I’ll be providing a lot of links.
It’s important for best practices to clearly separate the structure of your web design (the internal labeling and definition of page elements) from the design elements (the appearance of these elements.) In the last article in this series, I discussed a few key elements of design: not in terms of color, layout, or typography, but in terms of communicating information.
Best practices ultimately leads to creating a universal or accessible design, and this practice hinges on two key concepts: web semantics and the separation of your structure from your design.
The Semantics of HTML
You can argue for days (or years, if you take a look at the search results for “HTML semantics” or “web semantics”) on the detailed semantics of how HTML tags should be used. I’ve written on this several times myself, including articles discussing the value of empty elements and the age-old debate between table-based and CSS layout, among many others.
Semantics are very important. However, when you really look closely at HTML you’ll inevitably notice that it’s not a strongly semantic language — the mark-up language doesn’t even come close to describing all possible uses of the tags. Many tags end up inevitably serving multiple functions.
So what web semantics really require is interpretation. The HTML specification provides one version of this interpretation, with suggested uses and meanings for elements. I’ve provided my own interpretation, as well. There are without question differences of opinion between those documents.
Obviously, you can argue very convincingly that any interpretation which disagrees explicitly with the HTML 4 specification is wrong. Feel free. The core of best practices in web semantics is to use them and make decisions: it’s about thinking, not specific rigor.
We need to differentiate, however, between the semantics of HTML and web semantics. The semantics of HTML are specific and defined: meaning as applied to the elements of HTML. This is a finite list of items, although the complete definition of meaning is less so. Web semantics, on the other hand, describe the application of meaning on the web. This is a more global concept, and applies to all aspects of your web development process.
Web semantics includes everything used to add meaning to your site, providing better comprehension of code and content. Using descriptive class and ID naming conventions, choosing descriptive function names in server or client-side scripting, or providing helpful comments within your code can all be considered points of web semantics. Best practices means providing a site which is meaningful in both the front and back end.
For specific suggestions about element use, refer to my guide to semantic HTML.
Separation of Structure from Design
This is such an old question to harp on, but the importance of separating the organization of your page from the way it looks has never really flagged.
At a superficial level, it may appear that any markup you use has an effect on the appearance of your site. After all, there’s a clear visual difference between unstyled text marked as a heading and unstyled text marked as a blockquote! However, this visual difference only truly exists because the description “unstyled” is truly a misnomer.
If you disable stylesheets on a web site, you’ll see an extremely plain view of the site. It is not precisely “unstyled,” however — the design has simply been reduced to the default styles applied by the browser. In general, every browser has very similar defaults — but they’re not exactly the same. This is one of the reasons that it’s common to begin a stylesheet with a set of reset styles.
If you conscientiously remove the browser default styling, it can make your own development easier: the slight differences between browsers can then be ignored.
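A reset can be very small. The following is a minimal sketch of the idea, not any particular published reset stylesheet; the choice of elements and properties is my own assumption about which defaults are worth flattening.

```html
<style type="text/css">
  /* Remove the margins and padding that browsers apply by default. */
  html, body, h1, h2, h3, h4, h5, h6, p, blockquote, ul, ol, dl, dd {
    margin: 0;
    padding: 0;
  }

  /* Headings start from the same size and weight as body text. */
  h1, h2, h3, h4, h5, h6 {
    font-size: 100%;
    font-weight: normal;
  }

  /* List markers are added back deliberately where they are wanted. */
  ul, ol {
    list-style: none;
  }
</style>
```

Every rule after the reset is then a deliberate design decision rather than an inherited default.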
The point is that you should never place anything in your markup which exists purely to create a different appearance. Attributes or tags which define font faces, colors, or styles are obvious problems — but the use of small or strong can also be problems. It’s not that you should never use small: but your use of the element shouldn’t depend on the text being rendered smaller than the surrounding text.
It might not happen, after all.
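A short contrast may help here. In this sketch the class name deadline and the sample sentence are hypothetical; the point is only that the first version encodes appearance in the markup, while the second describes what the text is and leaves appearance to the stylesheet.

```html
<!-- Presentation baked into the markup: avoid -->
<p><font color="red" size="2">Registration closes Friday.</font></p>

<!-- Structure in the markup, appearance in the stylesheet -->
<style type="text/css">
  .deadline {
    color: #c00;
    font-size: 85%;
  }
</style>
<p class="deadline">Registration closes Friday.</p>
```

If the stylesheet never loads, the second paragraph is still a perfectly sensible paragraph.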
This is one of the key complaints about using tables for design layout. A table is designed to organize information by providing easy access to it in a matrix. The visual appearance of columns and rows is a formality, used because it is the expected way to view this type of data organization.
When you take a table and use it to lay out your design, you are violating the separation of structure from appearance: your design is now dependent on the default organization of tables. Should somebody attempt to re-organize your table (for example, to linearize the information), they may encounter a radically illogical data structure.
Fundamentals of Universal Design for the Web
The goal of universal design is very simple: make the information in your website available to every person or device which attempts to access it. This includes mobile devices, search engines, assistive technology, disabled users, and standard desktop browsers.
Universal design is where we bring everything above together. Attention to web semantics and a strong separation between structure and design give your web site at least a fighting chance of being universally usable. Obviously, you can still screw things up!
In the same way that following web standards doesn’t mean that you’ve made a web site accessible, following best practices for general web development doesn’t mean that you’ve made a site which will be great on a hand-held device or with a screen reader.
Different devices (like people) have different special needs.
Creating a web site which is truly universal requires you to be aware of the special needs of every device you’re working for — but a few basic principles will get you 95% of the way there.
The Principles of Universal Design provided by The Center for Universal Design at North Carolina State University are a good guideline for thinking about universal design. Although these principles are truly designed to be universal, in that they are intended to be applied well outside the realm of web development, the basic principles are sound in any context.
If you break the concept of universal design down to a single core issue, it could be that dependencies break access. Whenever you set up a situation in which a specific technical or design element must be present (for example, a dependency on Javascript, a requirement that a control matches the description provided, or a requirement that a user must see a given image), you are creating the potential for design failure. Avoid creating anything which depends on anything outside your control.
Knowing what is and isn’t in your control (and, more importantly, what seems like it’s in your control, but really isn’t) is critical to best practices in web development. Acknowledging that although you can set the color of the text, you can neither guarantee that a visitor will be capable of seeing that color nor that the text will in fact be that color at the point that a visitor sees it is a critical step in understanding universal design.
Best Practices in Web Development: Part 5 (published on Friday, September 5th) covers interaction design, error management, and long-term site administration.
This guide only deals with elements which have a specific, human-readable meaning. The semantics of elements such as link, which are not seen in normal browsing, have been left out, as have replacement elements like img or object. In some cases, I’ve also addressed specific attributes which are critical to providing semantic value to an element.
This is not a guide which demonstrates the opinion of the W3C as represented in the HTML, XHTML, or HTML 5 specifications. This is a practical-use guide which indicates my reasoned opinion concerning the best use of each element.
Core Block Elements
div
- The div element represents a discrete section of a page which can be meaningfully divided from the content around it. Commonly used to indicate a header region, footer, sidebar, or navigation region; its use can extend equally to indicate columns on a page or sections of an article. The element is also commonly used in multiple layers to group lower-level sections together, such as a “content” section which groups a main article, comments on that article, and meta data about the article or author.
h1-h6
- The six levels of headings are all used to introduce sections of content (containing p (paragraphs), div (page divisions) or other content) which they describe. They’re perhaps most accurately compared to the structure of an outline: h1 is the top level heading element. The only heading element which can follow an h1 is h2. h2, on the other hand, can be followed by an additional h2, if the sections are equivalent and both fall under the preceding h1 topic; an h3, if the following section is logically a child of the h2; or another h1, if the following section is a new topic of the same level of specificity as the first heading. A common preference (although certainly not mandatory) is to use only a single first-level heading on any page and to require all subsequent headings to descend from it.
p
- The paragraph element is the fundamental building block of prose text. It is also the most appropriate element for marking up a stanza of poetry or other similar discrete block of text. It differs from a div principally in that it is specifically intended to indicate text regions, whereas the div element is more broadly specified.
blockquote
- This is a very specific use element which should be used to indicate a significant block of text which is being quoted from outside the current source. It should always be paired with a cite element to indicate the quoted source. It may also, optionally, use the cite attribute to contain a URI for the quoted text.
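The block elements above can be sketched together in one fragment. The heading text, the example.com URI, and the quoted material are placeholders of my own; the structure is what matters: headings that descend like an outline, paragraphs for prose, and a blockquote paired with a cite element and a cite attribute.

```html
<div id="content">
  <h1>Article title</h1>
  <p>Introductory paragraph for the whole article.</p>

  <h2>First major section</h2>
  <p>Prose belonging to the first section.</p>

  <h3>A subsection of the first section</h3>
  <p>Prose belonging to the subsection.</p>

  <h2>Second major section</h2>
  <!-- The cite attribute holds a URI; the cite element names the source. -->
  <blockquote cite="http://example.com/quoted-article">
    <p>Text quoted from outside the current source.</p>
  </blockquote>
  <p><cite>Title of the quoted article</cite></p>
</div>
```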
Supporting Inline Semantic Elements
a
- When accompanied by an href attribute, the anchor element indicates either an external resource (a resource other than the current document) accessible via hyperlink or an anchored location within the same document. Using scripting, it can be used to perform more complex functions within the current page, but should always maintain a fall-back functionality to retain its semantic value.
abbr
- The abbreviation element generically indicates a shortened form of a more extensive term or phrase. It is inclusive of an acronym, although the lack of support for abbr in Internet Explorer frequently forces developers to ignore that relationship.
acronym
- “Acronym” refers to a subset of abbreviations characterized by their formation from parts (letters or syllables) of the words they are used to abbreviate. The definition isn’t strictly agreed on, but it’s generally agreed that abbreviations formed by the removal of letters from a word are not acronyms.
em
- Indicates emphasis. “Emphasis” is a general indication that the emphasized text is in some way more significant than the text surrounding it. Whether a piece of text should be emphasized or not is usually dictated by authorial preference.
strong
- “Strong” is described officially as “Stronger Emphasis.” So, practically speaking, it’s an element you use in much the same scenario as you would use em: an authorially determined preference for emphasis.
address
- According to the W3C, address indicates contact information relevant to a specific document or part of a document. In practical usage, it’s more commonly used to indicate any block of contact information. As a block-level element, it’s generally reserved for significant blocks of information, rather than being used to mark up a single e-mail address or telephone number.
cite
- A citation is fairly broad, and does not necessarily have to be associated with specific quoted information (although the reverse is not equally true). cite is associated with bibliographical information, personal quotations, or references to an external resource used in the research towards preparing a document.
code
- Indicates a sample of programming code as a general rule. The W3C specifications are clear that this is intended to refer to computer code; and I haven’t yet come across a situation where I needed to post encryption information which was not computer code.
dfn
- This is one of the more difficult to define elements — which is ironic, given that it’s intended to represent the “defining instance” of a term. It is not intended to contain a definition, it is merely intended to enclose a term at the point in a document where it is used in a definitive state. Sounds very legalese, to me.
del
- Represents information which has been deleted from a document. This should generally be used with date and time information indicating when the change was made, which can be included in the datetime attribute in the following format: datetime="YYYY-MM-DDThh:mm:ssTZD", where TZD is a time zone designator (Z for UTC, or an offset such as -05:00). See also ins.
samp
- Sample output from programs or scripts. Differentiated from code in that the output of a program may not itself be code, but should still be indicated as an example of output.
span
- A generic inline-level HTML element. It should not be concluded that span does not contain any semantic value; rather, that it is available to be used when no other element provides suitable meaning. It is preferable to use a generic element and define a meaning for it rather than use an element which has a pre-defined and inappropriate semantic meaning.
ins
- The opposite of del, above. Represents inserted text following revisions.
q
- Indicates a shorter, inline quotation. Unfortunately, support for the q element is minimal, and it cannot be readily recommended for any use.
kbd
- Indicates text to be entered by the user. Rarely used, but useful in circumstances where you are demonstrating the use of a program, along with code and samp.
sub/sup
- Superscripting and subscripting of text can be used to indicate footnote references, valence numbers in chemical formulas (such as Fe+3), etc.
var
- Along with code, samp, and kbd, the “variable” element indicates a variable (or program argument). It should be reasonably obvious at this point that this language was designed by programmers and not by librarians.
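Several of the inline elements above are easier to tell apart when seen side by side. The sentences, dates, and the function name in this sketch are invented for illustration. The trailing Z on the datetime values is a UTC time zone designator.

```html
<!-- A revision: deleted and inserted text, each with the time of the change -->
<p>
  The workshop takes place on
  <del datetime="2008-09-03T10:30:00Z">Thursday</del>
  <ins datetime="2008-09-03T10:30:00Z">Friday</ins>.
</p>

<!-- Demonstrating a program: user input, program output, code, and a variable -->
<p>
  Type <kbd>ls</kbd> at the prompt; the program responds with a listing such as
  <samp>index.html style.css</samp>. In the call
  <code>resize(<var>width</var>)</code>, <var>width</var> is a number of pixels.
</p>

<!-- An abbreviation, a defining instance, and emphasis -->
<p>
  The <abbr title="World Wide Web Consortium">W3C</abbr> calls this a
  <dfn>user agent</dfn>, meaning any software that retrieves a page on a
  person’s behalf. It does <em>not</em> have to be a visual browser.
</p>
```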
List Elements
ul, ol, li
- This is pretty straightforward: lists are used to represent grouped information best represented as a list. ul is unordered, and is generally visually represented as a bulleted list. ol is ordered, and is generally visually represented as a numbered list. It’s common to attempt to apply lists at a significant macro level in organizing the elements in a form or, occasionally, within an entire page, but it’s my opinion that this kind of usage is taking the semantic construct a bit too far.
dl, dd, dt
- A definition list literally indicates a list of terms (dt) with their accompanying definitions (dd). Practically speaking, it’s reasonable to use the definition list format for any collection of data characterized by paired relationships between one signifying item and at least one descriptive item. It’s perfectly reasonable to provide multiple definitions to a single term. Frequently asked questions pages are commonly assembled this way.
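A frequently asked questions page built this way might look like the following sketch. The questions and answers are invented; note that the second term carries two dd elements, which is the "multiple definitions" case described above.

```html
<dl>
  <dt>Can I transfer my registration to a colleague?</dt>
  <dd>Yes, up to one week before the event.</dd>

  <dt>Is lunch provided?</dt>
  <dd>Lunch is included on both days.</dd>
  <dd>Vegetarian options are available on request.</dd>
</dl>
```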
Table Elements
table
- Oft abused, the table is the best way of organizing and displaying a data matrix. Any kind of two-dimensionally represented data should be organized within a table.
thead
- Defines a header region for a data table, which would normally contain the headers (th) for each column.
tfoot
- Defines a footer region for a data table, which should include information referential to the columns of data.
tbody
- The content-bearing region of a table; it also includes row headers.
caption
- Briefly describes the table. This is essentially a heading for the table.
th
- A heading for either a row or a column, to indicate the type of information within that row or column.
td
- A data cell, in which content is placed which corresponds to both the headers for the row and column.
Attribute: scope
- Applied to th, it indicates whether the heading information applies to a row or a column. It can also be applied to a row group, for tables which have been divided into multiple sections.
Attribute: headers
- A much, much more complicated way of indicating relationships between data cells and their respective headers. Necessary in complex tables where a given data cell may apply to multiple row or column headers. If possible, just avoid creating tables which are that complex…they’re a headache.
Attribute: summary
- Applied to the table element, the summary is a more extensive description of the table, intended to provide non-visual users with the equivalent of a “quick scan” of the table to best understand the purpose it serves.
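Putting the table elements and attributes above together gives something like this sketch. The events and prices are made up for illustration. The tfoot is placed before the tbody because that is the order HTML 4 requires.

```html
<table summary="Ticket prices for each event, comparing early and regular registration.">
  <caption>Ticket prices by registration period</caption>
  <thead>
    <tr>
      <th scope="col">Event</th>
      <th scope="col">Early registration</th>
      <th scope="col">Regular registration</th>
    </tr>
  </thead>
  <tfoot>
    <tr>
      <td colspan="3">All prices are in US dollars.</td>
    </tr>
  </tfoot>
  <tbody>
    <tr>
      <!-- Row headers live in the tbody and use scope="row". -->
      <th scope="row">Workshop</th>
      <td>100</td>
      <td>150</td>
    </tr>
    <tr>
      <th scope="row">Conference</th>
      <td>400</td>
      <td>500</td>
    </tr>
  </tbody>
</table>
```

With scope on every th, a simple table like this has no need for the headers attribute.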
Separator and “Other” Elements
br
- Generates a line break. The semantics of a line break are a commonly debated point; you can read my views in my article “Is a br tag semantic?”
hr
- Separates two sections with a visible horizontal line. Although this element conveys no specific semantic meaning which is not conveyed by other elements, it provides the advantage of a visual separator between sections when styles are disabled which is otherwise unavailable. I’m not aware of any advantages for other scenarios.
Discouraged (Presentational) Elements
These elements have not been deprecated; but should generally only be used after careful consideration.
Is it semantic, or is it presentational? This can be a more difficult question than it initially appears. Take b. Presentationally, it renders text as bold. Semantically, it provides no specific emphasis or other specific meaning. Does this mean that it should never be used? Not clearly. Although it’s difficult to describe scenarios in which these elements are useful, if you assume a scenario in which you want bold text but do not want that text to receive additional emphasis, it makes more sense to use b than it does to use span and style it to be bold.
Regardless, these are not elements that should generally be used without careful consideration that they are, in fact, the best choice for the job. But it’s your call.
Deprecated Elements
applet
center
font
dir
isindex
menu
s
strike
u
Not all deprecated elements are created equal. I find it ironic that strike and u are set right alongside font and isindex. Thinking logically, strike and u are very much in the same vein as b and i. Presentational, but perhaps necessary in some contexts.
Nonetheless, there’s no way I’m going to recommend the use of deprecated elements. Find another way!
If you want to see these elements in action, you may find my semantic HTML graphing tool interesting.
I’ve seen a lot of articles discussing the importance of HTML and XHTML semantics. I’ve seen articles describing what it means for a document to be semantic. Most of these articles, however, don’t provide a serious overview of what HTML elements actually may be considered semantic — and what those semantic elements actually mean.
And, even more particularly, why it matters.
Semantics is an erudite area of study. Literally, semantics can be fairly defined as the study of meaning in communication. Communication can readily be extended to cover symbolic notations, representations of language, organization of language, body language and information structures. In developing a web page, we are organizing a means to communicate the content of that page: ideally, we are organizing the page in such a manner that it will be understood regardless of the method by which the page is accessed. It should be equally understandable whether seen, heard, or felt.
The semantics of HTML structure, then, are clearly an important part of web design. Sending mixed signals to the user agent or the user by using a blockquote purely for its native indentation is an abuse of semantics: even the visual impact is dependent on the assumption that user agents will consistently render a blockquote in an indented manner.
It’s not precisely an issue that you’ve used a semantic element for presentational means, because, in fact, you’ve done more than that: you’ve presented a block of text which is not quoted material as if it were.
Semantic elements of HTML carry meaning regardless of your knowledge of that meaning. The result is that the misuse of an element creates the potential to mislead or confuse an end-user.
The most obvious examples in common use are those where an element with semantic meaning is chosen purely for the default presentation browsers give it. The blockquote example above is not uncommon; similar abuses include empty p elements used to create extra white space, and heading elements substituted for normal paragraphs as a questionable SEO technique.
Other examples which bear mentioning include the use of empty anchor elements to trigger Javascript events — in this case, it’s partially a limitation of the identity of an anchor element, but an empty anchor element should always be considered an error, as it results in a behavior-less anchor if the Javascript is not available.
Now, you may point to the following paragraph, from the HTML 4.01 specifications, as a response to my opinion:
Authors may also create an A element that specifies no anchors, i.e., that doesn’t specify href, name, or id. Values for these attributes may be set at a later time through scripts.
The fact that it is allowed by the specification does not make it a best practice. With all due respect to the W3C, this should not be permitted. For reference, the HTML 5 specification currently reads:
If the a element has no href attribute, then the element is a placeholder for where a link might otherwise have been placed, if it had been relevant.
In addition, although I won’t quote everything, the specification states that an anchor which does have the href attribute must specify a URI as the value of that attribute. It appears to essentially state that an anchor element should have no semantic meaning if the href attribute is not set and valid. But I could be wrong.
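To illustrate the difference, here is a sketch contrasting an anchor with no href against one that keeps a real destination and is enhanced by script. The showHelp function, the help-link id, and the /help/ URL are hypothetical names used only for this example.

```html
<!-- Avoid: with no href, this does nothing when Javascript is unavailable.
     showHelp() is a hypothetical function assumed to exist elsewhere. -->
<a onclick="showHelp();">Help</a>

<!-- Prefer: a real destination, enhanced by script when it is available -->
<a id="help-link" href="/help/">Help</a>
<script type="text/javascript">
  // Open the help page in a small window when scripting is available.
  var helpLink = document.getElementById('help-link');
  helpLink.onclick = function () {
    var popup = window.open(this.href, 'help', 'width=400,height=300');
    // If the window could not be opened, return true so that the
    // browser follows the link normally.
    return !popup;
  };
</script>
```

Without scripting, the second link is still an ordinary link to the help page, so the anchor keeps its meaning either way.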
The best means to avoid the misuse of elements is to have a clear understanding of when and why a given element should be used in web development. To hopefully expand on your knowledge in that respect, I’m attempting to provide a semantic guide to HTML elements for your reference and rich disagreement.
Be aware, however, that semantics are largely a matter of opinion. It’s not a question of blindly following the guidelines set by a group; it’s a question of interpreting those guidelines to the best of your ability and belief. This guide reflects how I think HTML elements should be used; and I welcome your opinions.
Other HTML Semantics Articles
In October of 2006, I published a brief article about Marcel Salathé’s interesting Java Applet to generate node graphs of web page structure. In that article, I stated:
I’d love to be able to produce graphs where I chose the color coding pattern for particular tags. I could set all non-semantic tags to be bright red, to easily spot the condition of a site in that respect. I could focus my attentions on inline versus block elements, or I could differentiate between different levels of headings.
More recently, I received comments on that post from a visitor who thought my idea to change this was a good one — so, at long last, I’ve gotten around to doing it.
Semantic HTML Graphs
And here’s an example of output:

The graph pictured here is for Metrolinx, the Greater Toronto Transit Authority — and Joe Clark’s failed redesign of the year. It makes for a pretty interesting case study. I know the output is small; but bear with me.
In this graph, you can clearly see long strings of orange nodes, which indicate nested table elements. You can also see significant clusters of bright red nodes, indicating deprecated tags. Altogether, the site is a maze of primarily long wavelength colors. In general, in the color scheme I’ve set up, greater densities of long wavelength colors (red, orange, pink) show a dependence on tables for layout and on presentational elements. Short wavelength colors (blue, green) indicate more semantically meaningful structures.
I made a number of small changes to the script which I think add value. First, I added the ability to change the root node you’re mapping. I don’t know that this is incredibly valuable, but it does provide an interesting alternate piece of information. The node switching is limited; it will only check the first node specified of that particular content type.
The second change is to provide a variety of color schemes. The default is pretty complicated, although I drew the line well before attempting to provide a subtly different shade for every single element. I hope that the colors provided at least give you an idea of what you’re looking at, however. The alternate color schemes (two, at the moment) are much simpler: one which simply differentiates between allowed and deprecated elements and another which highlights all inline elements (a, dfn, samp, etc.).
Now, I’ve never programmed in Java before, and although the changes I made to Sala’s source code are relatively slight, it’s highly probable that there are bugs; and I’ve certainly not managed to remove any bugs from the original code.
The last thing I need to mention is concerning the accessibility of this applet. It’s just not accessible. In fact, I know little about how to make Java accessible in the first place; but even so, the entire concept of this applet is highly dependent on color. There can be no question that if you are color-blind or otherwise sight impaired this will be a problem. Additionally, there is absolutely no means present for any screen-reader to understand the input. I do hope to change this at a later date, and author a text-based output which will provide a separate, accessible interface with the information, but that just hasn’t happened yet.
Also worth looking at:
- Validation Graphs – a stand-alone Java application also based on the HTML graph script which spiders pages and checks them for validity.
- Web2DNA – same basic idea, different implementation.