Semantic formatting for interviews

January 8, 2007

Topics: Semantics, Web standards.

This is a question which came up once upon a time at Cre8asite Forums, and it intrigued me. An interview, next to a standard article format, is possibly the most common type of written document on the web. Interviews pretty much always follow a very straightforward sequence: question and answer. An interview may have one interviewer with one interviewee, or may have several of each. In either case, the markup should be able to support some form of citation to identify the party currently speaking.

There is no perfect semantic solution for formatting an interview, I’m afraid.

Some of the most commonly used choices include heading tags and paragraphs, definition lists, or simple sequences of paragraphs. None of these really supports the characteristics necessary for an ideal interview.

So we need to fake it, or create something new. I don’t really want to get into what’s required to add custom elements to a custom DTD. Sure, you can write your own XHTML document type definition and create an interview element – but I don’t think that’s really the best solution. What value does a beautiful semantic element have if no user agent will recognize it?

Perhaps a microformat would be a good idea. In fact, one has been discussed, although the idea hasn’t gotten very far. The discussion has mostly leaned toward implementing it with a definition list. Lacking anything better, that’s probably the best choice. I’m not up to writing a microformat today, I’m afraid, but I’m going to force you all to suffer through my contemplation of how an interview might best be organized anyhow.

A definition list consists of three parts: the list block, <dl>; a "term", <dt>; and its "definition", <dd>, the block of text associated with that term. The main reason that this format is flawed for an interview or for dialogue, in my mind, is the lack of any semantic means to identify the speaker.

An interview might be written like this:


<dl>
<dt>What do you think about questions?</dt>
<dd>Well, I'm not really interested.</dd>
<dd>Neither am I, really.</dd>
<dd>That's interesting!</dd>
</dl>

You can see that it’s very difficult to say who’s speaking. If you were to assume that any given interview only involved two people and that they exchanged a single question for one or more answers, the definition list works very effectively. However, when you consider the possibility of multiple individuals being interviewed and responses from the interviewer which are not questions, things may get a little complicated…

It’s a simple matter to label the speakers in plain HTML text, to make it clear who is speaking, of course. That’s really not the point, however – when I’m talking about semantic formatting, I’m specifically trying to get at the most machine-readable manner of identifying a format. XML accomplishes this quite easily:


<interview>
<question num="1">
<interviewer type="question" name="Joe Dolson">What do you think about questions?</interviewer>
<responder type="response" name="Evil Twin">Well, I'm not really interested.</responder>
<responder type="response" name="Split Personality">Neither am I, really.</responder>
<interviewer type="response" name="Joe Dolson">That's interesting!</interviewer>
</question>
</interview>

Here, the markup specifies the sequence of the questions, who is speaking, and whether each speaker is asking a question or responding to one. It could easily carry even more information.
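As a rough illustration of what that machine readability buys you, here is a minimal sketch that reads the sample XML and lists each turn with its speaker. It assumes Python and its standard `xml.etree.ElementTree` module, neither of which is part of this article, and the function name `list_turns` is invented purely for the example.

```python
import xml.etree.ElementTree as ET

# The sample interview markup from the article, unchanged.
INTERVIEW_XML = """
<interview>
<question num="1">
<interviewer type="question" name="Joe Dolson">What do you think about questions?</interviewer>
<responder type="response" name="Evil Twin">Well, I'm not really interested.</responder>
<responder type="response" name="Split Personality">Neither am I, really.</responder>
<interviewer type="response" name="Joe Dolson">That's interesting!</interviewer>
</question>
</interview>
"""


def list_turns(xml_text):
    """Return one (question number, role, type, speaker, text) tuple per turn.

    Hypothetical helper: the name and return shape are illustration only.
    """
    root = ET.fromstring(xml_text)
    turns = []
    for question in root.findall("question"):
        # Iterating an element yields its child elements in document order.
        for turn in question:
            turns.append((
                question.get("num"),  # sequence of the question
                turn.tag,             # interviewer or responder
                turn.get("type"),     # question or response
                turn.get("name"),     # who is speaking
                turn.text,            # what was said
            ))
    return turns


for num, role, kind, speaker, text in list_turns(INTERVIEW_XML):
    print(f"Q{num} [{role}/{kind}] {speaker}: {text}")
```

Nothing in the loop has to guess who is speaking: the speaker, role, and turn type all come straight from the attributes, which is exactly what the plain definition list cannot offer.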

Sadly, HTML does not let you define your own elements and attributes quite so easily! If you were to write your interviews out in XML and then use XSLT to transform the text into HTML on the fly, you could leave the information accessible to any device with the ability to process XML while providing a clean layout for other agents – at least, for agents which support XSLT.
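The transform step might look something like the sketch below. To be clear about the assumptions: this is not XSLT. Python's standard library has no XSLT processor, so the sketch stands in for the stylesheet with `xml.etree.ElementTree`. The function name `interview_to_dl`, the choice to map question turns to `<dt>` and every other turn to `<dd>`, and the use of `<cite>` for the speaker's name are all illustrative guesses rather than an established convention.

```python
import xml.etree.ElementTree as ET

# The sample interview markup from the article, unchanged.
INTERVIEW_XML = """
<interview>
<question num="1">
<interviewer type="question" name="Joe Dolson">What do you think about questions?</interviewer>
<responder type="response" name="Evil Twin">Well, I'm not really interested.</responder>
<responder type="response" name="Split Personality">Neither am I, really.</responder>
<interviewer type="response" name="Joe Dolson">That's interesting!</interviewer>
</question>
</interview>
"""


def interview_to_dl(xml_text):
    """Turn the interview XML into an HTML definition list (as a string).

    Hypothetical helper standing in for an XSLT stylesheet.
    """
    root = ET.fromstring(xml_text)
    dl = ET.Element("dl")
    for question in root.findall("question"):
        for turn in question:
            # Assumption: questions become terms, all other turns descriptions.
            tag = "dt" if turn.get("type") == "question" else "dd"
            item = ET.SubElement(dl, tag)
            # Assumption: the speaker's name goes in a leading <cite>.
            cite = ET.SubElement(item, "cite")
            cite.text = turn.get("name")
            # The spoken text follows the <cite> inside the same item.
            cite.tail = ": " + (turn.text or "")
    return ET.tostring(dl, encoding="unicode")


print(interview_to_dl(INTERVIEW_XML))
```

The XML stays the source of record, and the generated list at least labels every speaker, even though the `<dl>` itself still has no native notion of who is talking.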

What it comes right down to is that, in my opinion, the lack of any format which can fully support the semantic needs of an interview or dialogue means that you shouldn’t spend too much effort worrying about how to write them out. I certainly wouldn’t recommend just using paragraphs: some manner of differentiating questions and answers is still pretty important. But the HTML and XHTML specifications don’t provide anything which is really suitable.

Normal use of headings and paragraphs may well be sufficient – a definition list could be better for some interviews, but for others could simply be less effective.

4 Comments to “Semantic formatting for interviews”

  1. I think definition lists are fine for simple interviews – it’s when it comes to multi-person conversations that I think they’re weak. They don’t really support the threading that’s necessary for more complicated interviews.

    I’ll admit, though, that this article could be better written. I missed some obvious possibilities and didn’t always make my point that clear…

  2. I have run a couple of interviews and used dl/dt/dd there. It seems quite reasonable and semantically correct. If you have read the DL specification, you’d see that its use is fairly broad and is not limited to simple term/def lists.

  3. Hmmm. That’s a damn good point. For some reason I never think of nesting block level elements – tend to forget you can do that.

    I think I’ll revise this article, actually…

  4. Curious, but wouldn’t employing blockquotes with a leading cite within the answering DD be even better?