June 24, 2008

Spam vs. Accessibility

The whole world of spam is an accessibility nightmare. The concept behind web accessibility is to ensure that users can access the complete functionality of your web site — but how do you cope with the fact that spambots will happily take advantage of any hole you leave?

Comment forms, contact pages, email addresses and enrollment forms. All methods of giving critical access to previously unidentified users — and all in positions where you just need to find that crucial differentiation between real people and robots.

When you’re talking about functionality which is locked behind a log-in form, there’s not really a huge amount of trouble in defining the security/accessibility conundrum. Require a good, secure password and you’re pretty safe. People with disabilities, for the most part, can use a password field just as effectively as anybody else. Once you’re behind that iron curtain, you can usually stop worrying about the distinction: everybody who has access to your private functionality is a known user. They’ve identified themselves, provided credentials which grant them a certain degree of access, and you can stop worrying about them.

But your front door can be a big problem.

You need to create a doorway which will allow visitors you don’t already know to reach you. They need to be able to contact you in order to initiate business, or enroll in your program, or at least create an account with your site. It’s therefore absolutely critical that you create a form which can be accessed by anybody.

But you still only want people using your form. Robot visitors rarely pay the enrollment fee, so they’re not exactly welcome visitors in every area of your site. You certainly don’t want to be thanking them for contacting you with an offer to enlarge your anatomy!

Spam protection and accessibility have inherent conflicts of interest: the former goal attempts to prevent a form from being used, the latter promotes it. The two goals aren’t actually antipathetic to each other, but getting them to work collaboratively does require a detailed understanding of what the issues are.

Stopping the Robots

One of the most common solutions to the spam problem is to present a problem which a computer can’t solve. The most obvious solutions (pictures of animals, pictures of people, etc.) are inherently flawed because they require specific pieces of information in order to solve. They’ll require correct spelling in the correct language with knowledge of the subject depicted. Although most visitors may be able to identify an elephant, some visitors will inevitably (and correctly) identify it as an elefant.

Presumed knowledge is a barrier to both humans and computers.

This is what has led to the numerous garishly blurred and colored text images you’ve undoubtedly had to interpret. Computers can use character recognition to examine images and identify the text, so the presentation is warped to decrease the likelihood of recognition. Of course, this also decreases the likelihood that humans will be able to read the image. Humans with disabilities? No chance. Either you include an alt attribute, making the solution trivial for a computer, or you leave it out — making the solution impossible for somebody with a visual disability.

Thus was born the audio CAPTCHA. However, audio CAPTCHA requires specific technology — an audio format must be chosen, and an audio player provided. Additionally, computers are capable of recognizing audio excerpts in much the same way they can recognize images. As a result, the audio output is distorted. I’ve listened to audio CAPTCHAs, and all I can say is that I hope others have better luck than I do. I’ve never passed one.

And, of course, neither of these methods will provide access for anybody who is both hearing and visually impaired.

There are numerous other examples of attempts at accessible CAPTCHAs. Most of them depend on the fact that while robots may be text-aware, they are not necessarily capable of following instructions provided in text. Simple question & answer bot-blocking techniques like:

  1. Write “human” in the field below.
  2. What is 3 + 4?
  3. Is fire hot or cold?

These simple questions can slow spam — these can be considered generic spam prevention methods. They will stop almost all spam which is not specifically targeted at the form. However, if any programmer decides that they want to write a bot to attack your site, it is a trivial problem. Simply put, these kinds of questions generate security through obscurity.
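As a rough illustration of how little is involved, here is a minimal sketch of a server-side check for questions like the ones above. The challenge ids, the accepted-answer sets and the function name are my own assumptions for the example, not taken from any particular contact form.

```python
# Hypothetical question pool built from the three example questions above.
# Each entry maps an assumed challenge id to (question text, accepted answers).
CHALLENGES = {
    "human": ('Write "human" in the field below.', {"human"}),
    "sum": ("What is 3 + 4?", {"7", "seven"}),
    "fire": ("Is fire hot or cold?", {"hot"}),
}


def check_answer(challenge_id, submitted):
    """Return True when the submitted text matches an accepted answer."""
    entry = CHALLENGES.get(challenge_id)
    if entry is None:
        # Unknown challenge id: treat the submission as failed.
        return False
    _question, accepted = entry
    # Normalize case and surrounding whitespace so " Hot " passes like "hot".
    return submitted.strip().lower() in accepted
```

A bot written specifically for this form needs nothing more than its own copy of that three-entry table, which is exactly why this counts as security through obscurity.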

A second class of bot-blocking techniques is found in more complex question & answer sets:

  1. Write “red” in the 2nd text field on the left.
  2. Enter your name in the 3rd row, 2nd column.

These programmatically variable questions may also slow a bot, but can also be incredibly challenging, if not impossible, for a human visitor who is not using a visual browser with an output equivalent to the instructions.

Tricking the Robots

Now, robots aren’t terribly intelligent. Usually, their decision making skills are fairly limited. As such, it’s not terribly difficult to simply deceive them. These methods may have some effectiveness at slowing down bots:

  1. Required selections on option menus. Not that a specific option is required — just anything available in the menu.
  2. Honeypots: fields which should not be filled in, but probably will be by your average bot in its quest to cover all its options.
  3. Limited length fields: if you set this client-side, using the HTML maxlength attribute, a bot can easily limit its own input. However, if you set it server-side (at a safe margin for real users) you can stop a few bots which get over-eager.
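The three checks in the list above can be sketched as a single server-side test. This is only an illustration: the field names (`website`, `topic`, `message`), the menu options and the length limit are assumptions I've made up for the example.

```python
# All names and limits below are hypothetical, chosen for illustration only.
HONEYPOT_FIELD = "website"      # field humans are told to leave empty
ALLOWED_TOPICS = {"sales", "support", "other"}  # options in the select menu
MAX_MESSAGE_LENGTH = 5000       # server-side limit with a margin for real users


def looks_like_bot(form):
    """Return True when a submitted form trips any of the three checks."""
    # 1. Honeypot: any content in the trap field is a bot signal.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # 2. Required selection: any real menu option is fine, but one must be sent.
    if form.get("topic") not in ALLOWED_TOPICS:
        return True
    # 3. Length limit enforced on the server, not via the maxlength attribute.
    if len(form.get("message", "")) > MAX_MESSAGE_LENGTH:
        return True
    return False
```

If the honeypot is hidden visually, it still needs a label telling people to leave it blank, since a screen reader may well announce the field.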

Mike Cherim has valuable tips on these techniques in his article Protecting Forms from Spam ‘Bots, so I’m not going to elaborate on these points excessively. Again, however, these are all valuable methods within the “security through obscurity” school of protection — no serious protection against a motivated spammer.

Mike’s secure and accessible contact form makes use of a wide variety of techniques and provides thorough accessibility, so if you’re looking for a simple contact form which will block generic spam, it’s a great option.

Behavior Detection

This is a complicated area, which I’m not going to delve into in any significant detail. Primarily because I’m not really qualified. However, it’s an important category of spam control, so it’s worth an overview.

The principle of behavior detection is based on one core observation: bots don’t behave like people. People are, for the most part, a complex blend of random behavior and systematic exploration. Bots are generally much more absolute. When you observe a web site “user” visit every single navigable page of your site at 30 second intervals, that user is clearly not human.

Although the actual interpretation is significantly more complicated, the challenge is simple: look for patterns. If a user’s time on a site matches a mathematical pattern, that’s a signal. The Bad Behavior package works (at least partially) on this general logic: search for indications about the user or user-agent and identify signals which suggest non-human activity.
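To make the "look for patterns" idea concrete, here is a minimal sketch that flags a visitor whose requests arrive at nearly constant intervals. It is not how Bad Behavior works; the function name and both thresholds are assumptions picked for the example.

```python
from statistics import mean, pstdev


def is_suspiciously_regular(timestamps, min_requests=5, max_jitter=0.05):
    """Flag request times (seconds, ascending) that are almost evenly spaced.

    min_requests and max_jitter are assumed thresholds, not tuned values.
    """
    if len(timestamps) < min_requests:
        # Too little data to call anything a pattern.
        return False
    # Gaps between consecutive requests.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    average = mean(gaps)
    if average <= 0:
        # Every request arrived at the same instant.
        return True
    # Relative spread of the gaps: near zero means clockwork timing.
    return pstdev(gaps) / average < max_jitter
```

A visitor requesting a page every 30 seconds exactly would be flagged; someone who lingers on one page and skims the next would not. A real system would weigh this as one signal among many rather than blocking on it alone.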

Requiring Specific Capabilities

Some spam solutions require specific capabilities from the visitor before allowing them to make contact. The WordPress comment spam plugin WP-Spamfree takes this strategy. The first layer of protection for this plugin is to require that any visitor trying to submit a comment have both JavaScript and cookies enabled.

Immediately, this strategy eliminates the vast majority of bots — and a small minority of humans.
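One way such a requirement could be checked on the server is sketched below. This is not WP-Spamfree's actual code. It assumes a script on the page copies a server-issued token into both a cookie and a hidden form field, and the names `js_check` and `SECRET_KEY` are hypothetical.

```python
import hashlib
import hmac

# Placeholder secret; a real deployment would load this from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def expected_token(session_id):
    """Derive the token the page's script is expected to echo back."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def passed_capability_check(session_id, cookies, form):
    """True only if both the cookie and the hidden field carry the token.

    A client without JavaScript never sets either value; a client without
    cookies cannot send the cookie copy back.
    """
    token = expected_token(session_id).encode("utf-8")
    cookie_value = cookies.get("js_check", "").encode("utf-8")
    field_value = form.get("js_check", "").encode("utf-8")
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(cookie_value, token) and hmac.compare_digest(field_value, token)
```

The same check that stops a simple bot also stops any person browsing with scripts or cookies turned off, which is the small minority of humans mentioned above.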

Conclusion

I’m not aware that there’s any solution which has 100% success at differentiating humans from bots. Any barrier put in place to spam will also create a barrier for somebody. However, this is a decision that must be made for any site: when you’re receiving thousands of spam messages a day through an insecure contact form, is it better to stop the occasional human or massively reduce your daily spam-killing time commitment?

Ultimately, there isn’t a real answer. Spam is too great of an issue to simply ignore. However, any time you create a CAPTCHA — of any sort — just remember this: provide an alternative. If you provide a phone number to those who have failed your little test, they may be able to reach you. If somebody needs to reach you, make it possible: even if they’ll have to write you a letter in order to post a comment on your blog.

Filed under Accessibility, Usability by Joe Dolson

March 29, 2008

What is “Cross-browser compatibility?”

Here’s the first clue: it’s not creating a pixel-perfect replication of your ideal version of a site in all browsers.

In fact, cross-browser compatibility ultimately has very little to do with what a web site looks like, and a lot more to do with how it functions. It also has relatively little to do with browsers, and perhaps could better be explained as multiple user-agent compatibility.

“Compatibility” (in this context) is not a term which means “looks and behaves identically”; instead, it may be better described as “performs equivalently under alternative conditions.” But developers and designers tend to most immediately seize upon appearance as the guiding line for cross-browser compatibility.

Of course, let’s be honest: there are a lot of very good reasons for this. Completely disregarding what we may know about the behavior of a site, clients tend to be very visually oriented. They pop their new site open at home one day during development and notice a whole variety of differences which they’re suddenly concerned about. If you’re lucky, they’re opening up Internet Explorer 6 after you’ve gone through the painstaking process of correcting its inability to cope with standards-compliant code, rather than before you’ve gotten around to it. That can be awkward…

Another good reason is that despite what I’ve stated above, making the design behave more-or-less identically between different browsers is actually quite desirable. From a usability perspective, it helps a great deal when the way a site behaves doesn’t change as you move between user-agents. If you’ve ever tried to guide somebody through using a website which delivers a different experience to their browser than to yours, you are intimately familiar with one reason why inconsistency is a very bad idea.

But the absolute key to cross-browser compatibility is simply functionality. A lack of cross-browser compatibility doesn’t mean that something looks different; it means that it doesn’t work.

And a good thing, too. Otherwise, compatibility would be pretty well impossible between desktop browsers and mobile browsers. ;)

With web design, it’s occasionally entirely possible to make two browsers render a design exactly the same…if you assume certain factors will remain constant, such as the user settings described in my last post. If any of those have been changed, everything pretty well goes out the window. As desirable as it is to make your designs look as similar as possible between the various desktop browsers, it always has to be acknowledged that there are limits.

There’s nothing at all that you can do to actually guarantee the same view for everybody; instead, you need to guarantee an equivalent view for everybody. Equivalent in that they will be able to get the same information and use the functions of the site to perform the same actions.

Filed under Browsers, Usability, Web Development by Joe Dolson

January 12, 2008

Usability and Trust

Without both, it’s very difficult to have a successful online business. Unusable web sites have an incredible ability to generate a lack of trust in the business. As soon as one feature fails to work correctly, or doesn’t behave as you expect, there’s an immediate connection made:

“If they can’t get this right, what else might they have problems with?”

Will they lose your financial data? Will they ship you the right product? Will they bill you the right amount of shipping? What are they going to do with your private information?

It’s hard to fully trust a website which gets in your way when you’re trying to perform basic tasks. The above questions may come up as reactions to pretty severe site problems, such as incorrect product data or frightening error messages, such as this one:

Okay, I've gone to read my mail 2x and gotten a frightening red box that says: You cannot do that. This action is being recorded.

“You cannot do that. This action is being recorded.”

Yikes! Not really an ideal situation. Now, having written error messages before, I can imagine what was meant, which might be better stated like this:

“You may not perform that action. We have logged the error and will work to take care of any problems!”

There are a couple of important differences between those statements.

First, there’s the tense of the statement: we are currently recording vs. we have recorded. The first leaves an ongoing implication that your actions are being monitored which may be a bit disturbing.

Second, we have the indication of what has been recorded. In the first case, it sounds like the system is recording your actions. The second message clearly states that the information recorded was the error which occurred, and assures you that the problem will be worked on.

Maintaining trust in your application depends on good data, clear and non-threatening error messages, and clear task pathways. If your task paths aren’t clear, you may lose users due to sheer confusion. If you aren’t checking your data and perfecting your error messages (and all other responses, of course!) you may lose the visitor’s trust that you’ve really got their needs in mind.

Filed under Usability, Web Development by Joe Dolson

January 10, 2008

Usability Issues with Domain Management

Working as a web developer, I find myself dealing with a lot of different domain registrars, hosting services, etc. It’s inevitable. It’s also not the slightest bit uncommon to run across one very specific usability inconvenience in how these companies manage their services. (Not all of them, but enough that it’s irritating.)

This specific problem is that when you’re managing domains, some of these services handle multiple-domain management in the following manner:

  1. Select the action you wish to perform.
  2. Select the domain you wish to change.
  3. Rinse and repeat.

It should be readily apparent what the problem is: choosing the action prior to choosing the domain is an extremely ineffective way of making a large number of changes to a specific domain.

Now, the way I tend to work (and I don’t see any great likelihood that this will change) is to focus on a particular site and do everything I need to do on that site in one working session.

End result: if I need to make, say, five changes to a domain, I need to take 10 individual actions. If I selected the domain, and then performed a variety of actions on that domain, I could easily reduce this to only 6 actions.
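The arithmetic generalizes easily. In this small sketch (the function names are mine), the action-first workflow costs two selections per change, while the domain-first workflow costs one selection for the domain plus one per change.

```python
def actions_action_first(changes):
    """Pick the action, then pick the domain, once for every change."""
    return 2 * changes


def actions_domain_first(changes):
    """Pick the domain once, then pick one action per change."""
    return 1 + changes


# The five-change example from the paragraph above.
five_changes = (actions_action_first(5), actions_domain_first(5))
```

For five changes that gives 10 versus 6, and the gap widens by one action for every additional change.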

Even if I needed to work a different way, such as making the same change to a large number of domains, this still wouldn’t be an efficient approach. That job would be best handled by allowing selection of multiple domains for simultaneous changes.

At any rate, if you happen to be a large company which manages hosting and/or registration of domains, don’t set up your management interface like this. It’s annoying.

End rant.

Filed under Usability, Web Development by Joe Dolson

December 18, 2007

Following User Navigation Paths

An interesting thread at Cre8asiteforums, titled “When lots of your visitors go straight to search?”, discusses a member’s curiosity about navigation patterns after noticing that a significant percentage of his visitors (25%) go directly to search after arriving at his site.

It’s an interesting element of site navigation to investigate, and the thread raises a pretty significant number of additional questions worth analyzing.

The navigation path of any given user will be fundamentally unique. However, when taken in aggregate, navigation paths begin to suggest a lot about your navigation structures. The percentage of visitors who immediately jump into a site search, however, suggests a very different thought process.

On sites which I visit frequently, for example, I generally have a very set system for finding information. If the site has a good search, then I may use the search. If they have a very clear navigation, I may use that, instead. If they have neither, it’s unlikely that I visit the site frequently…but if I do, I generally have separate bookmarks to the individual features which I actually use.

And that raises a separate question: if a site has difficult navigation and inferior search, what would drive you to actually visit it frequently? For me, the site has to offer some specific information or functionality that I simply can’t get elsewhere. Otherwise, there’s simply no justification for the challenge of using the site.

You can learn a lot about the effectiveness of site navigation by following analytics data. Knowing that most users who use your search feature fail to find what they’re looking for, for example, should suggest that this is a feature of your site which needs work. Finding that users frequently enter several sections of your site before finding the right information can be significant as well; it suggests that you need to rethink the way your site categories/sections are organized.

So…important question, then: HOW do you follow this? Where do you get this information?

You’re not going to get meaningful user information from standard statistics packages like AWStats or Webalizer. You need to use some tool which will provide a means to track the path of specific users. This can be parsed from your server logs. A high-end statistics package such as Clicktracks will give you user path data. There are a number of other services which can provide this information (and, if you know what you’re doing, you’ve already got the information).
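For anybody curious what parsing paths from server logs might look like, here is a minimal sketch. It assumes the log is in the Common Log Format, that entries are in chronological order, that one IP address equals one visitor (a crude approximation), and that the site search lives under a hypothetical `/search` URL.

```python
import re
from collections import defaultdict

# Matches the start of a Common Log Format line: host, identity, user,
# [timestamp], "METHOD path protocol", status.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def visitor_paths(lines):
    """Group requested paths by host, in the order they appear in the log."""
    paths = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("method") == "GET":
            paths[match.group("host")].append(match.group("path"))
    return dict(paths)


def share_straight_to_search(paths, search_prefix="/search"):
    """Fraction of visitors whose second request was the site search."""
    if not paths:
        return 0.0
    straight = sum(
        1 for visited in paths.values()
        if len(visited) >= 2 and visited[1].startswith(search_prefix)
    )
    return straight / len(paths)
```

A real log also records images, stylesheets and crawler traffic, so in practice those requests would need filtering out before the numbers mean much.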

I’m not really an expert on analytics packages, of course. If you want a lot of detailed information about web analytics and analytics packages, here are a few resources:

Filed under Usability, Web Development by Joe Dolson

December 5, 2007

Thoughts about Content Labeling and Data

An interesting thought in indexing and handling page structure is the concept that different areas of a single page can be identified and considered independently from surrounding bodies of content. This particularly applies to specific and readily identifiable data-types, such as phone numbers, postal codes, or abbreviations; but can also be extended to include broader content labeling.

A well-structured XML document has an absolutely clear labeling system for data built into the structure. If you take any RSS feed, for example, the elements which identify <title>, <link> or <managingEditor> can’t readily be mistaken.
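As a small illustration of how unambiguous that labeling is, the sketch below pulls those three elements out of a feed using Python's standard library. The feed content is made up for the example, and the function name is my own.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal RSS 2.0 document used only for illustration.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <managingEditor>editor@example.com (Example Editor)</managingEditor>
  </channel>
</rss>"""


def channel_fields(feed_xml):
    """Return the channel's title, link and managingEditor by element name."""
    channel = ET.fromstring(feed_xml).find("channel")
    return {
        name: channel.findtext(name)
        for name in ("title", "link", "managingEditor")
    }


fields = channel_fields(FEED)
```

Each value is retrieved purely by the name of the element which holds it; no guessing about which block of text is the title is required.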

A well-structured, semantically sensible XHTML or HTML document doesn’t offer nearly the same degree of data particulation. The higher level data elements can sometimes be fairly clear, as is the case with <address> or <cite> elements, but other potentially valuable elements end up providing relatively neutral value: <h2> or <div>.

Read more: Thoughts about Content Labeling and Data

Filed under Accessibility, Semantics, Usability by Joe Dolson

November 29, 2007

On the usability of contextual URLs

 Example:

Visit this site! http://www.joedolson.com/

I run into this, or into something like it all the time, and it’s pretty understandable why. Obviously, if you don’t know how to create a hyperlink, or if you’re working with a CMS which will automatically convert a URL into a hyperlink, this is the most reliable way to provide access to somebody else’s site.

Either they have the URL, and can use it “straight up” if they know how, or they can follow the hyperlink generated by the system. Nice and easy. I understand perfectly well why an inexperienced content manager might make use of hyperlinks au naturel, so to speak.

Read more: On the usability of contextual URLs

Filed under Accessibility, Usability by Joe Dolson
