Although the idea of releasing a large quantity of unique search data for research purposes is admirable, the privacy issues raised by AOL's (America Online's) careless release are disconcerting. As is now widely known, AOL released search logs containing the searches of 658,000 users conducted over the course of three months. This is a fabulous resource for researchers investigating user habits and search marketing, but it is also an extremely invasive database of personally identifiable search paths and other personal data.
AOL has now removed the database and apologized for the error of judgement that allowed this information to become public, but this does little to relieve concerns about the privacy of those whose searches were released.
Although the usernames have been anonymized and replaced with numeric sequences, each sequence still ties together all the searches of a single user – which can easily provide everything necessary to make a personal identification. Many people, for example, perform vanity searches – if you observe that somebody has made a large number of searches for a particular name, this may mean they are actually that person. It may also mean that they know this person, or that they are stalking this person. Either way, this is a serious privacy concern!
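To make the linkage problem concrete, here is a minimal Python sketch of how little work it takes to rebuild a search trail from a numeric ID. Everything in it is an assumption for illustration: the rows are invented toy data (not taken from the AOL logs), the two-column layout is a simplification, and the helper names `group_by_user` and `repeated_phrases` are hypothetical.

```python
from collections import defaultdict

# Toy rows invented for illustration only: (anonymized_user_id, query).
# None of this comes from the real dataset.
LOG = [
    (1001, "jane q example"),
    (1001, "jane q example springfield"),
    (1001, "dentists in springfield"),
    (1002, "cheap flights"),
    (1001, "jane q example"),
    (1002, "weather"),
]


def group_by_user(rows):
    """Collect every query issued under each numeric ID, in log order."""
    trails = defaultdict(list)
    for user_id, query in rows:
        trails[user_id].append(query)
    return dict(trails)


def repeated_phrases(queries, min_count=2):
    """Return two-word phrases that occur in at least min_count queries."""
    counts = defaultdict(int)
    for query in queries:
        words = query.lower().split()
        # Use a set so a phrase is counted at most once per query.
        phrases = {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}
        for phrase in phrases:
            counts[phrase] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


trails = group_by_user(LOG)
for user_id, queries in trails.items():
    print(user_id, repeated_phrases(queries))
```

In this toy log, the phrases that keep recurring under one ID are exactly the pieces of a name, which is the vanity-search pattern described above. A real analysis would be messier, but the numeric ID does the hard part by keeping one person's queries together.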
In fact, some bloggers (prior to AOL’s removal of the dataset) had already identified some rather disconcerting query groups. Can any law enforcement body conceivably let this issue go without attempting to gain access to the information? The public availability of information this alarming may greatly weaken the courts’ resistance to the Department of Justice’s requests for private data. Although the courts have generally been supportive of privacy, this information could very easily sway a judge.
There’s a lot more information on this issue covered around the web: