I’ve talked about vertical search fairly extensively, but I’ve never really looked at one search model which is strikingly different – metasearch. Where a vertical search engine limits its results or index to a specific topical branch, a metasearch engine queries numerous separate search engines and compiles its results from what they return.
Now, these ideas are certainly not opposites. There is no reason at all that a metasearch engine can’t search along a vertical scheme, or that a vertical search tool couldn’t query multiple search engines.
Take Clusty, perhaps one of the most prominent metasearch engines. Clusty provides a wide variety of search options, including blog search, web search, image search, and job search. All of these separate tools are metasearches. Its web search queries Ask, MSN, the Open Directory, Looksmart, Gigablast, and Wisenut. Its blog search queries Blogdigger, Daypop, Feedster, Technorati, Blogpulse, and IceRocket. What could possibly be the advantage of this type of searching?
Metasearch engines have the ability to leverage the relevancy algorithms of other search engines to create a selection of combined search results with maximal relevancy. Instead of having a single (albeit complex) algorithm for search, they retrieve other engines’ results and identify similarities. If all queried engines return a particular website somewhere in their top 10 results, then this is very likely a highly relevant site. If only 60% of the engines returned it, it’s a bit less relevant. If only one engine did? Probably not that important.
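To make that idea concrete, here is a minimal sketch of agreement-based merging. It is not how Clusty or any other real metasearch engine works internally; the function name `merge_results`, the engine names, and the sample URLs are all made up for illustration, and a real engine would presumably weigh many more signals than a simple vote.

```python
from collections import defaultdict


def merge_results(results_by_engine, top_n=10):
    """Score each URL by the fraction of engines listing it in their top_n.

    results_by_engine maps a (hypothetical) engine name to its ranked URLs.
    Returns (url, agreement) pairs, highest agreement first.
    """
    votes = defaultdict(int)
    best_rank = {}
    for urls in results_by_engine.values():
        seen = set()  # count each URL at most once per engine
        for rank, url in enumerate(urls[:top_n]):
            if url in seen:
                continue
            seen.add(url)
            votes[url] += 1
            best_rank[url] = min(rank, best_rank.get(url, rank))

    total = len(results_by_engine)
    merged = [(url, votes[url] / total) for url in votes]
    # Ties on agreement are broken by the best position any engine gave the URL.
    merged.sort(key=lambda item: (-item[1], best_rank[item[0]], item[0]))
    return merged


# Invented sample data: five engines, three distinct sites.
sample = {
    "engine_a": ["example.org", "example.net", "spam.example"],
    "engine_b": ["example.net", "example.org"],
    "engine_c": ["example.org", "example.net"],
    "engine_d": ["example.org"],
    "engine_e": ["example.org"],
}

for url, agreement in merge_results(sample):
    print(f"{url}: {agreement:.0%}")
```

With this sample data, the site every engine returns comes out on top, the one returned by three of five engines lands in the middle, and the one that only a single engine returned sinks to the bottom.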
In short, metasearch provides a means to help weed out web spam. Seth Finkelstein provides an interesting anecdote about his own experience comparing results between Clusty and other non-metasearch engines with a particularly spam-loaded query. It’s clear that, at least in some searches, a metasearch engine can provide very high quality results.
A second point of interest which is common to metasearch engines (although not exclusive to them) is the concept of clustering. This practice, which is used by engines such as Clusty and by Kartoo (one of my personal favorites, just for style), allows a search to be further refined according to groups of topics. This is especially useful if you’re searching for a term with significant multiple purposes – such as "apple" (fruit or computer?) or for a common person’s name – "Joe Dolson" (web developer or college wrestler?).
All in all, metasearch provides a great deal of potential for relevancy analysis. As it is currently used (as far as I know), it is very much dependent on the existence of other well-developed search indices, which is somewhat of a disadvantage. However, there is a possible argument that the same technique could be used by a major player in search to apply multiple indices and multiple search algorithms within a single search. Recently, in a Cre8asite Forum post, Bill Slawski mentioned a new patent awarded to Microsoft which would enable them to easily switch between a variety of ranking algorithms. It seems like a logical extension of that idea to run multiple ranking algorithms simultaneously and use metasearch techniques to compile the results. The advantage of this approach could be the ability to apply radically, fundamentally disparate search algorithms to arrive at your final results.
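As a rough illustration of what compiling several simultaneous rankings might look like, here is a small sketch that fuses ranked lists with a Borda count. To be clear, this is my own stand-in technique and has nothing to do with what the Microsoft patent describes; the function name `fuse_rankings`, the three "algorithms", and the document names are all hypothetical.

```python
def fuse_rankings(rankings):
    """Combine several ranked lists of the same index with a Borda count.

    Each list awards a document more points the higher it is placed;
    the fused ranking orders documents by total points.
    """
    scores = {}
    for ranking in rankings:
        size = len(ranking)
        for position, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0) + (size - position)
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Three invented ranking algorithms ordering the same three documents.
by_links = ["doc_a", "doc_b", "doc_c"]
by_text = ["doc_b", "doc_a", "doc_c"]
by_freshness = ["doc_b", "doc_c", "doc_a"]

print(fuse_rankings([by_links, by_text, by_freshness]))
```

Unlike simple vote counting, this takes position into account, so a document that two very different algorithms both rank highly can beat one that a single algorithm happens to love.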
There are a lot of good blog posts around on Clusty and other issues of metasearch, so here are links to a few of them:
- Search Engine Watch – Reducing Information Overkill
- Metasearch the blogosphere with Clusty
- About Meta-Search Engines from UC Berkeley
- Metasearch Engine: Wikipedia
In addition, I’d like to provide links to a few of the many metasearch engines out there:
I’m sure there are more, and feel free to mention them in the comments if you have one you’d like added to the list!