With family and friends visiting for over two weeks now, and a weekend trip to boot, it’s been hard to keep up on industry news and interesting articles. Nevertheless, I’ve had plenty to read and pique my interest over the last five days. In fact, almost too much! There’s a good reason to write frequently in a blog: you’re not left looking at quite as vast an array of interesting subjects that you want to write about. In the last few hours I’ve read or become aware of over a dozen topics or articles which fascinate me, but I still have guests, and not really the time to write on them all!
So I’m picking a topic which, although perhaps not the most original, is certainly very close to home: MSN’s search engine ranking technology.
If you want to learn how Google works, at least in generalities, there are dozens of places to look. Just do a search on Google, and you’ll be served up a pretty good variety of information. (Although it’s worth noting that the first result for "How does google work" brings up their April Fool’s Day prank for 2002.) The basics of Google’s search technology are fairly well-known. But MSN is not as open about their methods.
Recently, there’ve been a couple of good articles published about MSN’s RankNet system – a new algorithm incorporated into MSN’s search technology.
Search Engines and Algorithms: Optimizing for MSN’s RankNet Technology, by Jennifer Sullivan Cassidy, discusses the technology and how to optimize for it. Beyond PageRank: Machine Learning for Static Ranking (PDF), a publication from MSN’s research department, addresses the subject in predictably greater depth.
Thankfully, Bill Slawski summarizes these documents for us. However, he focuses primarily on the principles and subjects of the paper rather than on the practical application to search engine optimization. This leaves some ground still available for discussion!
What is RankNet?
RankNet is, in essence, a "learning machine" that takes the patterns of human searches into account, and learns from them, in order to provide more relevant results the next time around. They start from a baseline of predictions made that are input into its neural net…[snip]…They make their predictions with supervised learning…
So it’s a little complicated. A couple of potentially applicable definitions of neural net include:
- Neural Network
- A type of statistical computer program which classifies large and complex data sets by grouping cases together in a way similar to the human brain. Used in data mining. (Audience Dialogue)
- A computational method for optimizing for a desired property based on previous learning cycles (training). (GenProMag)
- A member of a class of software that is “trained” by presenting it examples of input and the corresponding desired output. For example, the input might be a magnetic anomaly and the required output the depth to the source of that anomaly. Training might be conducted using synthetic data, iterating on the examples until satisfactory depth estimates are obtained. (Geop.itu.edu.tr)
For machine learning:
- Machine Learning
- The ability of a machine to improve its performance based on previous results. (University of Illinois at Urbana-Champaign)
- Subspecialty of artificial intelligence concerned with developing methods for software to learn from experience or extract knowledge from examples in a database. (Ahima.org)
- The ability of a program to learn from experience — that is, to modify its execution on the basis of newly acquired information. In bioinformatics, neural networks and Monte Carlo Markov Chains are well-known examples. (Nature.com)
The implication of this is that Microsoft is incorporating, at some level, a type of artificial intelligence in their search algorithms. This is a very reasonable idea, actually. Where Google bases their results largely on links (that is, on permanent votes for a website’s relevance and validity), MSN is working to apply on-the-fly learning to their ranking. A site may become more important because it is the link a searcher chose to visit.
MSN is incorporating several hundred (569, according to Cassidy) criteria into their ordering system. These criteria are not what is weighed when determining a document’s relevancy, but they are taken into consideration when choosing what properties of a document should be given greater weight in ranking.
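To make "learning which properties deserve more weight" a little more concrete, here is a minimal sketch of pairwise ranking. As far as I know, the published RankNet work trains on pairs of documents in which one is judged more relevant than the other, and pushes the preferred document's score above the other's with a logistic (cross-entropy) loss. Everything else here is my own assumption for illustration: the three feature names, the toy numbers, and the plain linear scorer standing in for the neural net. It is not MSN's implementation, and the weights it learns say nothing about MSN's real ones.

```python
import math
import random


def score(weights, features):
    """Linear score; a simple stand-in for the neural net's output."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, epochs=200, lr=0.1, seed=0):
    """Fit weights so the preferred document in each pair scores higher.

    pairs: list of (preferred_features, other_features) tuples.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for preferred, other in pairs:
            diff = score(weights, preferred) - score(weights, other)
            # Modelled probability that `preferred` belongs above `other`.
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient step on the pairwise loss -log(p).
            for k in range(n_features):
                grad = -(1.0 - p) * (preferred[k] - other[k])
                weights[k] -= lr * grad
    return weights


# Hypothetical features: [anchor_text_match, keyword_density, is_static_page]
# Invented toy data: the first document in each pair is the preferred one.
pairs = [
    ([1.0, 0.2, 1.0], [0.0, 0.9, 1.0]),
    ([1.0, 0.1, 0.0], [0.0, 0.8, 0.0]),
    ([1.0, 0.5, 1.0], [0.0, 0.5, 0.0]),
]

weights = train_pairwise(pairs, n_features=3)
print([round(w, 2) for w in weights])
```

The point of the sketch is only that nobody hand-picks the weights: they fall out of the examples, so changing which documents are preferred changes which properties end up mattering.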
RankNet is not a simple concept, but it has fairly wide-reaching potential. It can be applied to filters, search engines, or any number of database interaction tools. But what does it mean for search engine optimization? Very little, in my opinion.
The development of more sophisticated algorithms for search engines is very much parallel to the reality of human social interaction. These methods are simply intended to make a website which is important to many people easier to find for other people who may also find the website valuable. As such, a more sophisticated algorithm will respond best to a useful, usable, worthwhile web resource. Search engine optimization is not about optimizing for an algorithm – although this should be considered – it is about optimizing for a human user. If your human social network is successful, then the chances you have for building an algorithmic social network are much higher.
Although RankNet itself may not contribute directly to search engine optimization techniques, it is important to be aware of what factors MSN takes into consideration.
According to Cassidy, MSN looks first at:
- Anchor text in links
- Keyword density
- URL keywords
- Header tags, title tags, alt tags, and title attributes
- Strong or bold text
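Of the factors in this list, keyword density is the easiest to measure yourself. There is no single agreed definition, so the sketch below assumes the simplest one: occurrences of a single keyword divided by the total number of words on the page. The function name and the sample text are mine, and this is certainly not how MSN computes it.

```python
import re


def keyword_density(text, keyword):
    """Share of words in `text` equal to `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


sample = "Search engines rank pages. A search engine reads each page."
print(f"{keyword_density(sample, 'search'):.2%}")
```

Note that this counts exact word matches only, so "engines" does not count toward "engine". A real tool would probably handle plurals and multi-word phrases.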
MSN also handles 302 redirects very effectively, places more importance on static pages than on dynamic pages, and has very effective filtering of duplicate content. In fact, MSN apparently has the best handling of duplicate content – and Google the worst.
Taking all of this into consideration may give you a well-optimized page. Abusing these factors may give you a few days of top rankings. But, then again, it might not. If you don’t build your business to attract real customers and real references, it won’t matter how optimized your pages are, just like in any other search engine.