Jill Whalen has a nice article at SearchEngineLand on duplicate content. Although I don’t necessarily have anything to add to the article, I think I can summarize it pretty nicely:

What Search Engines DON'T Do About Duplicate Content

Search engine spider to website:

“Hi! It looks like you’ve got two copies of the same document here! Well, humph. I’m not going to index EITHER of them and I’m going to dock your rankings while I’m at it!”

What Search Engines DO About Duplicate Content

Search engine spider to website:

“Hi! It looks like you’ve got two copies of the same document here! Well, it looks like this one is the more original, so I’m just going to try and index it. I might get confused, though, and sometimes try and serve up results for the other page.”

This applies to any duplicated document, whether it's multiple addresses for the same page on your own site or a document that appears on several different websites. The search engine wants to pick one copy to point people toward, and it will try to pick whichever is most original.
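To make the "pick one copy" idea concrete, here's a minimal sketch of collapsing duplicate URLs by content and choosing a single representative to index. This is purely illustrative, not how any real engine works: the URLs, the page text, and the shortest-URL tiebreaker are all made up, and real engines use fuzzier near-duplicate detection plus many more signals (crawl history, site authority, and so on).

```python
import hashlib

# Toy crawl results: URL -> page text. A crawler would fetch these in reality;
# these URLs and bodies are invented for illustration.
pages = {
    "http://example.com/article":              "How spiders handle duplicates ...",
    "http://example.com/article?sessionid=42": "How spiders handle duplicates ...",
    "http://www.example.com/article":          "How spiders handle duplicates ...",
    "http://other-site.com/syndicated-copy":   "How spiders handle duplicates ...",
    "http://example.com/about":                "A totally different page ...",
}

def fingerprint(text: str) -> str:
    """Hash the page body so identical copies collapse to one key.
    (Real engines detect near-duplicates, not just exact matches.)"""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Group URLs that carry the same content.
groups: dict[str, list[str]] = {}
for url, text in pages.items():
    groups.setdefault(fingerprint(text), []).append(url)

# Pick one copy per group to surface in results. Shortest URL is a stand-in
# heuristic; a real engine weighs which copy looks most original.
for digest, urls in groups.items():
    canonical = min(urls, key=len)
    print(f"{canonical}  <- indexed ({len(urls) - 1} duplicate(s) folded in)")
```

Running it prints one "indexed" URL per distinct piece of content, with the other addresses folded in behind it, which is essentially the behavior the friendly spider above is describing.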

The "penalty" you'll see is really just that a) some copies of the page may be missing from the index, b) search engines may not pick the version of the page that you want them to, and c) if you're syndicating material, they might pick a copy on somebody else's website. Having an off-site copy picked ahead of yours isn't necessarily a bad thing, though, since that copy may sit on a site with more authority and still drive traffic and reputation back to you.

So, there it is. Duplicate content in a nutshell.