Avoiding Duplicate Content Penalties
May 26th, 2009 by David Bradley >> 5 Comments
Duplicate content issues are not about plagiarism or syndication of your content. They are about the way Google, Microsoft, Yahoo, and other search engines handle your site’s content if they find two or more pages with similar text.
You might think that cannot possibly affect you because you don’t make copies of your own site’s pages. Think again. Lots of content management systems (CMS) and blogging packages, such as WordPress, create category pages, pages that list snippets from posts with specific tags, archives, and other such sections. These are all meant to benefit your readers by providing them with ways to get to all your great content even if they don’t have a direct link.
Unfortunately, search engine bots are stupid and simply see duplicated content. Whether or not they actually penalize sites is debatable, but either way they will lower the ranking of at least one of those pages with duplicated content. This could mean that your entry in the search engine results pages (SERPs) will be cited as a sub-category listing rather than the page for your latest blog post.
Thankfully, the “big” three, Google, Microsoft, and Yahoo, have announced their adoption of a solution to this problem, the so-called canonical link element. The element, placed in the head section of a page, will direct the search bot to the original content source on your site, guiding the bot away from any pages that essentially duplicate the content. This could be particularly beneficial not only for WordPress users but also for ecommerce sites where complicated directory-type web addresses are generated on the fly from database entries.
For example, your ecommerce site may create URLs that look like this:
http://www.example.com/page.html?sid=asdf314159265
whereas the actual content page is:
http://example.com/page.html
Adding the following to the head section of the page will fix the issue:
<link rel="canonical" href="http://example.com/page.html"/>
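If your pages are generated by a script, the element can be built from the requested URL instead of typed by hand. The following Python sketch is only an illustration of that idea. The names canonical_url, canonical_link, and SESSION_PARAMS are hypothetical, and it assumes that sid is the only session parameter on your site and that the non-www address is the one you want indexed.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters only track a visitor's session
# and never change what the page displays.
SESSION_PARAMS = {"sid"}


def canonical_url(url, strip_www=True):
    """Return the URL without session parameters (and, optionally, without a leading www.)."""
    parts = urlsplit(url)
    host = parts.netloc
    if strip_www and host.startswith("www."):
        host = host[len("www."):]
    # Keep every query parameter that is not a session identifier.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    # Drop any fragment, since it never reaches the server.
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))


def canonical_link(url):
    """Build the canonical link element for the page's head section."""
    return '<link rel="canonical" href="%s"/>' % escape(canonical_url(url), quote=True)


print(canonical_link("http://www.example.com/page.html?sid=asdf314159265"))
# prints: <link rel="canonical" href="http://example.com/page.html"/>
```

Parameters that do change the content, such as a product ID, are left in place, so only pages that really are duplicates end up pointing at the same address.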
Google’s Matt Cutts explains things in this video
He also points out that Joost de Valk has produced plugins for WordPress, ecommerce package Magento, and Drupal that take care of adding the canonical link element automatically. Of course, if you aren’t suffering from duplicate content issues with your site and it’s ranking well, then I wouldn’t advise fixing something that ain’t broke. However, if you’re not seeing the pages you hoped to see in the SERPs and your site is not ranking properly, then give it a try. It could be the fix you need.
You should also check whether your site uses the www. or the non-www. version of its address and tell Google, via Webmaster Tools, which one you are using.
Just for the record, the search engines announced the canonical link element on the 200th birthday of Charles Darwin, February 12, 2009.
geekmom // May 26, 2009 at 8:14 pm
Thanks for the information on copied content. I’m wondering though, what happens if both pages have the header inserted?
David Bradley // May 27, 2009 at 7:59 am
Geekmom, which header? The canonical tag? Why would you add it to both pages the same? Not sure what you’re getting at…
Lasvak // Jun 1, 2009 at 1:14 pm
Google, Live, and Yahoo have agreed on many things (e.g. the nofollow rel attribute). It’s a good time for them to start an official regulatory body where we can find guidelines for developing a site, so that we won’t have any issues with their search engines.
Good info.
Ronny // Jul 3, 2009 at 8:20 am
Thanks for the informative post. The popular search engines often diminish the real value of what they believe to be duplicate content. Google, in particular, has a duplicate content checker, sometimes called a “footprint” system, that gives more weight to the original content and includes it in the search engine’s index.
biz // Jul 11, 2009 at 9:09 pm
What is the risk of publishing articles on isnare and ezine articles? Should I stop publishing them?