Webmasters often disagree about how Google and other search engines view and treat duplicate content.
Before we reveal the results of our in-depth research, we must put “duplicate content” in proper perspective.
What is duplicate content?
Duplicate content is more or less identical content appearing on the same or different sites.
It follows from this definition that duplicate content comes in two main types:
- More or less identical content appearing on the same site
- More or less identical content appearing on different sites
More or less identical content appearing on the same site
Google classifies this into two types:
1. Deliberate duplicate content that is malicious or deceptive in intent, on the same site.
2. Unintentional duplicate content without any deceptive intent, on the same site.
Unintentional duplication can arise in cases such as:
- Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
- Store items shown or linked via multiple distinct URLs
- Printer-only versions of web pages
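As a rough illustration of how a webmaster might spot this kind of unintentional duplication, the sketch below groups URLs whose page text is the same once case and whitespace are ignored. It is only a minimal sketch: the function names and sample URLs are hypothetical, and an exact-match hash will not catch pages that are merely similar rather than identical.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash the text after collapsing case and whitespace,
    so that trivial formatting differences are ignored."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(pages):
    """Return lists of URLs that share the same fingerprint.

    `pages` is a hypothetical mapping of URL -> extracted page text.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    sample = {
        "/item?id=7": "Red  Shoes",
        "/shoes/red": "red shoes",
        "/about": "About us",
    }
    print(duplicate_groups(sample))
```

Here the store item reachable at both `/item?id=7` and `/shoes/red` is reported as one duplicate group, while `/about` is left out.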
How do Google and other search engines view and treat more or less identical content appearing on the same site?
Our in-depth research has revealed the following:
In the first scenario, where the duplication is premeditated and malicious or deceptive, Google frowns on it and will take steps to sanction the offending sites, since their action violates Google’s webmaster guidelines.
Such sanctions may include complete removal from Google’s index.
In the second scenario, where the duplication arises unintentionally and without malicious intent, Google will not penalize the webmaster. Instead, it will index only the one version of the duplicated pages that it considers the best fit for that content.
The site’s content will therefore not be placed in the supplemental index, as is often claimed, merely because of duplicate content.
Duplication may, however, influence this indirectly if links to the webmaster’s pages are split among the various versions, lowering per-page PageRank.
Webmasters are therefore advised to address duplicate content issues on their websites proactively and to ensure that visitors see the content the webmaster intends them to see.
They can take the following steps to achieve this:
- Use 301 redirects – If, for example, you have restructured your site, you can use 301 (permanent) redirects to send visitors and spiders to the updated content.
- Use Webmaster Tools to indicate your preferred domain to Google, and do the same for other search engines.
- Minimize similar content – If, for example, you have similar content on different pages of your site, you can expand one enough to distinguish it from the other.
- Be consistent in your internal linking structure – Once you pick a particular format for writing URLs, stick to it consistently.
- Include the preferred version of your URLs in your Sitemap file.
- Understand your content management system – For example, a blog may show the same content on the home page, a permalink page, a category page, and an archive page.
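The steps on consistent URLs and 301 redirects can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the host name, the list of ignored parameters, and the choice to force `https` are all hypothetical and would need to match your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical settings for illustration; adjust to your own site.
BARE_HOST = "example.com"
PREFERRED_HOST = "www.example.com"
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def preferred_url(url):
    """Return the single preferred form of a URL on this site."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST
    # Drop parameters that do not change the page content,
    # and sort the rest so the same page always gets the same URL.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Assumption: the site is served over https; the fragment is dropped.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def redirect_for(url):
    """Return (301, target) when the URL is not in its preferred form,
    otherwise None (serve the page as-is)."""
    target = preferred_url(url)
    return (301, target) if target != url else None


if __name__ == "__main__":
    print(redirect_for("http://example.com/shoes?utm_source=x&color=red"))
    print(redirect_for("https://www.example.com/shoes?color=red"))
```

The first call yields a 301 to `https://www.example.com/shoes?color=red`, and the second returns `None` because the URL is already in its preferred form. The same `preferred_url` value is what you would list in your Sitemap file and use in internal links.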
The following approach to addressing duplicate content issues is, however, not recommended by Google:
- Blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engine bots cannot crawl these pages, they cannot identify them as duplicates and will have to treat them as separate, unique pages.
A more acceptable solution is to allow search engine bots to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects.
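To show what the rel="canonical" link element looks like in practice, here is a minimal sketch of a helper that builds the tag for a page's `<head>`. The function name and the example URL are hypothetical; the element itself is the standard one referred to above.

```python
from html import escape


def canonical_link(preferred_url):
    """Build the <link rel="canonical"> element for a page's <head>.

    `preferred_url` is the one version of the page you want indexed.
    The URL is HTML-escaped so characters such as & are valid in the attribute.
    """
    return '<link rel="canonical" href="{}">'.format(
        escape(preferred_url, quote=True)
    )


if __name__ == "__main__":
    print(canonical_link("https://www.example.com/shoes?color=red&size=9"))
```

Every duplicate version of the page (the printer-only version, the mobile version, and so on) would carry the same tag pointing at the preferred URL, so crawlers can still reach the duplicates but know which one to index.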
Where duplicate content causes too much of the webmaster’s site to be crawled, the webmaster can also adjust the crawl rate setting in Webmaster Tools.
More or less identical content appearing on different sites
Again, this can be of two types:
1. With malicious or deceptive intent
As indicated above, this violates Google’s webmaster guidelines, and Google will take the necessary steps, once notified, to sanction the offending webmasters.
This applies, for example, to scrapers who misappropriate and republish your site’s content.
Thanks to measures Google has put in place, scrapers are unlikely to affect the originating webmaster’s site rankings. However, a webmaster who is particularly frustrated by such scrapers is at liberty to file a DMCA request to claim ownership of the content and request removal of the other site from Google’s index.
2. In line with good practice, e.g. syndicated content
In this case, Google’s webmaster guidelines are not contravened, so no penalty results.
Rather, each webmaster affected by such duplicate content is advised to take proactive steps to safeguard their interests by doing the following, identified in top search engine placement