Duplicate content isn't just redundant; it can hurt your site's rankings and reduce traffic to your website. If search engines surface the wrong version of a page, or none at all, fewer people find your website.
Luckily, you can point Google to your primary content even if your website contains multiple pages of similar or identical content.
What is duplicate content?
Google defines duplicate content as "substantive blocks of content within or across domains that either completely match other content or are appreciably similar." Multiple instances of identical content mean Google has a hard time determining which version of your content is most relevant.
Why should you care? Because duplicate content makes it difficult for search engines to:
- Decide which version(s) should be ranked in search results.
- Determine which version(s) should be included/excluded from search indexes.
- Attribute link metrics to one page versus multiple pages.
Something as simple as printer-only web pages or differences in capitalization within your URL can result in multiple versions of the same content. Consider these examples from SEO expert Moz:
You'll notice that the only difference among these three examples is the capitalization of the "s" in "software" and the "d" in "developer." Although all three URLs lead to the same content, search engines treat them as three different pages competing with each other for search engine rankings.
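To make the capitalization problem concrete, here is a minimal Python sketch that collapses case variants of a URL into one form. The example.com URLs are hypothetical stand-ins, not the Moz examples, and lowercasing the path assumes your server treats paths as case-insensitive, which is a site policy rather than a rule of the web.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    """Lowercase the scheme, host, and path so case variants collapse to one URL."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),  # assumption: this site treats paths as case-insensitive
        parts.query,         # query strings are left untouched
        parts.fragment,
    ))

# Hypothetical URLs that differ only in capitalization.
variants = [
    "http://www.example.com/jobs/software-developer",
    "http://www.example.com/jobs/Software-developer",
    "http://www.example.com/jobs/software-Developer",
]

unique = {normalize_url(u) for u in variants}
print(len(unique))  # 1
```

A check like this is most useful when run against a list of URLs from your own site to spot variants worth redirecting.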
Tips and tricks of the search engine trade
1) 301 redirect
The most common way to combat duplicate content is to set up a 301 (permanent) redirect from each duplicate version to the original version. You essentially combine multiple pages with the same content into a single page that no longer has to compete with the other versions for search engine rankings.
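As a rough illustration of the idea, the Python sketch below decides how a server might answer a request: duplicates get a 301 with a Location header, everything else is served normally. The paths in CANONICAL_PATHS and the resolve function are hypothetical; in practice this logic usually lives in your web server or CMS configuration.

```python
from http import HTTPStatus

# Hypothetical mapping of duplicate paths to the one page that should rank.
CANONICAL_PATHS = {
    "/blog/index.html": "/blog/",
    "/Blog/": "/blog/",
    "/blog/print/": "/blog/",
}

def resolve(path: str):
    """Return (status, headers) for a request path.

    Duplicates answer with 301 Moved Permanently and a Location header;
    every other path is served normally with 200.
    """
    target = CANONICAL_PATHS.get(path)
    if target is not None:
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": target}
    return HTTPStatus.OK, {}

status, headers = resolve("/blog/print/")
print(int(status), headers["Location"])  # 301 /blog/
```

The important detail is the status code: 301 signals a permanent move, which is what tells search engines to treat the target as the page that should rank.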
2) Canonical URLs
Adding a rel=canonical link element to the <head> section of your web page, like so...
<link href="http://www.sociallite.ca/blog/" rel="canonical" />
...tells search engines that the page is a copy of the original version, www.sociallite.ca/blog/, and that all links and metrics should feed back to the original page.
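If you want to confirm that a page actually declares a canonical URL, a small Python sketch using the standard library's HTML parser can pull it out. The CanonicalFinder class and find_canonical function are illustrative names, and the sample page is a made-up document built around the tag shown above.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")

def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Made-up page containing the canonical tag from the example above.
page = """<html><head>
<title>Blog</title>
<link href="http://www.sociallite.ca/blog/" rel="canonical" />
</head><body>...</body></html>"""

print(find_canonical(page))  # http://www.sociallite.ca/blog/
```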
3) Duplicate content isn't just restricted to what we see on a web page
If the 301 redirect and canonical URLs haven't confused you enough, consider this: search engines will also pick up on duplicate content in search snippets. Yes, this means that meta titles and meta descriptions aren't safe from duplicate content either! You can, however, use Google Webmaster Tools to detect duplicate content in these areas.
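You can also run a quick check of your own. The Python sketch below groups pages by meta title or description and reports any value shared by more than one URL. The pages dictionary is hypothetical crawl output; how you collect titles and descriptions from your site is up to you.

```python
from collections import defaultdict

def find_duplicates(pages: dict, field: str) -> dict:
    """Group URLs by a metadata field and keep only values shared by 2+ URLs."""
    groups = defaultdict(list)
    for url, meta in pages.items():
        # Compare case-insensitively and ignore surrounding whitespace.
        value = meta.get(field, "").strip().lower()
        if value:
            groups[value].append(url)
    return {value: urls for value, urls in groups.items() if len(urls) > 1}

# Hypothetical crawl output: URL -> metadata.
pages = {
    "/services/": {"title": "Our Services", "description": "What we offer."},
    "/services/print/": {"title": "Our Services", "description": "What we offer."},
    "/about/": {"title": "About Us", "description": "Who we are."},
}

print(find_duplicates(pages, "title"))
# {'our services': ['/services/', '/services/print/']}
```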
4) Multiple formats cause confusion
Some websites have regular, printer-friendly, PDF, and other versions of content, all of which are indexed by search engines as different versions of the same content. For the sake of your rankings, it's better to have only a single version (typically the regular version) showing to search engines.
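For formats that can't carry an HTML tag, such as PDFs, one option is to send the signal in the HTTP response headers instead. The Python sketch below builds either a canonical Link header or an X-Robots-Tag noindex header for an alternate version. The function name and the example URL are hypothetical, and the sketch deliberately picks one strategy per page rather than sending both signals together.

```python
def headers_for_alternate(canonical_url: str, strategy: str = "canonical") -> dict:
    """Build response headers for a non-primary format (print view, PDF, etc.).

    strategy="canonical" points search engines at the regular version;
    strategy="noindex" asks them to leave this version out of the index.
    """
    if strategy == "canonical":
        return {"Link": f'<{canonical_url}>; rel="canonical"'}
    if strategy == "noindex":
        return {"X-Robots-Tag": "noindex"}
    raise ValueError(f"unknown strategy: {strategy!r}")

# Hypothetical regular version of a page that also exists as a PDF.
print(headers_for_alternate("http://www.example.com/guide/"))
# {'Link': '<http://www.example.com/guide/>; rel="canonical"'}
```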
5) Minimize similar content
This one is a no-brainer: if your website consists of many pages with similar or even identical content, spend some time expanding and/or consolidating each page to group similar content in one place.
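To find candidates for consolidation, you can score how similar two pages are. The Python sketch below is one simple approach, assuming three-word "shingles" compared with Jaccard similarity; it is not how any search engine measures duplication, and the sample page texts are made up.

```python
import re

def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping runs of `size` consecutive words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)

# Made-up page texts that differ by a single word.
page_a = "We build custom websites for small businesses in Vancouver."
page_b = "We build custom websites for small businesses in Toronto."

print(similarity(page_a, page_b))  # 0.75
```

Pairs that score close to 1.0 are good candidates for merging into one page or for rewriting so that each page offers something distinct.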
Word of caution: don't let the Panda get you
No, we're not talking about the fluffy black-and-white bear; we're talking about Google Panda, the algorithm update that demotes low-quality sites, including those with deliberate duplicate content meant to manipulate search engine rankings.
In an attempt to lower the ranking of low-quality sites and ensure that high-quality sites show up near the top of search results, Google introduced a change to its search results ranking algorithm in 2011. It essentially operates like a rating mechanism that will ding you for:
- Signals that users aren't impressed by your site (e.g. low click-through rates, high bounce rates, low time on site, and few returning visitors).
- User-unfriendliness (e.g. bad layout and navigation, no mobile version, 404 errors, slow site speed, an abundance of ads).
- Lack of trustworthiness (does it seem like users have entered the deep web when they get to your website?).
- Over-optimization (think keyword stuffing).
- Bad content (self-explanatory).