Duplicate content and duplicate WordPress pages affect SEO considerably. Duplicate content becomes much more of a problem as your blog grows and people link to different versions of the same content, which creates ambiguity for search engines.
Let us first see what duplicate content is and how it is formed. Do not forget to check whether your blog has any duplicate content with a duplicate content checker. We can then eliminate the duplicate content by following the methods below.
WHAT IS DUPLICATE CONTENT?
Duplicate content is the same or very similar blog content that appears in search engines under multiple URLs. When the same blog content appears more than once on the Internet, it counts as duplicate content. It confuses search engines because they cannot tell which version of the blog post is more appropriate for a search query and which one to show in the search results, and this affects the ranking of the web page.
Example of duplicate content URL:
Here both URLs point to the same blog content. But for a search engine the latter is a different URL, because it comes with the ID of the blog post.
DOES DUPLICATE CONTENT AFFECT SEO?
Yes, it definitely does.
When duplicate content is present, search engines get confused about which version of the post to rank. From the multiple URL options, the search engine picks one URL to show in search results, and the other URLs are generally discarded.
We all know that inbound links contribute hugely to ranking. So what happens when people start linking to different versions of your post? Naturally, the weight of the incoming links is split between the versions, and the search ranking of your original post drops.
The trust value and authority of the post also get divided when multiple URLs are present on the web.
HOW IS DUPLICATE CONTENT FORMED?
Different URL variations
Through various activities on the web, we create our own duplicate content. Session IDs, which are assigned to visitors to store their brief history, sometimes create duplicate content.
The secure and non-secure versions of a URL (https/http) can produce two different URLs.
If both the www and non-www versions exist, you should tell Google which version to use; otherwise you risk creating duplicate content.
The printer-friendly pages of a CMS also produce different URLs for the same post: one that is clean, without any ad network, for printing, and the other for reading online.
The URL Parameters
Both point to the same content, and it is the same for you and me, but a search engine reads the URLs differently and treats them as completely different pages.
Even the order of the parameters is seen differently by search engines.
ID is the identification of the blog post and cat means category, and both URLs point to the same post. But for a search engine they are different.
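One way to see why parameter order matters is to normalize URLs before comparing them. The following is a minimal Python sketch, not something search engines are known to run: the function name `normalize_url` and the `example.com` URLs are hypothetical, and it assumes that reordering the query parameters does not change which post is served.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_url(url: str) -> str:
    """Return the URL with its query parameters sorted by name and value."""
    parts = urlsplit(url)
    # parse_qsl keeps duplicate keys; keep_blank_values keeps "?a=" style parameters
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


# Hypothetical URLs: same parameters, different order
a = "http://example.com/post/?id=12&cat=3"
b = "http://example.com/post/?cat=3&id=12"

print(a == b)                                # False: the raw strings differ
print(normalize_url(a) == normalize_url(b))  # True: same page after sorting
```

Compared as plain strings the two URLs are different, which is roughly how a crawler first sees them. Only after normalization do they collapse into one.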
If someone copies your content and does not link to your original content, it becomes more difficult for search engines to rank the articles for the same search queries. This spamming leads to another version of the same article.
It is more common with e-commerce sites, as the same products are sold with the same manufacturer descriptions.
CHECK IF YOU HAVE ANY DUPLICATE CONTENT WITH THE TOP 5 DUPLICATE CONTENT CHECKERS
The first and foremost way to check for duplicate content is Google Search Console.
Search Appearance > HTML Improvements
If you have any duplicate meta descriptions or duplicate title tags, they will be displayed here.
This tool will help you find out if anyone has copied your content. Plagiarism can be easily detected by this tool. It is an extremely important tool for fighting spam when you find that your blog content is available on multiple domains.
Simply enter your blog post URL and you will get to know if any duplicate of your blog content is floating around on the Internet.
This is a very simple tool used to find internal duplicate content.
Do not panic when you see Siteliner listing your categories, tags and archives as duplicate content. You can always de-index tags and categories, or find out how to use tags and categories for SEO and for building site structure.
Plagiarism is not related to duplicate content issues within your own site, but it is still wise to check. Here you have to copy and paste your content. It comes with a limit of 1000 words.
Once you check your content, it will tell you whether the lines of your blog are unique or plagiarized.
By clicking on the plagiarized section you can find out on which other domains similar lines exist. Sometimes your content is original but has a striking similarity with some other blog content. In that case, if the other content was published earlier, you may make the required changes to your blog content to avoid plagiarism claims from their end.
If you find that your content has been copied to some other domain, take the necessary actions to fight plagiarism.
This tool is very accurate in providing details, as it searches PDF files, forums and blogs on the web to find out where content was copied from.
With the paid version of Copyscape you will be able to check for duplicate content even for an unpublished post.
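The checkers above are hosted tools and their internals are not described here. Purely as an illustration of what "similar lines" can mean, here is a small Python sketch that compares two texts by the three-word sequences they share. The function names `shingles` and `similarity` and the sample sentences are made up for this example, and the Jaccard measure is an assumption, not the method any of these tools is known to use.

```python
def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 0))}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (nothing shared) to 1.0 (identical)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        # Texts shorter than `size` words cannot be compared this way
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "the quick brown fox jumps"
copied = "the quick brown fox sleeps"
print(similarity(original, copied))  # 0.5: two of four distinct shingles are shared
```

A score close to 1.0 suggests the two texts are nearly the same, while a score near 0.0 suggests they only share stray words.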
HOW TO FIX DUPLICATE CONTENT ISSUES
Duplicate content can be fixed in the following ways.
- Redirecting the URL
- Canonicalization, or
- Choosing between the www and non-www version
A. REDIRECT URL
Redirecting the URL is perfect when you have multiple duplicate pages. You have to set up a 301 redirect from the duplicate page to the original post which you want to rank in search engines.
Cases where a 301 redirect is extremely beneficial
-Similar or duplicate content
In many cases we have posts and pages with similar topics which are competing with each other to rank well in search engines. In this case we can combine them and redirect all those posts to a single ‘most important one’. This way we stop the similar posts from competing with each other, which helps the main post become popular and rank well in search engines. It will also have an overall positive impact on your blog SEO, as there will be no chance of duplicate content.
We will get a 404 error if we delete any post. A 404 error is bad for SEO and worse for the user experience. How will you feel if you are searching for a post and see an ERROR 404? Yes, it is irritating. In this case too we can redirect the old URL to the new one, thus avoiding any 404 error.
2 ways to Redirect URL
1. Use of Redirection Plugin
This is the easiest way, as you do not have to touch any code.
Install and Activate
2. Editing the .htaccess file
Follow these simple steps to edit the .htaccess file.
Go to cPanel > File Manager
Right click on .htaccess
Code Edit > Edit
Include the code at the end of the file.
Redirect 301 /the-old-duplicate-post-url/ http://yoursite.com/the-new-post-url/
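If there are many duplicate posts to redirect, the lines can be generated rather than typed by hand. Below is a minimal Python sketch, assuming a hypothetical mapping of old paths to new URLs. `redirect_lines` is a name made up for this example, and `yoursite.com` and the paths are placeholders. It rejects entries containing whitespace, because an unquoted space would split the Redirect directive into extra arguments.

```python
def redirect_lines(mapping: dict) -> list:
    """Build one 'Redirect 301' directive per old path -> new URL pair."""
    lines = []
    for old_path, new_url in mapping.items():
        # An unquoted space would be read as an argument separator
        if any(ch.isspace() for ch in old_path + new_url):
            raise ValueError(f"whitespace in redirect: {old_path!r} -> {new_url!r}")
        # The old side is a URL path, so it should begin with a slash
        if not old_path.startswith("/"):
            raise ValueError(f"old path must start with '/': {old_path!r}")
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return lines


# Placeholder mapping: duplicate post -> post you want to rank
duplicates = {
    "/the-old-duplicate-post-url/": "http://yoursite.com/the-new-post-url/",
}
for line in redirect_lines(duplicates):
    print(line)
```

The printed lines can then be pasted at the end of the .htaccess file as described in the steps above.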
B. USE OF CANONICAL TAG
The canonical tag (rel=canonical) helps webmasters prevent duplicate content issues. When a canonical tag is specified in a post, it marks the preferred version of the web page, so search engines do not have to guess which version to show in search rankings.
So now you know why it is important to set a canonical tag. If you do not set one, you risk having duplicate content circulating in search engines.
If the canonical link is set, the search engine knows which version is canonical, and the links pointing to the different versions are all credited to that single version, thus improving the SEO of your site.
Facebook and Twitter honor rel=canonical too. This might lead to weird situations. If you share a URL on Facebook that has a canonical pointing elsewhere, Facebook will share the details from the canonical URL. In fact, if you add a like button on a page that has a canonical pointing elsewhere, it will show the like count for the canonical URL, not for the current URL. Twitter works in the same way~Yoast SEO
How to use Canonical Tag?
My similar blog post URLs
Now there are two versions of the same blog content, which have been linked from different sites and hence have different URLs. So which URL is the search engine going to rank? And which one is the duplicate content that will affect SEO negatively?
This is where canonicalization comes into play.
<link rel="canonical" href="http://example.com/seo/canonical-tag/">
I chose the cleanest URL. By setting rel=canonical I have pointed link B to link A without actually using a 301 redirect. Visitors can still open both URLs, but search engines are told that link A is the preferred one.
In effect I have merged the 2 pages, and search engines will no longer be confused about which one to rank, as both links now point to a single canonical version.
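To check which canonical URL a page actually declares, the tag can be read back out of the HTML. The following is a minimal Python sketch using only the standard library. The class name `CanonicalFinder`, the helper `find_canonical` and the sample page are made up for this example, and it assumes you already have the page source as a string.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Sample page source (hypothetical)
page = (
    "<html><head>"
    '<link rel="canonical" href="http://example.com/seo/canonical-tag/">'
    "</head><body>Post</body></html>"
)
print(find_canonical(page))  # http://example.com/seo/canonical-tag/
```

Note that the attribute values need plain straight quotes. A tag pasted with typographic quotes is unlikely to be matched by this sketch, and is best avoided in the page itself as well.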
If you are still confused about the canonical tag, I would suggest you have a look at the Moz rel=canonical questions.
Setting up canonical tags in Yoast SEO
The advanced Yoast settings allow you to set the canonical URL.
Yoast SEO > Metabox
With Yoast SEO you can simply set the canonical URL to point the current page URL to a different one.
C. SETTING OF WWW OR NON-WWW VERSION
This is yet another method to avoid duplicate content issues.
If you do not choose between the http and https, www and non-www versions, you are generating duplicate content issues. People also tend to link to different versions of your site, producing more confusion for search engines, so it is always wise to direct everything to only one domain name.
Before choosing between the www and non-www versions, first decide which one you prefer. Sticking to one domain will not only boost SEO but will also be less confusing for you.
A brief about Canonical Domain Name
The preferred domain is the one that you would liked used to index your site’s pages (sometimes this is referred to as the canonical domain). Links may point to your site using both the www and non-www versions of the URL (for instance, http://www.example.com and http://example.com). The preferred domain is the version that you want used for your site in the search results. ~Search Console Help
Why should you use a canonical domain name?
We should have a single version of our site to reduce the number of duplicate indexed pages. When you tell the search engine which version to use, it treats links to the www version as links to the non-www version, or vice versa, so that they all point to the same URL.
If you specify your preferred domain as http://www.example.com and we find a link to your site that is formatted as http://example.com, we follow that link as http://www.example.com instead~Search Console
-Better link building results
Setting your preferred domain affects your domain SEO. If you do not set your canonical domain name as either the www or the non-www domain, then while creating backlinks you are linking either to www.domainname or to domainname without www, so you are linking to different versions of your web pages. Ultimately the results of your link building are split between the two versions.
-Duplicate content issue
It is very important to do URL canonicalization, or else you may face duplicate content issues with search engines. Your site may get penalized in search rankings if you do not set a preferred domain name.
If you are using WordPress, it is most likely that WordPress has already set your default domain and you are not getting penalized in search rankings for a lack of URL canonicalization.
You can tell Google which domain name you prefer by setting your preferred domain in Google Search Console.
Both the www and non-www versions mean the same thing, so select the domain style that suits you.
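To make the idea of a preferred domain concrete, here is a minimal Python sketch that rewrites any link to the chosen www or non-www form. The function name `to_preferred_host` is made up for this example, and it assumes plain URLs without a username or password in the host part.

```python
from urllib.parse import urlsplit, urlunsplit


def to_preferred_host(url: str, prefer_www: bool = True) -> str:
    """Rewrite the host of a URL to the www or the non-www form."""
    parts = urlsplit(url)
    host = parts.netloc
    # Strip a leading "www." if present, then add it back only when preferred
    bare = host[4:] if host.lower().startswith("www.") else host
    new_host = "www." + bare if prefer_www else bare
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))


print(to_preferred_host("http://example.com/about/"))       # http://www.example.com/about/
print(to_preferred_host("http://www.example.com/about/"))   # http://www.example.com/about/
print(to_preferred_host("http://www.example.com/", False))  # http://example.com/
```

Whichever style you pick, both spellings of a link end up as the same URL, which is the effect a preferred domain is meant to have.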
How to tell which domain preference you are on?
It is very easy to check which domain name your default settings use.
Go to WordPress Dashboard > Settings
You can find your address URL as in the image below.
Set a preferred Domain using Google Webmaster Tools
1. Log in to Google Search Console
2. Click on your site address.
3. Click on the Gear > Site Settings
4. Now set your preferred domain as www or non-www and Save it.
You need to verify both your www and non-www domains with Search Console to confirm that you own both.
This process is the same as when you first verified your domain with Google Webmaster Tools.
This time you just need to Add a Property with the other domain URL.
In my case I should add the non-www version and follow the steps to verify the domain in Google Webmaster Tools.
Now that you have verified both versions of the domain, you can set your preferred domain.
As your site grows you will have small duplicate content issues, produced either by you or by visitors linking to different versions of your site. Just keep an eye on it and check about once a month whether your site has produced any duplicate content.