Posted by Maulik Patel
October 17, 2018
Duplicate content is one of the most common issues a website can face. You can't eliminate the problem entirely, but there are precautions you can take to limit it. First, it is essential to understand what exactly counts as duplicate content: content that either completely matches, or is substantially similar to, other content within the same domain or across domains.
Such content can appear on different websites at different URLs. In this article, you will learn about the main types of duplicate content and how they hinder your SEO. You will also learn which tools you can use to make sure your content is unique and SEO friendly.
1. Scraped content
Scraped content is an unoriginal piece of content that has been copied from another website without consent or permission. Google cannot always tell which version of a piece is the original and which is the copy. There are, however, tools that can detect whether your content has been stolen and posted elsewhere without your permission, and many web monitoring apps let you search for scraped versions of your content.
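If you want a quick first pass before reaching for a monitoring app, a rough similarity check is easy to script. Here is a minimal sketch in Python, with hypothetical URLs and assuming the widely used requests library is installed:

```python
# A rough, do-it-yourself similarity check between your page and a
# suspected copy. URLs are hypothetical placeholders; assumes the
# third-party `requests` library is installed (pip install requests).
import re
import requests
from difflib import SequenceMatcher

def page_text(url):
    """Fetch a page and crudely strip HTML tags, leaving rough plain text."""
    html = requests.get(url, timeout=10).text
    return re.sub(r"<[^>]+>", " ", html)

original = page_text("https://example.com/my-article")      # your page
suspect = page_text("https://example.org/copied-article")   # suspected copy

# ratio() returns 0.0-1.0; values close to 1.0 suggest scraped content.
similarity = SequenceMatcher(None, original, suspect).ratio()
print(f"Similarity: {similarity:.0%}")
```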
2. Syndicated content
Syndicated content is a legitimate way of republishing existing content to a new audience with the consent of the original author. The content is republished on different websites so that it reaches the widest possible audience. However, when reposting the content, the publisher must use the canonical tag to indicate the original source of the article; failing to do so can turn the republished content into an SEO problem.
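One quick way to verify that a syndication partner has set the tag correctly is to fetch their copy and read the canonical link out of its head. A minimal sketch, again with placeholder URLs and assuming the requests library:

```python
# Minimal sketch: confirm a syndicated copy declares the original URL
# as its canonical. The URLs are placeholders; assumes `requests`.
import requests
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")

html = requests.get("https://partner-site.example/republished-post",
                    timeout=10).text
finder = CanonicalFinder()
finder.feed(html)

# This should print the original article's URL on your own domain.
print("Canonical points to:", finder.canonical)
```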
3. HTTP and HTTPS pages
Identical HTTP and HTTPS versions of a page are among the most common duplication problems. The issue arises when the switch to HTTPS is not implemented properly: if parts of your website still link to the old HTTP protocol, or backlinks still point to it, both versions can remain accessible and compete with each other.
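You can check this yourself by requesting the HTTP URL without following redirects; a properly configured site answers with a 301 pointing at the HTTPS version. A minimal sketch with a placeholder domain:

```python
# Minimal sketch: request the HTTP URL without following redirects and
# check for a 301 pointing at the HTTPS version. Placeholder domain;
# assumes `requests` is installed.
import requests

resp = requests.get("http://example.com/", allow_redirects=False, timeout=10)
print(resp.status_code)              # ideally 301
print(resp.headers.get("Location"))  # ideally https://example.com/
```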
4. WWW and non-WWW pages
One of the oldest SEO issues occurs when both the WWW and non-WWW versions of a site are accessible. This problem is easily fixed by implementing 301 redirects; you can also specify your preferred domain in Google Search Console, as described later in this article.
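A quick sanity check is to request both variants and confirm they end up at the same final URL. A minimal sketch, placeholder domain again:

```python
# Minimal sketch: check that the www and non-www variants collapse to
# one preferred host after redirects. Placeholder domain; assumes `requests`.
import requests

for url in ("https://example.com/", "https://www.example.com/"):
    final = requests.get(url, timeout=10).url  # URL after any redirects
    print(url, "->", final)
# Both lines should end at the same preferred host.
```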
5. Dynamically generated URL parameters
Dynamically generated parameters are used to display slightly different versions of the same page, or to store information about users, such as session IDs and tracking tags. Each parameterised URL serves largely the same content as the base page, so search engines may index many near-identical copies.
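One common defence is to normalise such URLs, stripping the parameters that do not change the content, before publishing or comparing them. A minimal standard-library sketch; the parameter list here is illustrative, not exhaustive:

```python
# Minimal sketch: collapse parameterised URLs to one canonical form by
# stripping parameters that do not change the content. The parameter
# list here is illustrative, not exhaustive.
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}

def canonicalize(url):
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(canonicalize("https://example.com/shoes?utm_source=mail&color=red"))
# -> https://example.com/shoes?color=red
```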
6. Similar content
As the name suggests, similar content is content that closely resembles, without exactly matching, content published elsewhere. The problem often emerges when a website keeps several separate pages on the same topic with nearly the same content. Instead, combine those pages into one, or write genuinely unique content for each page.
7. Printer-friendly pages
A printer-friendly page is a stripped-down version of a web page intended for printing. These versions live at separate URLs, and because they are linked internally, it is easy for Google to crawl them, creating a duplicate of every main page. Storing all printer-friendly versions in a single directory makes it easier to keep them out of the index.
8. Duplicated Product Information
Some websites steal product descriptions or other product information from websites that sell similar products. This type of plagiarism is known as duplicated product information.
There are several practices you can adopt to ensure the originality of your content and resolve duplication. Let us look at these fixes in detail.
1. 301 redirect
A 301 (permanent) redirect sends visitors and search engines from a duplicate URL to the original content. It is the best fix for URL issues that lead to duplication. When several well-ranked duplicate pages are redirected to a single URL, they no longer compete with one another and instead combine into one stronger ranking signal.
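In practice you would configure 301s in your web server (for example Apache or nginx) or CMS, but the mechanics are simple enough to show in a few lines. A minimal standard-library sketch with a placeholder canonical URL:

```python
# Minimal sketch of the mechanics of a 301, using only the standard
# library. In practice you would configure redirects in your web server
# (Apache, nginx) or CMS rather than run code like this.
from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Send every duplicate URL permanently to the canonical page.
        self.send_response(301)
        self.send_header("Location", "https://example.com/canonical-page")
        self.end_headers()

HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()
```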
2. Rel=canonical
The rel=canonical tag is located in the HTML head section of your page. From a search engine's point of view it works much like a 301 redirect, but it is easier to set up and the duplicate page remains accessible to visitors. The tag can even point to a URL on another website, which is how syndicated copies should credit the original. It tells search engines which URL is the authoritative version, so ranking signals are consolidated on that URL.
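The tag itself is a single line in the page's head. A minimal sketch that renders it from a placeholder URL, as a template or CMS plugin might:

```python
# Minimal sketch: render the canonical tag for a duplicate page, as a
# template or CMS plugin might. The URL is a placeholder for the
# authoritative address of your content.
canonical_url = "https://example.com/original-article"
tag = f'<link rel="canonical" href="{canonical_url}" />'
print(tag)  # belongs inside the <head> of every duplicate version
```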
3. NoIndex, NoFollow
This technique excludes a particular page from search results. A robots meta tag added to the HTML source code of the page tells search engines not to index it (noindex) and not to follow its links (nofollow).
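The tag takes the following form; it is rendered from Python here only for consistency with the earlier sketches, since it is really just a static line of HTML in the page's head:

```python
# Minimal sketch: the robots meta tag that keeps a page out of search
# results (noindex) and tells crawlers not to follow its links (nofollow).
noindex_tag = '<meta name="robots" content="noindex, nofollow" />'
print(noindex_tag)  # place in the <head> of the page you want excluded
```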
4. Preferred domain
This one is quite simple. You set a preferred domain for search engines, which tells them whether your site should be displayed with or without the 'www' prefix in the SERPs. The preferred domain is simply the version of your domain that you want search engines to index and display.
5. Unique product description
Product information on e-commerce websites can lead to duplicate content issues, since many sellers copy a product's description and publish it verbatim on their own site. Make sure you take the time to write unique descriptions, or to enrich stock descriptions with something new. This will help you rank above sites whose descriptions are duplicated.
Now that we know how much harm duplicate content can do, it is important to make sure that none of your content is unintentionally duplicated. The following tools will help you keep your site's rankings healthy.
1. Duplichecker
This tool allows you to upload almost any document type and run a test that tells you whether your content is unique. You can run one free test before signing up; once signed in, you can run unlimited tests. A scan completes within a few seconds, although the exact time depends on the length of the content.
Pros
a) Very accurate results.
b) Great for SEO, really quick and easy to use.
Cons
a) Very sensitive: sometimes flags commonly used phrases as duplicates.
2. Siteliner
With Siteliner, you simply copy and paste your website's URL into the box and it scans your entire website for duplicate content. The results report words per page, internal and external links, page load time and much more. You can also download your report in PDF format.
Pros
a) User-friendly interface
b) Includes every minor detail
Cons
a) You need to go through the results page by page.
3. Plagspotter
This quick, free and easy scanner will check your entire website for duplicate content. It also lets you compare text that has been flagged as duplicate. It offers a range of features such as batch searches, plagiarism monitoring, unlimited searches, and full site scans. You can sign up for a 7-day free trial and later opt for the paid version, which is very affordable.
Pros
a) Sentence-by-sentence results
b) Shows the source of matched content.
Cons
a) Some people might find it tedious to work with.
4. Copyscape
Copyscape is yet another quick URL search tool. It offers a basic duplicate content analysis for free, while the paid version adds unlimited searches, text-excerpt searches, deep searches, and full website scans. All you need to do is paste your URL or text, and the results appear within a few minutes.
Pros
a) It is very quick and offers amazing features
b) Provides automated search options
Cons
a) The free version is limited to a certain number of pages.