

Duplicate Content & Canonicalisation – Cached URLs

Let’s first look at what a cached URL is. If you search in Google for something and look at a listing in the organic results, you can open Google’s cached version of the page.

Cached URL issues are quite rare in our experience. The issue is characterised by the cached page not matching the actual page. There are various possible reasons for this, such as:

  • The page has only recently been updated and Google has not picked this up yet, but will soon
  • The page has not been indexed in a long time; this may be, or be part of, a bigger problem
  • Your site is serving different content to Google than it is to users; this could result in a penalty from Google
  • The wrong content / page is showing for the search term

This article is one of several in the duplicate content and canonicalisation series in the Raptor Knowledge Base. The other articles in the series cover the different types of duplicate content and canonicalisation issues that a website can experience.

A cached webpage is essentially a snapshot (a saved version) of a webpage that has been stored by a web server as a backup or copy of the original. Google’s cache retains versions of web pages, and these can be seen by selecting the drop-down menu that sits next to the URL in the SERPs:

[Image: Cached URL Duplication]

Viewing a webpage cached by Google gives you an idea of how Google saw that webpage on its most recent crawl. The image below illustrates the metadata present on a cached page:

[Image: Cached URL Duplication]

You can also see in this view the date and time the page was indexed (and cached), which is useful if you want to know whether new content has been picked up by Google yet.
You can check which cached URLs exist within Google’s index by searching for them using the search operator below:

cache:{YOUR_URL}

Actual Example: cache:www.raptor-dmt.com

Checking the URLs listed against the preferred URLs of your pages may highlight potential duplicate content issues.
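The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `normalise` and `find_non_preferred` and the example.com URLs are hypothetical, and the normalisation rules (ignore the scheme, a leading "www." and a trailing slash) are one plausible choice rather than a definition of what Google treats as a duplicate.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to a comparison key: host without 'www.' plus path.

    Assumption: scheme, 'www.' prefix, trailing slash and query string are
    treated as insignificant. Adjust this to match your own site's rules.
    """
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def find_non_preferred(found_urls, preferred_urls):
    """Return (found, preferred) pairs where a found URL points at the same
    page as a preferred URL but is not the preferred form itself."""
    preferred_by_key = {normalise(u): u for u in preferred_urls}
    duplicates = []
    for url in found_urls:
        preferred = preferred_by_key.get(normalise(url))
        if preferred is not None and url != preferred:
            duplicates.append((url, preferred))
    return duplicates


found = [
    "http://www.example.com/page/",
    "https://example.com/page",
    "https://example.com/other",
]
preferred = ["https://example.com/page"]
duplicates = find_non_preferred(found, preferred)
```

Any pair returned is a URL that is worth reviewing as a possible duplicate of the preferred version.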

Impact of Issue

The impact depends on the problem. We discussed a few common problems earlier in this guide; the impact of each is described below:

The page has only recently been updated and Google has not picked this up yet

This is the most common problem and is not a major issue. If your site does not update content frequently, Google will not come back frequently to check it.

The page has not been indexed in a long time

This could be a bigger problem, because Google may not be regularly checking your site for updates. In this case you will need to check a range of pages to see how recently Google has checked your web pages.

Your site is serving different content to Google than it is to users

This is a major problem and can easily result in a penalty from Google because it violates their guidelines.

The wrong content / page is showing for the search term

This is a major issue if the content is not up to date, preferred or accurate for any reason. This can affect rankings and the user experience.
When a webpage is accessible from multiple URLs, as is the case with cached URLs, this can cause a duplicate content problem. This can affect your site’s ability to rank for its target keywords and can reduce overall organic search visibility.

Old CMS

Older CMSs, particularly eCommerce CMSs, were major culprits for producing cached copies of pages. In an effort to reduce load on the database, the system would generate a static version of the website in an alternate directory; in the case of X-Cart, this was the /store/ directory.

New CMS

Modern CMSs and publishing platforms may use the term caching to refer to the way they store and retrieve versions of documents; however, these do not necessarily cause the duplicate content issue.

How to Resolve

Canonical Tags

This topic is covered in greater detail in another article specific to canonical tags, but for the purpose of this article: adding a canonical tag to every page of the site (following that guide) will prevent most duplicate content issues.
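As a rough way to audit this, the sketch below reads a page's HTML and returns any canonical URLs it declares. It uses only Python's standard `html.parser`. The class and function names and the example.com markup are illustrative assumptions, not part of any Raptor tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = """
<html><head>
  <title>Example page</title>
  <link rel="canonical" href="https://example.com/page">
</head><body></body></html>
"""
canonicals = find_canonicals(page)
```

A page that returns an empty list has no canonical tag, and a page that returns more than one is sending a conflicting signal; both cases are worth a closer look.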

We discussed in the ‘impact’ section of this guide that there are four common reasons for this problem to occur. Each has its own recommendations in addition to the canonical tag recommendations.

The page has only recently been updated and Google has not picked this up yet

Resubmit your XML sitemap to Google when you update pages.

Also check that your XML sitemap accurately reflects the frequency at which your content is updated for each page. Read more about this in our handy guide to XML sitemaps.
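For illustration, a minimal sitemap carrying a last-modified date and an update frequency for each page could be generated as follows. This is a sketch using Python's standard `xml.etree.ElementTree`. The `build_sitemap` function, the page data and the example.com URLs are assumptions made for the example; the `lastmod` and `changefreq` element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from a list of page dictionaries.

    Each page needs 'loc', 'lastmod' (YYYY-MM-DD) and 'changefreq' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "lastmod").text = page["lastmod"]
        ET.SubElement(url, "changefreq").text = page["changefreq"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a home page that changes daily and a rarely edited one
pages = [
    {"loc": "https://example.com/", "lastmod": "2018-01-15", "changefreq": "daily"},
    {"loc": "https://example.com/about", "lastmod": "2017-06-01", "changefreq": "yearly"},
]
sitemap_xml = build_sitemap(pages)
```

The point of the sketch is that the `lastmod` and `changefreq` values should be derived from when each page really changes, rather than being set to the same value across the whole site.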

The page has not been indexed in a long time

Resubmit your XML sitemap to Google when you update pages.
Also check that your XML sitemap accurately reflects the frequency at which your content is updated for each page. Read more about this in our handy guide to XML sitemaps.

Check how recently other pages have been crawled by Google to establish whether this is a site wide or localised problem.
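One way to organise that check is to note the cached date shown for a sample of pages and flag the ones that look stale. The sketch below assumes you have collected those dates by hand; the `stale_pages` function, the 30-day threshold, the paths and the dates are all hypothetical.

```python
from datetime import date


def stale_pages(last_cached, today, max_age_days=30):
    """Return {url: age_in_days} for pages cached more than max_age_days ago."""
    stale = {}
    for url, cached_on in last_cached.items():
        age = (today - cached_on).days
        if age > max_age_days:
            stale[url] = age
    return stale


# Hypothetical cache dates noted from Google's cached view of each page
last_cached = {
    "/": date(2018, 3, 1),
    "/products": date(2018, 1, 1),
    "/blog": date(2018, 2, 25),
}
report = stale_pages(last_cached, today=date(2018, 3, 10))
site_wide = len(report) == len(last_cached)
```

If nearly every sampled page is flagged, the problem is probably site wide; if only a handful are, it is more likely localised to those pages or sections.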

Your site is serving different content to Google than it is to users

This is a major problem and can easily result in a penalty from Google because it violates their guidelines. Consider reviewing the technical reasons why this might occur, such as content being hidden by CSS.
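A simple first check is to compare the text of the page as served to a normal browser with the text served to a crawler. The sketch below covers only the comparison step and assumes you have already saved the two HTML responses. It does not fetch anything and does not evaluate CSS, so content hidden by stylesheets will still count as visible. All names and sample markup here are illustrative.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            # Collapse runs of whitespace so layout changes are ignored
            self.chunks.append(" ".join(data.split()))


def visible_text(html):
    """Return the text content of an HTML document as one string."""
    extractor = TextExtractor()
    extractor.feed(html)
    return " ".join(extractor.chunks)


def similarity(html_for_users, html_for_crawler):
    """Return a 0.0 to 1.0 similarity ratio between the two pages' text."""
    return SequenceMatcher(
        None, visible_text(html_for_users), visible_text(html_for_crawler)
    ).ratio()


user_html = "<p>Hello   world</p>"
crawler_html = "<div><span>Hello world</span><script>var x = 1;</script></div>"
score = similarity(user_html, crawler_html)
```

A ratio well below 1.0 does not prove a guideline violation, but it does identify pages where the two versions differ enough to justify a manual review.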

You may also want to prevent Google from crawling these pages…

If you do not want this content to be cached / indexed by Google, then there are two ways to remove content from Google’s index:

  • Add a NOINDEX meta tag to the relevant page
  • Disallow the page from being crawled using the robots.txt file

Both of these options are discussed in greater depth in separate articles on Canonical Tags, Rel Tags, Meta Tags & HTML Code & Robots. For more information on how to implement these solutions please refer to the articles linked above.
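To verify that either option is actually in place, the sketch below checks a robots.txt body with Python's standard `urllib.robotparser` and looks for a noindex directive in a page's robots meta tag. The helper names, the sample rules and the example.com URLs are assumptions for illustration; the /store/ path simply echoes the X-Cart example earlier in this guide.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def is_crawl_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


class RobotsMetaFinder(HTMLParser):
    """Sets noindex to True if a robots/googlebot meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        content = (attr_map.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    """Return True if the page carries a noindex meta directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


# Hypothetical robots.txt and page markup
robots_txt = "User-agent: *\nDisallow: /store/\n"
blocked = is_crawl_blocked(robots_txt, "https://example.com/store/widget.html")
allowed = is_crawl_blocked(robots_txt, "https://example.com/widget")
noindexed = has_noindex('<head><meta name="robots" content="noindex, follow"></head>')
```

Running both checks on the same URL is useful, because a crawler that is blocked by robots.txt cannot read a meta tag on the page it has been told not to fetch.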

The wrong content / page is showing for the search term

You will need to perform a content audit to identify which / how many pages are ranking incorrectly for certain terms.
Once this is done, you will need to consolidate these pages and content using 301 redirects, canonical tags, robots directives, etc. to limit what Google can index and to control which pages it determines to be canonical.
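When planning the 301 redirects, it helps to hold the mapping from duplicate URLs to preferred URLs as data and to make sure each old URL points straight at its final destination rather than through a chain. The following is a sketch only: the paths are made up, and how the finished map is turned into server rules depends on your platform.

```python
def resolve_target(path, redirects):
    """Follow the redirect map from path until a path with no redirect is
    reached. Raises ValueError if the map contains a loop."""
    seen = {path}
    while path in redirects:
        path = redirects[path]
        if path in seen:
            raise ValueError("redirect loop involving " + path)
        seen.add(path)
    return path


def flatten_redirects(redirects):
    """Return a new map in which every source points at its final target."""
    return {source: resolve_target(source, redirects) for source in redirects}


# Hypothetical plan: a static copy and an old URL both consolidate to /widget
planned = {
    "/store/widget.html": "/widget-old",
    "/widget-old": "/widget",
}
flattened = flatten_redirects(planned)
```

Each entry in the flattened map can then be implemented as a single 301 redirect to the preferred page.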

Benefit of Resolving

Controlling the URLs from which your site and pages can be accessed prevents the potential impact of duplicate content issues by consolidating page value and providing a single reference for your site’s content.
We cover the specific benefits of resolving each of the four problems discussed below:

The page has only recently been updated and Google has not picked this up yet

Implementing the recommendations in this guide will enable your content to be indexed more frequently; this will have a small positive impact on rankings.

The page has not been indexed in a long time

Implementing the recommendations in this guide will enable your content to be indexed more frequently; this will have a significant positive impact on rankings if Google is systematically not indexing your site for any reason.

Your site is serving different content to Google than it is to users

Implementing the recommendations in the guide will prevent your site from being penalised by Google for violating their guidelines.

The wrong content / page is showing for the search term

Implementing the recommendations in this guide will enable the correct content to be indexed. In our experience, when this is done correctly it can have a very strong positive impact on rankings for the specific terms identified as having an issue.
