
Crawlable Resource Data

Crawlable
Values Returned
Examples of How Crawlable Page Data is Used
- Example 1: Quickly Find Resources That Won’t Be Crawled by Google
- Example 2: Look for Conflicting Indexation Signals to Google
- Example 3: Identify Follow Internal Links to Non-Crawlable Resources
What We Do & What We Give You
Related Content
Benefits of Our Data


Crawlable

URLs / resources that are not disallowed by the robots.txt file are crawlable. Resources that are disallowed are not crawlable, because the robots.txt has instructed Google specifically not to crawl them.
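
For illustration, a minimal robots.txt that blocks a single directory while leaving the rest of the site crawlable might look like this (the directory name is just a placeholder):

    User-agent: *
    Disallow: /private/

Under these instructions, a URL such as /private/page.html would return 'False' in the 'Crawlable' column, while every other URL on the site would return 'True'.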

Values Returned

Within the ‘Crawlable’ column, the two values we return in the fields / cells are as follows:

  • True (Crawlable): the resource / URL is not disallowed by the robots.txt
  • False (Non-Crawlable): the resource / URL is disallowed by the robots.txt
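
The same True / False check can be reproduced outside our reports with Python's standard-library robots.txt parser; a minimal sketch, assuming placeholder URLs and the Googlebot user agent:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the site's robots.txt (placeholder domain).
    parser = RobotFileParser()
    parser.set_url("https://www.example.com/robots.txt")
    parser.read()

    # can_fetch() mirrors the 'Crawlable' column: True if the URL is
    # not disallowed for the given user agent, False if it is.
    for url in ["https://www.example.com/",
                "https://www.example.com/private/page.html"]:
        print(url, parser.can_fetch("Googlebot", url))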

Examples of How Crawlable Page Data is Used

There are various reasons why you might want to look at Crawlable Page Data; we have set out some examples below.

Example 1: Quickly Find Resources That Won’t Be Crawled by Google

Filtering for ‘False’ within the ‘Crawlable’ column will show you all resources that are blocked by the robots.txt file. This can often immediately highlight errors where they exist; for example, the home page should always be accessible.
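
If you export the crawl report to CSV, the same filter can be applied programmatically; a minimal sketch in Python with pandas, where the file name and the 'crawlable' / 'url' column names are assumptions for illustration:

    import pandas as pd

    # Load an exported crawl report (hypothetical file and column names).
    df = pd.read_csv("crawl_report.csv")

    # Keep only resources that robots.txt blocks Google from crawling.
    blocked = df[df["crawlable"] == False]
    print(blocked["url"].tolist())

    # The home page should always be crawlable, so it must not appear here.
    assert "https://www.example.com/" not in blocked["url"].values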

Example 2: Look for Conflicting Indexation Signals to Google

Using filters, you can see pages that meet the following criteria:

  • Non-Crawlable + Canonical
  • Crawlable + Noindex Tag Present
  • Non-Crawlable + Listed in XML Sitemap

Using any of these combinations will highlight pages or resources that are potentially not being handled properly from an indexation perspective.
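
Continuing the pandas sketch above, each combination is a simple boolean filter; the 'canonical', 'noindex' and 'in_sitemap' column names are assumptions for illustration:

    # Non-crawlable pages that still declare a canonical URL.
    conflict_canonical = df[(df["crawlable"] == False) & df["canonical"].notna()]

    # Crawlable pages that carry a noindex tag.
    conflict_noindex = df[(df["crawlable"] == True) & (df["noindex"] == True)]

    # Non-crawlable pages that are listed in the XML sitemap.
    conflict_sitemap = df[(df["crawlable"] == False) & (df["in_sitemap"] == True)]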


Example 3: Identify Follow Internal Links to Non-Crawlable Resources

Non-crawlable resources (URLs with the value ‘False’ in the ‘Crawlable’ column) should not have ‘follow’ internal links pointing to them. You can see how many follow links a URL has under the column header ‘follow in link’; a non-crawlable resource should have none.

Follow links to non-crawlable resources leak authority to pages that do not require it; adding the ‘nofollow’ attribute to all internal links pointing to such a resource will resolve this issue.
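
The exported report can surface these links directly; a sketch assuming a numeric 'follow_in_links' column that corresponds to the ‘follow in link’ count described above:

    # Non-crawlable resources that still receive follow internal links;
    # ideally this filter returns no rows at all.
    leaking = df[(df["crawlable"] == False) & (df["follow_in_links"] > 0)]
    print(leaking[["url", "follow_in_links"]])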

What We Do & What We Give You

This is a basic check of whether every resource, whether it is an HTML/text page with a product for sale or a JavaScript file, can be crawled by Google.

The robots.txt can be the simplest of website files and yet one of the most influential; our algorithm checks and informs you whether a resource is crawlable by Google, based on the instructions in the robots.txt file.

Related Content

The list of guides below might be useful if you are analysing this data and want to know more about it:

Related column headers in Raptor website crawler reports:


Benefits of Our Data

There are several benefits to analysing indexation data, such as those listed below:

  • Identify indexation issues for SEO
  • Identify all crawlable resources
  • Identify all non-crawlable resources
  • Resolve indexation issues & conflicts
  • Audit a site’s indexation profile
  • Scrape competitor indexation data


Sign up today for a free 30-day trial and identify indexation issues on your site!