Examples of How Crawlable Page Data is Used
- Example 1: Quickly Find Resources That Won’t Be Crawled by Google
- Example 2: Look for Conflicting Indexation Signals to Google
- Example 3: Identify Follow Internal Links to Non-Crawlable Resources
What We Do & What We Give You
Benefits of Our Data
URLs / resources that are disallowed by the robots.txt. These resources are not crawlable because the robots.txt has specifically instructed Google not to crawl them.
Within the ‘Crawlable’ column, the two values we return in the fields / cells are as follows:
Crawlable - If the resource / URL is not disallowed by the robots.txt
Non-Crawlable - If the resource / URL is disallowed by the robots.txt
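The check behind this column can be sketched with Python's standard-library robots.txt parser. The robots.txt rules and URLs below are hypothetical examples, not actual Raptor internals:

```python
# Sketch: derive the 'Crawlable' column value for a URL from robots.txt rules.
# ROBOTS_TXT and the example URLs are hypothetical.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def crawlable(url: str, user_agent: str = "Googlebot") -> str:
    """Return the value the 'Crawlable' column would hold for this URL."""
    return "Crawlable" if parser.can_fetch(user_agent, url) else "Non-Crawlable"

print(crawlable("https://example.com/"))           # home page -> Crawlable
print(crawlable("https://example.com/private/x"))  # disallowed -> Non-Crawlable
```

Any URL matching a `Disallow` rule for the relevant user agent gets the ‘Non-Crawlable’ value; everything else is ‘Crawlable’.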
There are various reasons why you might want to look at Crawlable Page Data; we have set out some examples below.
Filtering for ‘Non-Crawlable’ within the ‘Crawlable’ column will show you all resources that are blocked by the robots.txt file. This can often immediately highlight errors where they exist; for example, the home page should always be accessible.
Using filters, you can see pages that meet the following criteria:
- Non-Crawlable + Canonical
- Crawlable + Noindex Tag Present
- Non-Crawlable + Listed in XML Sitemap
Using any of these combinations will highlight pages or resources that are potentially not being handled properly from an indexation perspective.
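The filter combinations above can be sketched in plain Python over exported crawl rows. The field names and sample rows below are hypothetical stand-ins for the report columns:

```python
# Sketch: flag conflicting indexation signals in exported crawl data.
# The rows and field names below are hypothetical examples.
rows = [
    {"url": "/a", "crawlable": "Crawlable",     "noindex": True,  "in_sitemap": False, "is_canonical_target": False},
    {"url": "/b", "crawlable": "Non-Crawlable", "noindex": False, "in_sitemap": True,  "is_canonical_target": False},
    {"url": "/c", "crawlable": "Non-Crawlable", "noindex": False, "in_sitemap": False, "is_canonical_target": True},
    {"url": "/d", "crawlable": "Crawlable",     "noindex": False, "in_sitemap": True,  "is_canonical_target": False},
]

def conflicts(row):
    """Return the conflicting-signal labels that apply to one crawl row."""
    found = []
    blocked = row["crawlable"] == "Non-Crawlable"
    if blocked and row["is_canonical_target"]:
        found.append("Non-Crawlable + Canonical")
    if not blocked and row["noindex"]:
        found.append("Crawlable + Noindex Tag Present")
    if blocked and row["in_sitemap"]:
        found.append("Non-Crawlable + Listed in XML Sitemap")
    return found

for row in rows:
    for label in conflicts(row):
        print(row["url"], "->", label)
```

Each printed row is a page sending Google mixed signals; `/d` above triggers nothing because being crawlable and listed in the sitemap is consistent.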
Non-crawlable resources (URLs with the value ‘Non-Crawlable’ in the ‘Crawlable’ column) should not have ‘follow’ internal links pointing to them. You can see how many follow links a URL has in the ‘Follow In Link’ column; a non-crawlable resource should have none.
Follow links to non-crawlable resources leak authority to pages that do not need it; adding the ‘nofollow’ attribute to all internal links pointing to such a resource will resolve the issue.
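Counting follow in-links for a given resource can be sketched with Python's standard-library HTML parser. The HTML snippet and target path below are hypothetical:

```python
# Sketch: count 'follow' internal links (anchors without rel="nofollow")
# pointing at one target URL. PAGE and the target path are hypothetical.
from html.parser import HTMLParser

class FollowLinkCounter(HTMLParser):
    def __init__(self, target: str):
        super().__init__()
        self.target = target
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        # A link counts as 'follow' unless it carries rel="nofollow".
        if attrs.get("href") == self.target and "nofollow" not in rel:
            self.count += 1

PAGE = """
<a href="/private/report">blocked, followed</a>
<a href="/private/report" rel="nofollow">blocked, nofollowed</a>
<a href="/about">fine</a>
"""

counter = FollowLinkCounter("/private/report")
counter.feed(PAGE)
print(counter.count)  # 1 follow in-link; for a non-crawlable URL it should be 0
```

A non-zero count for a non-crawlable URL flags the links that still need the ‘nofollow’ attribute added.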
The robots.txt can be the simplest of website files and yet the most influential. Our algorithm checks and informs you of whether a resource is crawlable by Google, based on the instructions in the robots.txt file.
The list of guides below might be useful if you are analysing this data and want to know more about it:
Related column headers in Raptor website crawler reports:
- Has Meta Robots Follow
- Has Meta Robots Index
- Has Meta Robots Nofollow
- Has Meta Robots Noindex
There are several benefits to analysing indexation data, such as those listed below:
- Identify Indexation Issues for SEO
- Identify all Crawlable Resources
- Identify all Non-Crawlable Resources
- Resolve indexation issues & conflicts
- Audit a site’s indexation profile
- Scrape competitor indexation data
Sign up today for a free 30-day trial and identify indexation issues on your site!