No. of URLs Crawled - Raptor SEO Data
The number of URLs crawled is the total number of URLs that were actually crawled during the crawl of your site. This includes all file types such as HTML pages, images, CSS files, etc. This also includes URLs that return a response code of 3XX, 4XX, 5XX or any other response code.
If a crawl was paused or interrupted, this number will not match the number of URLs found during the crawl.
What This Data Means
When you start a crawl in our web crawler, we first look at all of the URLs listed in any XML sitemaps and add them to the ‘crawl queue’. Once the crawl begins, we start crawling these URLs, and any new URLs that are found are added to the crawl queue. Broadly speaking, this is the basic process by which our web crawler works.
Once a URL is crawled, it is marked in a database as having been crawled. We will not crawl it again, which avoids a range of web crawling problems such as infinite crawling.
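The queue-and-mark process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Raptor's actual implementation: the function name `crawl`, the `fetch_links` callback and the in-memory `SITE` dictionary (standing in for a real website and sitemap) are all assumptions made for the example.

```python
from collections import deque

def crawl(sitemap_urls, fetch_links):
    """Crawl every reachable URL exactly once, starting from the sitemap URLs."""
    queue = deque()
    found = set()  # every URL ever added to the crawl queue

    # Seed the crawl queue with the URLs listed in the sitemap.
    for url in sitemap_urls:
        if url not in found:
            found.add(url)
            queue.append(url)

    crawled = []  # URLs actually crawled, in crawl order
    while queue:
        url = queue.popleft()
        crawled.append(url)  # mark the URL as crawled
        for link in fetch_links(url):
            # Only queue URLs we have not seen before; this is what
            # prevents a page linking back to itself from looping forever.
            if link not in found:
                found.add(link)
                queue.append(link)
    return crawled, found

# Hypothetical three-URL site: each key maps to the URLs it links to.
SITE = {
    "https://example.com/": ["https://example.com/about", "https://example.com/style.css"],
    "https://example.com/about": ["https://example.com/"],
    "https://example.com/style.css": [],
}

crawled, found = crawl(["https://example.com/"], lambda url: SITE.get(url, []))
print(len(crawled), len(found))
```

Even though the about page links back to the home page, the home page is only crawled once because it is already in the `found` set.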
What Are URLs
URL is an abbreviation for ‘uniform (or universal) resource locator’ and is the web address of any web-based resource such as:
- HTML webpages
- CSS files
A crawled URL can be any of these file types and many more. Some URLs are accessible while others are not. Accessible URLs are those that return a status (or response) code of ‘200’. There are many other response codes that mean a URL is not accessible, for any number of reasons. Our web crawler still needs to try to crawl these inaccessible URLs.
Therefore, all URLs, regardless of whether they are accessible or not, will be counted as ‘crawled’ if they have been crawled by Raptor.
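As a rough illustration of this counting rule, the sketch below tallies a set of crawl results. The function name `summarise` and the sample URLs and status codes are hypothetical; the point is simply that every URL counts towards the crawled total, while only ‘200’ responses count as accessible.

```python
from collections import Counter

def summarise(results):
    """Summarise crawl results given as a dict of URL -> HTTP status code."""
    # Group status codes into classes such as 2XX, 3XX, 4XX and 5XX.
    by_class = Counter(f"{status // 100}XX" for status in results.values())
    accessible = sum(1 for status in results.values() if status == 200)
    return {
        "crawled": len(results),  # every URL counts, whatever its response code
        "accessible": accessible,  # only URLs that returned 200
        "by_class": dict(by_class),
    }

# Hypothetical crawl results for illustration only.
results = {
    "https://example.com/": 200,
    "https://example.com/old-page": 301,
    "https://example.com/missing": 404,
    "https://example.com/broken": 500,
    "https://example.com/style.css": 200,
}

summary = summarise(results)
print(summary)
```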
No. of URLs Found
This number provides the context for the ‘no. of URLs crawled’, as you would typically expect the two to match: if we add 100 URLs to a queue, we would normally crawl all 100 URLs. As explained above, a URL found is one that has been identified in a sitemap or as part of the website crawling process.
Why These Numbers Might Not Match
These two metrics will only fail to match under the following circumstances:
- If the crawl was paused or stopped by the user
- If your account limit for URLs crawled per month is reached
- In the unlikely event of some kind of error or site maintenance issue that interrupted the crawl
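To show how the two numbers can diverge, here is a minimal sketch of a crawl that stops early once a URL limit is reached. The `crawl_with_limit` function, the `limit` parameter and the `LINKS` dictionary are assumptions made for illustration and do not reflect how Raptor enforces account limits.

```python
from collections import deque

def crawl_with_limit(seed_urls, links, limit):
    """Return (urls_crawled, urls_found) for a crawl that stops after `limit` URLs."""
    seeds = list(dict.fromkeys(seed_urls))  # de-duplicate seeds, keeping order
    found = set(seeds)
    queue = deque(seeds)
    crawled = 0

    # Stop when the queue is empty or the (hypothetical) limit is reached.
    while queue and crawled < limit:
        url = queue.popleft()
        crawled += 1
        for link in links.get(url, []):
            if link not in found:
                found.add(link)
                queue.append(link)
    return crawled, len(found)

# Hypothetical four-URL site: each key maps to the URLs it links to.
LINKS = {"/": ["/a", "/b"], "/a": ["/c"], "/b": [], "/c": []}

interrupted = crawl_with_limit(["/"], LINKS, limit=2)
complete = crawl_with_limit(["/"], LINKS, limit=10)
print(interrupted, complete)
```

In the interrupted crawl, URLs that were found but still waiting in the queue never get crawled, so the found count is higher than the crawled count. In the complete crawl the two numbers match.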
Understanding the data provided by your SEO tools is vital. Some programs use different names or acronyms; others use proprietary metrics! We aim to be totally consistent with the naming conventions we use and to provide complete transparency by describing in detail all of the data we use and show. At Raptor, our SEO tools and digital marketing software are designed for SEOs by SEOs. We give you everything you need to make informed strategic decisions and drive organic visibility. If you want to know more, check out the SEO explainer video shown below:
This guide is part of an extensive series of guides covering the data that we show in the summary tab of our SEO reporting feature. The following list of links shows all of the categories of data guides, videos and tutorials that we have. If you have any feedback on this or anything else, please feel free to get in touch:
- Canonical Content
- Content Data
- Linking Data
- Page Speed Data
- Meta Data
- Google Analytics Data