How to Crawl a Project
In this guide we explain how to crawl a project that you have previously added to Raptor’s web crawler. A project can contain many websites, so we provide a single button to crawl all of them at once.
The two biggest factors in how long a project takes to crawl are the number of URLs we must crawl and the load times of those URLs. With that in mind, we typically crawl at around 10 URLs per second for each site being crawled.
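As a rough worked example of the figures above, the sketch below estimates crawl time from a URL count and the typical rate of around 10 URLs per second per site. The function names are hypothetical and not part of Raptor. It also assumes that the sites in a project are crawled in parallel, so the largest site dominates the total. Slow page load times will make real crawls take longer.

```python
from typing import Iterable


def estimate_site_seconds(url_count: int, urls_per_second: float = 10.0) -> float:
    """Rough crawl time for one site, in seconds, at a steady crawl rate."""
    if url_count < 0:
        raise ValueError("url_count must not be negative")
    if urls_per_second <= 0:
        raise ValueError("urls_per_second must be positive")
    return url_count / urls_per_second


def estimate_project_seconds(site_url_counts: Iterable[int],
                             urls_per_second: float = 10.0) -> float:
    """Rough crawl time for a whole project.

    Assumption (not confirmed by the guide): sites are crawled in parallel,
    each at the per-site rate, so the largest site sets the total time.
    """
    times = [estimate_site_seconds(n, urls_per_second) for n in site_url_counts]
    return max(times, default=0.0)


# A site with 36,000 URLs at 10 URLs per second takes about an hour.
print(estimate_site_seconds(36_000) / 3600)  # hours
```

Under these assumptions, a project with one site of 36,000 URLs and another of 6,000 URLs would still take roughly an hour, because the smaller site finishes first.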
Step 1: Login
Open a web browser and navigate to:
Once there, log in with your Raptor Username and Password:
You can click on the ‘Remember Me’ tick box to save your details for future access on that device.
Then click on ‘Sign In’.
Step 2: Choose a Project
To view a project, click its ‘project name’ link in the table shown in the screenshot below:
Step 3: Click ‘Crawl All Sites’
To crawl an entire project, including all sites within it, click the ‘Crawl All Sites’ link in the table shown in the screenshot below:
Our web crawler is very quick, but for large projects you can set and forget. You don’t need to stay logged in for crawls to keep running.
Step 4: See All Active Crawls
You can see your active crawls by clicking ‘active crawls’ in the side menu:
Once you have navigated to the ‘active crawls’ page, you will see all your current crawls and their progress:
A Bit More About Crawling, URLs & Usage
Each month you can crawl any number of URLs up to and including the limit stated for your pricing plan. Every URL crawled counts towards that limit until it is reached.
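The arithmetic behind the monthly limit can be sketched as follows. This is only an illustration: the function names and the numbers are hypothetical, and Raptor tracks your usage for you.

```python
def remaining_urls(monthly_limit: int, urls_crawled: int) -> int:
    """URLs still available this month; never goes below zero."""
    return max(monthly_limit - urls_crawled, 0)


def fits_in_limit(monthly_limit: int, urls_crawled: int, planned_urls: int) -> bool:
    """True if a planned crawl stays within the limit.

    The limit is inclusive ("up to and including"), hence <= rather than <.
    """
    return planned_urls <= remaining_urls(monthly_limit, urls_crawled)


# Hypothetical plan: 50,000 URLs per month, 42,000 already used.
print(remaining_urls(50_000, 42_000))        # URLs left this month
print(fits_in_limit(50_000, 42_000, 8_000))  # crawl that lands exactly on the limit
```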
URLs are not limited to standard HTML web pages. They can also include:
- CSS files
- JS files
- PHP files
- Video files
- External links
- Inaccessible pages & broken links
  - 4XX error pages
  - 5XX error pages
- 301 Redirects
- 302 Redirects
- Canonical duplicates
  - www / non-www
  - http / https
  - with and without a trailing slash
  - upper and lower case URL characters
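To illustrate why canonical duplicates can add to the number of URLs, the sketch below lists the addresses at which a single page might be reachable. It is a minimal illustration and not Raptor’s own logic: the function name and the example.com host are hypothetical, and whether a given variant is actually crawled depends on whether it exists and is linked to on your site.

```python
from itertools import product


def canonical_variants(host: str, path: str) -> set[str]:
    """Return the URL variants that could all serve the same page.

    Covers the four kinds of duplicate listed above: http / https,
    www / non-www, trailing slash, and upper / lower case in the path.
    """
    bare_host = host.removeprefix("www.")   # requires Python 3.9+
    base_path = path.rstrip("/")            # add the slash back explicitly below
    schemes = ("http", "https")
    prefixes = ("", "www.")
    paths = (base_path, base_path.lower())  # identical if already lower case
    slashes = ("", "/")
    return {
        f"{scheme}://{prefix}{bare_host}{p}{slash}"
        for scheme, prefix, p, slash in product(schemes, prefixes, paths, slashes)
    }


variants = canonical_variants("example.com", "/About")
for url in sorted(variants):
    print(url)
print(len(variants), "variants of one page")
```

Redirecting every variant to one preferred URL, or declaring a canonical URL, is the usual way to keep these duplicates from multiplying.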
You may also be interested in the guides below, which are also in the ‘crawling’ section of our support documentation.