Web Crawler

Otherwise known as a website scraper, web scraper, or website crawler, this tool crawls websites and scrapes data from them. Our web crawler (Raptorbot) is cloud-based, meaning it can crawl millions of web pages quickly and efficiently.

We scrape all the SEO components you need to perform a technical audit or SEO audit of a website, or to gather competitive data. With fully customisable web scraping and reporting, we ensure you get all the data you need, in the format you want.

Identify Indexation Issues

Identify and solve indexation problems with Raptor’s web crawler, such as pages that are disallowed by the robots.txt file or blocked by a meta robots tag on the page.
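As an illustration of the kind of check involved, the sketch below tests a single URL against the site’s robots.txt and its meta robots tag. This is not Raptor’s implementation; it assumes the requests and BeautifulSoup libraries, and the URL is a placeholder.

```python
from urllib import robotparser
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def indexation_status(url: str, user_agent: str = "Raptorbot") -> dict:
    """Report whether a URL is blocked by robots.txt or a meta robots noindex tag."""
    root = "{0.scheme}://{0.netloc}".format(urlparse(url))
    robots = robotparser.RobotFileParser(urljoin(root, "/robots.txt"))
    robots.read()
    allowed = robots.can_fetch(user_agent, url)

    meta_noindex = False
    if allowed:
        html = requests.get(url, timeout=10).text
        tag = BeautifulSoup(html, "html.parser").find("meta", attrs={"name": "robots"})
        meta_noindex = bool(tag and "noindex" in tag.get("content", "").lower())

    return {"url": url, "allowed_by_robots_txt": allowed, "meta_noindex": meta_noindex}

print(indexation_status("https://www.example.com/some-page"))
```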

Identify Canonical Issues

Our reports delineate which pages are canonical, non-canonical, or missing a canonical tag. You can also see all canonical tags for each page.
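For illustration, here is a minimal sketch of how a page could be classified as canonical, non-canonical or missing a canonical tag (again assuming requests and BeautifulSoup; not Raptor’s own code, and the URL is a placeholder):

```python
import requests
from bs4 import BeautifulSoup

def canonical_status(url: str) -> dict:
    """Classify a page as canonical, non-canonical or missing a canonical tag."""
    html = requests.get(url, timeout=10).text
    tag = BeautifulSoup(html, "html.parser").find("link", rel="canonical")
    if tag is None or not tag.get("href"):
        return {"url": url, "status": "missing canonical", "canonical_url": None}
    canonical_url = tag["href"]
    status = "canonical" if canonical_url.rstrip("/") == url.rstrip("/") else "non-canonical"
    return {"url": url, "status": status, "canonical_url": canonical_url}

print(canonical_status("https://www.example.com/"))
```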

Analyse Meta Data

Scrape and list all meta data, such as page titles and meta descriptions, along with the character count (length) of each component.
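A hedged sketch of scraping a page title and meta description together with their character counts might look like this (requests and BeautifulSoup assumed; the URL is a placeholder):

```python
import requests
from bs4 import BeautifulSoup

def meta_data(url: str) -> dict:
    """Return the page title and meta description with their character counts."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    desc_tag = soup.find("meta", attrs={"name": "description"})
    description = desc_tag.get("content", "") if desc_tag else ""
    return {
        "page_title": title,
        "page_title_length": len(title),              # characters incl. spaces and punctuation
        "meta_description": description,
        "meta_description_length": len(description),
    }

print(meta_data("https://www.example.com/"))
```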

Shore Up XML Sitemaps

Evaluate XML sitemaps to ensure that only the right pages are listed within them. You can also generate XML sitemaps with Raptor.
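As a rough illustration of the evaluation step, the sketch below fetches an XML sitemap and flags entries that do not return a 200 status code (requests assumed; the sitemap URL is a placeholder, and this is not Raptor’s implementation):

```python
import xml.etree.ElementTree as ET

import requests

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def check_sitemap(sitemap_url: str) -> list:
    """List sitemap entries that do not return a 200 status code."""
    root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
    problems = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        page_url = loc.text.strip()
        status = requests.head(page_url, allow_redirects=False, timeout=10).status_code
        if status != 200:
            problems.append((page_url, status))
    return problems

print(check_sitemap("https://www.example.com/sitemap.xml"))
```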

Improve Site Navigation

Crawl a website whenever you need and immediately identify all broken links, redirects and server errors. Our broken link reports show you where the links are, allowing you to more easily fix them.
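For illustration only, a small sketch that sorts a list of crawled URLs into broken links, redirects and server errors by status code (requests assumed; the URLs are placeholders):

```python
import requests

def link_health(urls):
    """Sort URLs into redirects, broken links and server errors by status code."""
    report = {"broken": [], "redirects": [], "server_errors": []}
    for url in urls:
        code = requests.head(url, allow_redirects=False, timeout=10).status_code
        if code in (301, 302, 307, 308):
            report["redirects"].append((url, code))
        elif code == 404:
            report["broken"].append((url, code))
        elif code >= 500:
            report["server_errors"].append((url, code))
    return report

print(link_health(["https://www.example.com/", "https://www.example.com/old-page"]))
```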

Optimise Internal Linking

Evaluate the internal linking structure of your site and identify poorly linked-to pages and orphaned pages.


Scrape and Report on These SEO Components

Raptorbot is a powerful and robust web crawler, capable of crawling sites of any size quickly and easily. We provide some real-time data during crawls and apply a range of analysis techniques to the gathered data to save you time.
We scrape all the on-page SEO data you need to make better decisions, identify issues and make recommendations.

  • URL – of all pages, images, videos and resources.
  • File Type – Such as HTML, CSS, JPEG, SWF, etc.
  • Status – The status code returned by a URL, such as 200, 301, etc.
  • Indexable – HTML/text pages that are not restricted from being indexed by a robots.txt or meta robots tag and that have a status code of 200.
  • Non-Indexable – Pages that are not indexable due to robots.txt or meta robots tags, or a status code other than 200 (see the classification sketch after this list).
  • Crawlable – Pages and resources that are not disallowed by the robots.txt.
  • Canonical – HTML Pages with a self-referential canonical tag.
  • Non-Canonical – HTML pages with a canonical tag that links to another page / URL.
  • Canonical URL – The URL within the canonical tag.
  • Page Title – The page title or meta title of each page.
  • Page Title (Length) – The number of characters in the page title, including punctuation and spaces.
  • Meta Description – This is scraped from every page.
  • Meta Description (Length) – The number of characters in the meta description, including punctuation and spaces.
  • Meta Keywords – This is scraped from every page.
  • Meta Keywords (Length) – The number of characters in the meta keywords, including punctuation and spaces.
  • Implemented GA Tracking – Whether Google Analytics tracking code is implemented in some form on each HTML page.
  • UA Number (First) – The first UA number (for Google Analytics) identified on each page.
  • UA Number (Second) – The second UA number (for Google Analytics), if present on a page.
  • OG Tags – We scrape all Open Graph (Facebook) tags on each page.
  • Twitter Cards – We scrape all Twitter Card tags on each page.
  • Google+ Tags – We scrape all Google+ tags on each page.
  • H1 (First) – The first H1 header from each HTML page.
  • H1 (Second) – The second H1 header from each HTML page.
  • H2 (First) – The first H2 header from each HTML page.
  • H2 (Second) – The second H2 header from each HTML page.
  • H2 (Third) – The third H2 header from each HTML page.
  • H2 (Fourth) – The fourth H2 header from each HTML page.
  • H2 (Fifth) – The fifth H2 header from each HTML page.
  • Other H tags – We scrape all header tags on each page.
  • Word Count – The number of words on an HTML page.
  • Text Ratio – The ratio of text to code on each HTML page.
  • URL Length – The number of characters in each URL.
  • Page Depth – The depth of a page within the structure of the site.
  • Redirect to – Where a redirect exists, this identifies the URL it redirects to.
  • Linked from XML Sitemap – Yes or No.
  • In links – The number of links pointing to each URL.
  • Unique In links – The number of unique links (one per page) pointing to each URL.
  • Follow In links – The number of ‘follow’ links pointing to each URL.
  • Outlinks – The number of links pointing to other pages on the same domain, for each URL.
  • Unique Outlinks – The number of unique links pointing to other pages on the same domain, for each URL.
  • Follow Outlinks – The number of ‘follow’ links pointing to other pages on the same domain, for each URL.
  • External Links – The number of links pointing to another domain, for each URL.
  • Response Time – In milliseconds (ms).
  • Size (KB) – of all pages, images, videos and resources.
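To make the Indexable / Non-Indexable / Crawlable definitions above concrete, here is a hedged sketch of the classification logic; the function name and inputs are illustrative, not Raptor’s API:

```python
def classify(status_code: int, allowed_by_robots_txt: bool, meta_robots: str) -> dict:
    """Apply the Crawlable / Indexable / Non-Indexable definitions above."""
    crawlable = allowed_by_robots_txt                     # not disallowed by robots.txt
    noindex = "noindex" in (meta_robots or "").lower()    # meta robots directive
    indexable = status_code == 200 and crawlable and not noindex
    return {"crawlable": crawlable, "indexable": indexable, "non_indexable": not indexable}

# Example: a 301 redirect is crawlable but not indexable.
print(classify(301, True, ""))
```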


Product Features

Cloud-Based

The most efficient way to crawl sites of any size is from cloud-based servers. All you need to access and use Raptor is a web browser and your login details!

Scrape All SEO Data

Get all the SEO data you need to perform a range of SEO processes and functions. Easily export only what you need into CSV or XLS formats.

Let Us Do Some of the Analysis

We provide you with a range of additional data based on the crawl. This includes whether a page is indexable and canonical.

Multi-Tab Spreadsheets

Delineate all of the data automatically into different tabs of a spreadsheet. No need to filter, copy and paste data to create a beautiful report.

Easy to Use

Raptor is designed to be simple to use; all the complex stuff happens in the background. Adding projects, crawling sites and exporting data couldn't be easier.

Set & Forget

Set a crawl going and come back once you receive an email notification informing you the crawl is complete.

Sign Up For Early Access
& Earn a Chance to Win 1 Year's Free Subscription!

Sign up for early access today!