SEO & Technical Auditing Features
In this guide we explain how to access the SEO auditing features in the Raptor web crawler, what data they show, and what other features are available within this section. The auditing pages provide a summary of the crawl data for any site that has been crawled. To see this data, you will need to have created a project, added a site and crawled it.
This feature provides a top-level summary of the website crawl data, segmented into the categories typically addressed in SEO audits. These categories are described in detail later in this guide.
The audit does not affect usage; it runs automatically on every site crawled, as part of the crawl process.
Follow the instructions below to access these features.
Step 1: Login
Open a web browser and navigate to:
Once there, log in with your Raptor username and password:
You can click on the ‘Remember Me’ tick box to save your details for future access on that device.
Then click on ‘Sign In’.
Step 2: Choose a Project
Click the ‘project name’ link in the table shown in the screenshot below to view a project:
Step 3: Click “Audit”
There are several ways to navigate to the auditing pages. The quickest is to click the “audit” link from within a project, as shown in the screenshot below:
You can also access the auditing pages by clicking the site link shown in the image below, which will take you to the crawl data from the last crawl:
From here you can click the “SEO Audit” button (see below):
The third way to navigate to the SEO audit pages is to click the link in the side menu. You will need to have navigated to the site page to do this:
Step 4: View SEO Auditing Data
Once you have navigated to the auditing page you will be taken to the overview tab, which summarises the charts from all the other tabs:
The charts are split into sections with the same names as the tabs at the top of the page:
- Content
- Indexation
- Canonicals
- Meta Data
- Linking
- Technical
The other tabs contain additional data and descriptions useful for any type of technical SEO audit.
The content tab, shown in the screenshot below, provides a breakdown of the text and images on the site being audited:
The first table shows data on the number of HTML pages / URLs that were identified in the crawl. Most websites will have a range of URL types, such as images, CSS, JS, redirects and so on. The data shown here is across all HTML pages regardless of their canonical status or indexability.
For example, a site might have 300 HTML pages of which only 100 are canonical and indexable; the data on this tab will show the totals across all 300 pages. The ‘totals’ include:
- Word count
- Average number of words per page
- The total number of images, regardless of whether they are indexable (these are unique images: if an image appears on more than one page, it is counted only once)
- Average number of images per page
The second table shows the distribution of pages by the word count. In the screenshot above you can see that there are 255 pages that have between 501 & 1,000 words on them. The chart on the left is a visual representation of this data.
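The bucketing behind this kind of distribution can be sketched roughly as follows. This is an illustration only, not Raptor's implementation: the function names and all bucket boundaries are assumptions, apart from the 501 to 1,000 bucket mentioned above.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical inclusive upper bounds for each bucket. Only the
# 501-1,000 bucket is taken from the guide; the rest are assumed.
BUCKET_UPPER_BOUNDS = [100, 250, 500, 1000, 2000]


def bucket_label(word_count):
    """Return the label of the bucket that a page's word count falls into."""
    i = bisect_left(BUCKET_UPPER_BOUNDS, word_count)
    if i == len(BUCKET_UPPER_BOUNDS):
        # Above the highest bound: open-ended final bucket
        return f"{BUCKET_UPPER_BOUNDS[-1] + 1}+"
    lower = 0 if i == 0 else BUCKET_UPPER_BOUNDS[i - 1] + 1
    return f"{lower}-{BUCKET_UPPER_BOUNDS[i]}"


def word_count_distribution(word_counts):
    """Count how many pages fall into each word-count bucket."""
    return Counter(bucket_label(n) for n in word_counts)
```

For example, word_count_distribution([600, 700, 50]) places two pages in the "501-1000" bucket and one in the "0-100" bucket.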
We use a colour scheme to help identify issues:
- Red: Very thin content
- Orange: Borderline thin content
- Green: No issues with the amount of content
This is a guide rather than a ‘hard and fast’ rule, and helps to identify issues. You can download the data from this chart by clicking the download icon highlighted below.
By clicking the burger icon, you will be presented with a range of options for downloading an image of the chart in various formats, making it easy to put it into reports:
Both the download data and image options are available on almost all charts throughout the SEO auditing tabs.
The third table shows the distribution of pages by the number of images present on them. Unlike the first table, these are not unique images: if an image appears on multiple pages, it is counted multiple times.
The right-hand chart works in exactly the same way as the first chart but shows the distribution of pages by the number of images present on them.
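The difference between the unique image total in the first table and the per-page counts in the third table can be illustrated with a small sketch. The data structure (a mapping of page URL to the image URLs found on that page) and the function names are assumptions made for this example, as is the choice to count every image reference on a page.

```python
def unique_image_total(pages):
    """Total number of distinct images across the site (first table)."""
    return len({src for images in pages.values() for src in images})


def images_per_page(pages):
    """Number of images on each page, counted per page (third table)."""
    return {url: len(images) for url, images in pages.items()}


# Hypothetical crawl data: logo.png appears on both pages
pages = {
    "/a": ["logo.png", "hero.jpg"],
    "/b": ["logo.png"],
}
```

Here the shared logo.png is counted once in the unique total but once for each page in the per-page counts, so the per-page counts add up to more than the unique total.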
The indexation tab shows data on the indexability of web pages. We look at two areas here:
- NOINDEX tags
- Robots.txt disallow rules
We consider a page to be indexable if it does not have a noindex tag and is not disallowed by the robots.txt file. If either of these conditions is not met, the page is considered non-indexable.
We apply a filter to this tab to only show data for HTML pages, which means that this does not include images, CSS, JS or any other file type. In the downloadable reports, we provide data regarding the indexability of other file types.
It is worth noting that there may be other reasons why a page is not indexable that are not addressed in this summary.
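The indexability definition given above can be expressed as a short check. This is a minimal sketch using only the Python standard library, not the crawler's own code: the function names are hypothetical, and it assumes the noindex directive appears in a robots meta tag in the page HTML.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots" and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def has_noindex(html):
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def is_indexable(url, html, robots_txt, user_agent="*"):
    """Indexable = no noindex tag AND not disallowed by robots.txt."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return not has_noindex(html) and rules.can_fetch(user_agent, url)
```

As the guide notes, a page can be non-indexable for other reasons that a check like this does not cover.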
The table on this tab shows the following data:
- The number of pages that have a NOINDEX tag present on them
- The number of pages disallowed by the robots.txt file
- The percentage of pages on the site that are indexable (under the definition given above)
- The number of non-indexable URLs (under the definition given above)
The charts below represent this data visually, with the left chart showing the split between indexable and non-indexable URLs.
The canonicals tab shows multiple sets of canonical data, taken from the canonical tags found on HTML web pages during the crawl. A ‘canonical’ page is defined as an HTML page that has a self-referential canonical tag.
The first table shows:
- Number of canonical pages
- Number of non-canonical pages
- Number of pages missing a canonical tag
- The word count across canonical pages
- The average number of words on canonical pages
The chart on the left and the first table segment the data into three categories:
- Canonical page (as defined above)
- Non-canonical page
- Missing canonical tag
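The three categories can be sketched as a simple classification. This is an illustration under assumptions, not the crawler's actual logic: the function name is hypothetical, and ‘self-referential’ is taken here to mean that the canonical tag's URL, once resolved against the page URL, matches the page URL exactly.

```python
from urllib.parse import urljoin


def classify_canonical(page_url, canonical_href):
    """Return 'canonical', 'non-canonical' or 'missing' for a page."""
    if canonical_href is None or not canonical_href.strip():
        return "missing"
    # Resolve relative canonical URLs against the page URL before comparing
    resolved = urljoin(page_url, canonical_href.strip())
    return "canonical" if resolved == page_url else "non-canonical"
```

For instance, classify_canonical("https://example.com/a", "https://example.com/b") returns "non-canonical", because the tag points at a different URL.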
The word count distribution chart shows the number of canonical pages that fall into the buckets of word count. This is the same as on the ‘content’ tab but is filtered to show only canonical pages rather than all pages.
It is worth noting that because some sites do not have canonical tags present, we include pages with no canonical tag as ‘canonical’ within the second table and the chart on the right (see image above).
The last table, under the section named ‘canonical duplication’, shows the various types of canonical issue (see screenshot below):
The groupings are all types of canonical duplication:
- www: canonical URLs that use ‘www’
- non-www: canonical URLs that do not use ‘www’
- http: canonical URLs that use ‘http’
- https: canonical URLs that use ‘https’
- trailing slash: canonical URLs that use a ‘trailing slash’
- no trailing slash: canonical URLs that do not use a ‘trailing slash’
- uppercase: canonical URLs that use uppercase characters
The data is then represented in the chart below.
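As a rough illustration of these groupings, the sketch below assigns a canonical URL to each group it belongs to. The function name is hypothetical, and the uppercase check (any uppercase character anywhere in the URL as written) is an assumption about how that group is defined.

```python
from urllib.parse import urlsplit


def canonical_url_groups(canonical_url):
    """Return the set of canonical duplication groups a URL falls into."""
    parts = urlsplit(canonical_url)
    host = parts.hostname or ""  # hostname is returned in lowercase
    groups = set()
    groups.add("www" if host.startswith("www.") else "non-www")
    if parts.scheme in ("http", "https"):
        groups.add(parts.scheme)
    groups.add("trailing slash" if parts.path.endswith("/") else "no trailing slash")
    # Assumption: uppercase anywhere in the URL as written counts
    if any(ch.isupper() for ch in canonical_url):
        groups.add("uppercase")
    return groups
```

A site whose canonical URLs are spread across both members of a pair (for example, some in ‘www’ and some in ‘non-www’) is the kind of inconsistency this table helps to surface.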
The meta data tab shows the two types of meta data most used in an SEO audit. The first is meta descriptions. Meta description data is limited to the common error types:
- Missing: The meta description tag is missing (not present) on a page
- Duplicated: The content of the meta description tag is duplicated across multiple pages
- Too Long: The meta description tag exceeds 160 characters
- Too Short: The meta description tag is less than 100 characters (informational, rather than a true error)
- Multiple: There are multiple meta description tags present on the same page
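The checks listed above can be sketched as follows. The 160 and 100 character thresholds come from this guide; everything else is an assumption for illustration, including the function name, the input format (page URL mapped to the list of meta description strings found on that page), and the choice to evaluate only the first description when a page has several.

```python
from collections import Counter

MAX_LENGTH = 160  # from the guide: longer than this is 'Too Long'
MIN_LENGTH = 100  # from the guide: shorter than this is 'Too Short'


def meta_description_issues(pages):
    """Map each page URL to the set of meta description issues found."""
    # Assumption: when a page has several descriptions, use the first one
    first = {url: descs[0].strip() for url, descs in pages.items() if descs}
    usage = Counter(first.values())  # how many pages share each text

    issues = {}
    for url, descs in pages.items():
        found = set()
        if not descs:
            found.add("Missing")
        else:
            text = first[url]
            if len(descs) > 1:
                found.add("Multiple")
            if usage[text] > 1:
                found.add("Duplicated")
            if len(text) > MAX_LENGTH:
                found.add("Too Long")
            elif len(text) < MIN_LENGTH:
                found.add("Too Short")
        issues[url] = found
    return issues
```

A page can carry more than one issue at once, which is why each URL maps to a set rather than a single label.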
The second section covers page titles (also known as meta titles) and we include an analysis of the common errors associated with this tag:
- Missing: The page title is missing (not present) on a page
- Duplicated: The contents of the page title are duplicated across multiple pages
- Too Long: The page title exceeds 160 characters
- Too Short: The page title is less than 100 characters (informational, rather than a true error)
- Multiple: There are multiple page title tags present on the same page
The linking tab provides information about internal and external links found on HTML pages as part of the crawl.
The first section looks at internal links, which are links that point to a URL on the same domain. The first table shows data only for HTML pages:
- Broken internal links: The total number of links returning a status code other than 200 or 3XX
- Working internal links: The total number of links returning a status code of 200 or 3XX
- Total follow internal links: The total number of *follow links regardless of status code
- Total nofollow internal links: The total number of **nofollow links regardless of status code
- Pages with fewer than 4 follow links: The number of pages that have fewer than 4 *follow links
To download a full broken link report, you will need to go to the reporting page, where you can generate and download your report.
The second section shows similar data but for external links, which are links that point to a webpage on another domain.
- Broken external links: The total number of external links returning a status code other than 200 or 3XX
- Working external links: The total number of external links returning a status code of 200 or 3XX
- Total follow external links: The total number of *follow external links regardless of status code
- Total nofollow external links: The total number of **nofollow external links regardless of status code
- Pages with >5 Follow Links: The number of pages that have more than 5 *follow links
- *Follow links are the default link type. They allow authority to be passed through them to the target page.
- **Nofollow links have a piece of code, rel="nofollow", in them that prevents authority from being passed to the target page.
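The link definitions above (working versus broken, follow versus nofollow) can be summarised in a short sketch. The function names and the input format (one dictionary per link, with a 'status' code and an optional 'rel' attribute value) are assumptions made for this example.

```python
from collections import Counter


def is_working(status_code):
    """Working = status 200 or any 3XX; anything else counts as broken."""
    return status_code == 200 or 300 <= status_code <= 399


def is_nofollow(rel):
    """True if the link's rel attribute contains the 'nofollow' token."""
    return "nofollow" in (rel or "").lower().split()


def summarise_links(links):
    """Count working/broken and follow/nofollow links."""
    summary = Counter()
    for link in links:
        summary["working" if is_working(link["status"]) else "broken"] += 1
        summary["nofollow" if is_nofollow(link.get("rel")) else "follow"] += 1
    return summary
```

Note that the follow and nofollow totals are counted regardless of status code, matching the definitions in the tables above.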
The technical tab shows two types of technical data. The first is page speed, represented as load times in seconds. For the page speed section, we only show data for HTML pages.
The first table shows the following data:
- Average load time (seconds) across all HTML pages that were crawled
- Number of status 200 HTML pages (meaning that the pages are accessible)
- Percentage of status 200 HTML pages of all pages crawled
- The number of redirects (includes all types of redirects such as 301 and 302)
- The number of inaccessible URLs (includes all file types, not limited to HTML pages)
The second table shows the distribution of pages by load time, which are grouped into buckets of 1 second. For example, in the screenshot above you can see that all HTML pages load in 1 second or less. These data are visualised in the chart also shown in the screenshot above.
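Grouping load times into 1-second buckets can be sketched as below. The function names and bucket labels are hypothetical, and the sketch assumes each bucket includes its upper bound, so that a page loading in exactly 1 second falls into the first bucket.

```python
import math
from collections import Counter


def load_time_bucket(seconds):
    """Return the 1-second bucket label for a load time in seconds."""
    # Assumption: buckets include their upper bound, e.g. 1.0 -> "0-1s"
    upper = max(1, math.ceil(seconds))
    return f"{upper - 1}-{upper}s"


def load_time_distribution(load_times):
    """Count how many pages fall into each load-time bucket."""
    return Counter(load_time_bucket(t) for t in load_times)
```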
The second section looks at status codes:
The status codes, also called response codes, show whether a page is accessible or not:
- 200: Accessible
- 3XX: Redirects to another page (not accessible)
- 4XX: Page cannot be found (not accessible)
- 5XX: Server error (not accessible)
- Other: (not accessible)
These are colour coded to help you identify issues:
- Green: No problems
- Blue: Informational
- Red: Problem that needs resolving
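The status code groupings listed in this section can be expressed as a small lookup. This is a sketch only: the function names are hypothetical, and treating every code outside 200, 3XX, 4XX and 5XX as ‘Other’ is an assumption based on the list above.

```python
def status_category(code):
    """Map an HTTP status code to the category used on the technical tab."""
    if code == 200:
        return "200"
    if 300 <= code <= 399:
        return "3XX"
    if 400 <= code <= 499:
        return "4XX"
    if 500 <= code <= 599:
        return "5XX"
    return "Other"


def is_accessible(code):
    """Only status 200 is treated as accessible in this section."""
    return code == 200
```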