The JS that is downloaded with the first request is executed on the client’s own machine, and hence these interactions load much more quickly. JS can also be used to send remote requests that return just the data required to dynamically update a small portion of content on a page via AJAX. Once again, this speeds up the load time of a page because a smaller amount of data is sent with each request.
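To make the AJAX idea concrete, here is a minimal sketch (in TypeScript) of requesting only the data needed and updating one element of the page rather than reloading the whole document; the /api/products endpoint and product-list element are hypothetical.

```typescript
// Minimal sketch of the AJAX pattern: request only the data needed and update
// a small part of the page instead of reloading it.
// The "/api/products" endpoint and "product-list" element are hypothetical.
async function refreshProductList(): Promise<void> {
  const response = await fetch("/api/products"); // small JSON payload, not a full page
  const products: { name: string }[] = await response.json();

  const list = document.getElementById("product-list");
  if (list) {
    // Re-render just this element; the rest of the page is untouched
    list.innerHTML = products.map((p) => `<li>${p.name}</li>`).join("");
  }
}

refreshProductList();
```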
This gives developers the opportunity to produce web pages that are fast, dynamic and very user friendly. For example, at the low end, clicking a button or selecting an option from a drop-down menu could change the content on a page, the data in a table, or anything else you want. At the high end, it allows single-page applications to be developed, so that a web app can feel like a desktop app running locally. This is great, as it helps improve the user experience and delivers a level of interaction that is difficult or impossible to achieve with HTML alone.
Rendering JS is more computationally expensive than just downloading and parsing plain HTML, and hence crawling tools often charge a premium for it. Therefore, it’s not a good idea to simply leave this option on at all times. Next, we will discuss how to check whether you need it turned on for a particular site.
The easiest way to determine whether a site requires JS rendering is to disable JS in your web browser, since rendering is performed by the browser itself. The exact steps vary by browser, but in most cases you simply need to navigate to your browser’s settings and find the switch that turns JS off. Once you have disabled JS, reload (or load) the site you want to check and see what happens.
If the site renders at all, try navigating it: check that the links work, that images and text are visible, and that the functionality still behaves as expected. If you can still see and use the site as you would with JS enabled, you can assume that, for the most part, the site doesn’t depend on JS. There may be aspects of the site that use JS, such as tracking code, AdSense and similar components. There may even be some pages or directories that use JS while others do not, so this method is good up to a point but falls over on very large sites.
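For larger sites, a rough programmatic version of this check can help: fetch the raw HTML as a non-rendering crawler would see it, then render the same URL in headless Chrome and compare the two. The sketch below assumes Node 18+ (for the built-in fetch) and the puppeteer package; the URL is a placeholder.

```typescript
import puppeteer from "puppeteer";

// Compare the raw server HTML with the DOM after JS has executed in headless
// Chrome. A large difference suggests the page relies on client-side rendering.
async function compareRawVsRendered(url: string): Promise<void> {
  // Raw HTML, as a crawler with JS rendering turned off would see it
  const rawHtml = await (await fetch(url)).text();

  // Rendered DOM, after headless Chrome has executed the page's JS
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const renderedHtml = await page.content();
  await browser.close();

  console.log(`Raw HTML length:      ${rawHtml.length}`);
  console.log(`Rendered HTML length: ${renderedHtml.length}`);
}

compareRawVsRendered("https://example.com");
```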
Server-Side Vs Client-Side Rendering
If Google or other web crawlers are unable to render the JS, they may not be able to read the content or find and follow the links to other content. If Google cannot see the content, it is unlikely to rank the page in the search results. If Google cannot find and follow links, it will not be able to find the content on your site. In either case, there is strong potential for indexation problems.
Hence, using an SEO tool that enables you to crawl and scrape website data even on a site that requires JS rendering is essential for most SEOs.
Google will not do certain things, such as rendering mouseover JS or scrolling; consequently, a range of JS solutions will result in components of a page not being indexed. Google recommends using a hybrid of HTML and JS to build sites, as pure JS sites may still have indexation issues. Some other things to consider are:
Google doesn’t click around your site in the way a user does; consequently, content loaded by additional events after the render (click, hover and scrolling, for example) is not going to be rendered
All the page resources (such as JS, CSS, images, etc.) need to be available in order for the page to be crawled, rendered and then indexed
The rendered page snapshot is estimated to be taken at around 5 seconds, although there may be some flexibility within this. If a page takes too long to render (typically longer than 5 seconds), there is a risk that some elements won’t be seen, rendered or, consequently, indexed
Images that use lazy loading will not be indexed if they only load on scroll, as this requires scrolling (see the sketch after this list)
Google’s rendering is performed separately from the indexing process. Initially, Google crawls the static HTML of a website and defers rendering until it has the resources available to do so. Only after rendering will Google discover any further content and links, and this can take anywhere from a few days to a week
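To illustrate the lazy-loading point above, the sketch below shows a common scroll-triggered pattern in which the real image URL is only swapped in once the element scrolls into view; a renderer that never scrolls will only ever see the placeholder, so the real image is never loaded or indexed. The img.lazy selector and data-src attribute are illustrative, not a specific library’s API.

```typescript
// A common scroll-triggered lazy-loading pattern: the real image URL is only
// set once the element scrolls into view. A renderer that never scrolls will
// only ever see the placeholder src, so the real image is never loaded.
// The "img.lazy" selector and "data-src" attribute are illustrative.
const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (entry.isIntersecting) {
      const img = entry.target as HTMLImageElement;
      img.src = img.dataset.src ?? img.src; // swap in the real image
      obs.unobserve(img);
    }
  }
});

document.querySelectorAll<HTMLImageElement>("img.lazy").forEach((img) => {
  observer.observe(img);
});
```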
WRS & PRS
The Web Rendering Service (WRS) and Page Rendering Service (PRS) are the terms Google uses when referring to how it renders website content.
We use the latest stable Chrome Driver to render website content, which is exactly how Chrome users render websites and their content. Google tackles this task in the same way; as such, you should get similar, if not identical, results using Raptor to those you would get from Googlebot.
Because this process is more computationally expensive, we deduct 4 URLs from your monthly limit for each URL crawled. This means that crawling a site with 10K URLs will deduct 40K URLs from the limit, which is refreshed / replenished at the beginning of each new monthly billing cycle. Unlike some of our competitors, we do not withhold this feature from cheaper subscription plans!
To limit usage, you can turn off the crawling of other file types such as images and CSS files, or you can control where the crawler goes by restricting the crawl to a specified directory / sub-directory. It is also possible to prevent the crawler from crawling subdomains to save on your URL usage.
You can check how many URLs you have used and have remaining by going to the usage page within the software. From here you can see where you are using URLs, how frequently, how many you have left, and when they will renew.
Much like Google, we will not scroll down a page when rendering page components; any components that require scrolling will not be seen, rendered or crawled.
We do not render mouseover JS components or clickable components that load additional content on the same page (URL).
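By way of illustration, content wired up like the sketch below, fetched only when a user clicks, will not appear in a rendered snapshot because neither Googlebot nor our crawler fires the click event. The button ID, container ID and /api/reviews endpoint are hypothetical.

```typescript
// Content loaded only after a click will not appear in a rendered snapshot,
// because the renderer never fires the click event.
// The "show-reviews" button, "reviews" container and "/api/reviews" endpoint
// are hypothetical.
document.getElementById("show-reviews")?.addEventListener("click", async () => {
  const response = await fetch("/api/reviews");
  const reviews: { author: string; text: string }[] = await response.json();

  const container = document.getElementById("reviews");
  if (container) {
    container.innerHTML = reviews
      .map((r) => `<blockquote>${r.text}<cite>${r.author}</cite></blockquote>`)
      .join("");
  }
});
```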