Step 2.34: Crawlable Pages
Basic Concepts | Indexation
Direct Ranking Factor: Critical
You can control which pages, directories, and file types on your site Google is allowed to crawl.
A crawlable page is any page that Google or other search engines are able to access and read.
The robots.txt file is a plain text file that sits at the root of your domain; you can see ours, and where it is located, via the link below:
It is literally a text file, hence the .txt filename extension. Among other things, it tells Google (and other robots) which pages they should and should not crawl.
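As a minimal sketch of what such a file looks like, here is an illustrative robots.txt (the paths and sitemap URL are placeholders, not rules to copy as-is):

```
# Applies to all crawlers
User-agent: *
# Block an entire directory (hypothetical path)
Disallow: /private/
# Block a file type anywhere on the site (Google supports * and $ wildcards)
Disallow: /*.pdf$
# Everything else remains crawlable (allow is the default)
Allow: /

# Point crawlers at your XML sitemap (hypothetical URL)
Sitemap: https://www.example.com/sitemap.xml
```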
Pages that cannot be crawled will not be indexed. Consequently, the robots.txt file needs to be formatted, structured, and served properly.
Because rules can target file types, URLs containing a particular character string, or even entire directories, it is easy to accidentally make some of your resources uncrawlable, and therefore very unlikely to be indexed.
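One way to spot-check whether a given URL is blocked by your live robots.txt is Python's standard-library parser. Note that it implements the original robots exclusion standard and may not match Google's wildcard handling exactly, so treat it as a rough check; the domain and URLs below are placeholders:

```python
from urllib import robotparser

# Hypothetical site; replace with your own domain.
ROBOTS_URL = "https://www.example.com/robots.txt"

parser = robotparser.RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # fetches and parses the live robots.txt

# Spot-check a few URLs against the rules for Googlebot.
for url in [
    "https://www.example.com/",
    "https://www.example.com/private/report.pdf",
]:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{url} -> {'crawlable' if allowed else 'blocked'}")
```

For Google-specific behavior, the robots.txt report in Google Search Console is the more authoritative place to test individual URLs.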