- XML News sitemaps
- News Sitemap Benefits
- Example Code
- News Sitemap Tags
- Google Search Console
- Common Mistakes
XML news sitemaps are vital for the indexation of your news stories by search engines; they should be a concise list of all the news stories that you want to be indexed.
When we refer to ‘news sitemaps’ we are really referring to ‘Google News Sitemaps’. These sitemaps provide Google with metadata about your news stories and content.
This gives you or the site admin control over which stories are picked up by Google News, but it does not, on its own, mean that you will start ranking in Google News.
In short, the News Sitemap allows Google to more efficiently find your news articles.
Ensure the XML file is properly formatted and passes validation tests. Simply run the file through a sitemap validator to test. Once submitted to Google Search Console, Google will inform you of problems with the XML sitemap but it is better to pre-empt this with your own tests first.
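A dedicated validator does the real work, but a quick local check can catch basic problems before you submit anything. The following is a minimal sketch in Python, assuming the standard sitemap and Google News namespaces; the function name `check_news_sitemap` is hypothetical, and the checks are far lighter than a full schema validation.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the sitemap protocol and the Google News extension.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}


def check_news_sitemap(xml_bytes):
    """Return a list of problems; an empty list means these basic checks passed.

    Hypothetical helper: it only tests well-formedness and a few required
    elements, not the full sitemap schema.
    """
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{NS['sm']}}}urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for i, url in enumerate(root.findall("sm:url", NS), start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS) or ""
        if not loc.strip():
            problems.append(f"entry {i}: missing <loc>")
        if url.find("news:news", NS) is None:
            problems.append(f"entry {i}: missing <news:news>")
    return problems
```

An empty result only means the file passed these few checks; it is still worth running it through a full sitemap validator afterwards.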
The code below shows the opening XML declaration and several of the tags from a single entry in an XML news sitemap. It is worth noting that Google uses all of the tags in the code below:
<?xml version="1.0" encoding="UTF-8"?>
<news:name>Example brand name or publication</news:name>
<news:title>Example title of story</news:title>
<news:keywords>example, keywords, go, here</news:keywords>
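The XML above does not show the enclosing elements that a complete file needs. As a rough sketch of how a full entry fits together, the Python below builds one with the standard library, assuming the usual `urlset` / `url` / `loc` wrapper from the sitemap protocol and the Google News namespace. The function name, the dictionary keys and the example URL are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"

# Register prefixes so the output uses a default namespace and <news:...> tags.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)


def build_news_sitemap(articles):
    """Return a news sitemap as a string.

    `articles` is a list of dicts; the key names are an assumption made
    for this sketch, not part of any specification.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["loc"]
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(publication, f"{{{NEWS_NS}}}name").text = article["name"]
        ET.SubElement(publication, f"{{{NEWS_NS}}}language").text = article["language"]
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = article["publication_date"]
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]
        ET.SubElement(news, f"{{{NEWS_NS}}}keywords").text = article["keywords"]
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example values reuse the placeholders from the fragment above.
sitemap_xml = build_news_sitemap([{
    "loc": "https://www.example.com/news/example-story",  # hypothetical URL
    "name": "Example brand name or publication",
    "language": "en",
    "publication_date": "2017-08-10",
    "title": "Example title of story",
    "keywords": "example, keywords, go, here",
}])
print(sitemap_xml)
```

Building the file with an XML library rather than string concatenation means special characters in titles (such as `&`) are escaped automatically.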
News Sitemaps, like all other XML sitemaps, have a range of tags and attributes that act as metadata for your news content. Below, we describe in more detail what each of the above tags means and how it should be used.
The <news:publication> tag allows you to specify the name of the publication that is posting the news story.
The publication tag has two mandatory child tags:
- <news:name> = The publisher (e.g. The New York Times)
  - The name must match how it appears on your articles in news.google.com (omitting any trailing parentheticals)
- <news:language> = The language the article is written in, as an ISO 639 language code (either 2 or 3 letters)
  - The only exception is Chinese, in which case you need to use zh-cn for Simplified Chinese or zh-tw for Traditional Chinese
These tags have no influence over the labels you have defined in the Publisher Centre.
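The language rule above can be checked mechanically before a sitemap is generated. This is a small sketch in Python; the function name is hypothetical, and it only checks the shape of the code (2 or 3 letters, or one of the two Chinese variants), not whether the code is actually listed in ISO 639.

```python
import re

# Shape check only: 2 or 3 ASCII letters, as described for ISO 639 codes.
LANGUAGE_PATTERN = re.compile(r"[a-z]{2,3}")
# The two Chinese variants named in the text above.
CHINESE_CODES = {"zh-cn", "zh-tw"}


def looks_like_news_language(code):
    """Return True if `code` has the shape of a valid language value."""
    code = code.strip().lower()
    return code in CHINESE_CODES or LANGUAGE_PATTERN.fullmatch(code) is not None


print(looks_like_news_language("en"))      # True
print(looks_like_news_language("zh-cn"))   # True
print(looks_like_news_language("en-gb"))   # False
```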
Requirement: Required (where applicable)
The <news:genres> tag allows you to specify the genres that your news story belongs to. The value must be a comma-separated list of the genre values that Google News defines.
Because this property affects how users experience your articles in Google News, it is essential that the genres you specify accurately reflect each article.
The <news:publication_date> tag is where you specify the date of publication of the news story:
- The date must be in W3C format.
There are several options for the date format depending on how granular you would like to go:
- This is the complete date:
YYYY-MM-DD (e.g. 1997-07-16)
- This is the complete date, plus the hour and minute of publication:
YYYY-MM-DDThh:mmTZD (e.g. 2017-08-10T17:49+11:00)
- This is the complete date, plus the hour, minute and second of publication:
YYYY-MM-DDThh:mm:ssTZD (e.g. 2017-08-10T17:49:30+11:00)
- This is the complete date, plus the hour, minute and second of publication, with the seconds given to a decimal fraction:
YYYY-MM-DDThh:mm:ss.sTZD (e.g. 2017-08-10T17:49:30.45+11:00)
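The formats above can be produced directly from a timezone-aware timestamp. Here is a minimal sketch in Python using only the standard library; the helper names `w3c_date` and `w3c_datetime` are hypothetical, and the example timestamp reuses the values from the list above.

```python
from datetime import datetime, timedelta, timezone


def w3c_date(dt):
    """Complete date only: YYYY-MM-DD."""
    return dt.date().isoformat()


def w3c_datetime(dt, timespec="seconds"):
    """Date plus time and offset; timespec is 'minutes', 'seconds' or 'milliseconds'."""
    if dt.utcoffset() is None:
        # The TZD part of the format needs a known UTC offset.
        raise ValueError("publication time must be timezone-aware")
    return dt.isoformat(timespec=timespec)


published = datetime(2017, 8, 10, 17, 49, 30, tzinfo=timezone(timedelta(hours=11)))
print(w3c_date(published))                 # 2017-08-10
print(w3c_datetime(published, "minutes"))  # 2017-08-10T17:49+11:00
print(w3c_datetime(published))             # 2017-08-10T17:49:30+11:00
```

Rejecting naive timestamps is a deliberate choice in this sketch: a time without an offset is ambiguous, so it is safer to fail early than to publish the wrong publication time.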
The <news:title> tag is where you specify the title of the news story:
- You can truncate the title if it is too long.
- The title tag should contain the title of the news article as it appears on your site.
Please make sure not to include any of the following with the title tag:
- The author’s name
- The publication’s name
- The date of publication
The <news:keywords> tag is where you can specify a list of relevant keywords that describe the topic of the news article:
- Keywords should be in the form of a comma-separated list.
- Keywords can be taken from the list of existing Google News keywords, or you can make up your own.
The <news:stock_tickers> tag is where you can specify the stock tickers to which your financial news story refers:
- The tickers must be in a comma-separated list.
- You can use up to 5 stock tickers. These can be for companies, mutual funds, or other financial entities and institutions that are the principal topics of your news story.
- This tag is relevant primarily for business articles.
- Each ticker must be prefixed by the name of its stock exchange and must match its entry in Google Finance, for example:
  - "NASDAQ:AMAT" (but not "NASD:AMAT")
  - "BOM:500325" (but not "BOM:RIL")
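Some of the ticker rules above can be enforced in code before the value is written to the sitemap. The sketch below is in Python; the function name and the regular expression are assumptions made for illustration. It checks the five-ticker limit and the `EXCHANGE:SYMBOL` shape, but it cannot tell whether a ticker matches its Google Finance entry (it would accept "NASD:AMAT"), so that still needs a manual check.

```python
import re

MAX_TICKERS = 5
# Assumed shape: exchange name, a colon, then the symbol.
TICKER_PATTERN = re.compile(r"[A-Z0-9]+:[A-Z0-9.\-]+")


def format_stock_tickers(tickers):
    """Validate a list of tickers and return the comma-separated tag value."""
    cleaned = [t.strip() for t in tickers if t.strip()]
    if len(cleaned) > MAX_TICKERS:
        raise ValueError(f"at most {MAX_TICKERS} tickers are allowed, got {len(cleaned)}")
    for ticker in cleaned:
        if TICKER_PATTERN.fullmatch(ticker) is None:
            raise ValueError(f"ticker is not in EXCHANGE:SYMBOL form: {ticker!r}")
    return ", ".join(cleaned)


print(format_stock_tickers(["NASDAQ:AMAT", "BOM:500325"]))
# NASDAQ:AMAT, BOM:500325
```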
The following guidelines should be reviewed and considered if you are building a News Sitemap for your site:
A News Sitemap should contain only news articles published in the last two days. Even after you remove articles older than this two-day limit from the sitemap, they will remain in the Google News index for the standard 30-day period.
Google encourages you to update your News Sitemap “continually” with new articles as they are created and published. Google will crawl your News Sitemaps as frequently as it crawls the rest of your site. Typically, the more frequently you publish, the more frequently Google will check back to see what has changed.
This and the first point (above) strongly indicate that you should keep your news sitemap up to date on a much more frequent basis than you would normally consider.
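One way to keep the sitemap within the two-day window is to filter articles by publication time every time the sitemap is regenerated. The following is a minimal sketch in Python, assuming each article is a dict with a timezone-aware `published` timestamp; the function name and the dict keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=2)  # the two-day window described above


def recent_articles(articles, now=None):
    """Return only the articles published within the last two days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - MAX_AGE
    return [a for a in articles if a["published"] >= cutoff]


# Fixed "now" so the example is reproducible.
now = datetime(2017, 8, 10, 12, 0, tzinfo=timezone.utc)
articles = [
    {"loc": "https://www.example.com/news/fresh", "published": now - timedelta(hours=5)},
    {"loc": "https://www.example.com/news/stale", "published": now - timedelta(days=3)},
]
print([a["loc"] for a in recent_articles(articles, now=now)])
# ['https://www.example.com/news/fresh']
```

Running a filter like this on every publish, or on a short schedule, keeps the file both fresh and small.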
As with all XML sitemaps, there are limits on how big the file can be. To optimize crawl budget and improve indexation, you should follow the recommendations below:
- A News Sitemap can contain a maximum of 1,000 news stories / URLs.
- Larger sites may need to implement multiple Sitemaps, and use a Sitemap index / master file to manage them. You can use the XML format detailed in the Sitemap protocol.
- The Sitemap index / master file has a maximum limit of 50,000 Sitemaps.
The limits set out above help prevent your web server from being overloaded by serving large files to Google News.
Splitting sitemaps also makes troubleshooting easier: if the news stories in one sitemap are not being indexed, you can narrow the problem down to that sitemap. There are many benefits to splitting up sitemaps in this way.
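The limits above translate into a simple chunking step plus an index file. Here is a sketch in Python, assuming the `sitemapindex` / `sitemap` / `loc` structure from the Sitemap protocol; the function names and the example file URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_NEWS_SITEMAP = 1000   # limit quoted above
MAX_SITEMAPS_PER_INDEX = 50000     # limit quoted above

ET.register_namespace("", SITEMAP_NS)


def chunk_urls(urls, size=MAX_URLS_PER_NEWS_SITEMAP):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document that points at each sitemap file."""
    if len(sitemap_urls) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("too many sitemaps for a single index file")
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# 2,500 hypothetical story URLs end up in three sitemap files.
story_urls = [f"https://www.example.com/news/story-{n}" for n in range(2500)]
chunks = chunk_urls(story_urls)
index_xml = build_sitemap_index(
    [f"https://www.example.com/news-sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
print([len(chunk) for chunk in chunks])  # [1000, 1000, 500]
```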
Sitemap structure should reflect site structure, for example:
- Master Category (sitemap) – contains links to all other category sitemaps
- Category 1 (sitemap) – contains all news stories in this category
- Category 2 (sitemap) – contains all news stories in this category
Group similar content or content located in a similar location into sitemaps to create consistency and logical structure.
It sounds obvious, but serve your URLs from a consistent set of sitemaps. This means that you simply update your existing sitemaps with more pages rather than adding entirely new sitemaps. Creating new sitemaps would cause a range of unnecessary problems for you and your site and is against best practice.
- Use the Raptor Sitemap generator to create your news sitemaps and avoid all potential problems. Or use our auditor to identify any potential problems with your news sitemap.
We cover this in more detail in another part of the knowledge base that addresses SEO tools and how to use them. For now, it is worth noting that the news sitemap should be submitted to Google Search Console (formerly Google Webmaster Tools).
We have listed below the most common mistakes that we see in news sitemaps.
Ensure the sitemap is up to date and does not include duplicate entries or links to pages that are no longer present on the site. This can be an issue if you use a tool to create a sitemap and do not filter out duplicate entries, such as URLs that differ only by a trailing slash or by upper and lower case characters.
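Near-duplicate URLs of this kind can be flagged automatically. The Python sketch below groups URLs that differ only by a trailing slash or by letter case; the function names are hypothetical. Because URL paths are technically case-sensitive, the groups it returns are candidates for review rather than guaranteed duplicates.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def duplicate_key(url):
    """Normalise a URL so trailing slashes and letter case do not matter."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/").lower()
    return (parts.scheme.lower(), parts.netloc.lower(), path, parts.query)


def find_duplicates(urls):
    """Return groups of URLs that share the same normalised key."""
    groups = defaultdict(list)
    for url in urls:
        groups[duplicate_key(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]


urls = [
    "https://www.example.com/News/Story/",
    "https://www.example.com/news/story",
    "https://www.example.com/news/other-story",
]
print(find_duplicates(urls))
# [['https://www.example.com/News/Story/', 'https://www.example.com/news/story']]
```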
Do not include pages in the news sitemap that you have disallowed from being crawled in robots.txt; this wastes Google’s time and provides no benefit to you. In fact, if this happens in bulk it can also cause other problems, such as increased volumes of news stories not being indexed.
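This check can be automated with the robots.txt parser in the Python standard library. The sketch below is illustrative only: the function name is hypothetical, the robots.txt content is passed in as a string rather than fetched, and the default user agent of "Googlebot" is an assumption you may want to change.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and sitemap URLs.
robots_txt = """\
User-agent: *
Disallow: /private/
"""
sitemap_urls = [
    "https://www.example.com/news/story",
    "https://www.example.com/private/draft-story",
]
print(blocked_urls(robots_txt, sitemap_urls))
# ['https://www.example.com/private/draft-story']
```

Any URL this returns should either be removed from the news sitemap or unblocked in robots.txt, depending on which of the two reflects your intent.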
Specifying incorrect values for tags such as ‘loc’ or ‘title’ can slow or impede proper indexing of the pages by Google.
Having too many links in a sitemap makes troubleshooting indexing errors difficult. In addition, if the sitemap as a whole is too big, it can cause performance issues.