Diving deep into Product Knowledge Panels

Let’s take a first look at Product Knowledge Graph Panels and how they can impact the organic traffic of your eCommerce website. In this article I will share everything I learned about this type of rich result.

Before dissecting this new panel, let’s try to define it. Product Knowledge Graph Panels are information boxes that appear on Google when you search for products that are in the Knowledge Graph. They were first introduced on mobile back in November 2017.

Much like any other Knowledge Panel, a Product Knowledge Panel is meant to give you a quick snapshot of information on a specific product entity, based on Google’s understanding of the available content on the web.

Here is an example of an interactive filter on the size of a Weber Kettle (18, 22 or 23 inches?)
Anatomy of a Product Knowledge Panel

This is by far the most interactive rich result I have ever seen. The information is automatically generated, comes from various sources across the web, and gives you quick access to: 

  • Product Details (descriptions, features, material, size, weight and a lot more depending on the product) 
  • Link to the official product page (this is a link to the product page on the manufacturer’s website)
  • Reviews (all the reviews from different websites, organized around the most frequently asked questions – this is powerful but, hey, no links here!) 
  • Price Comparison (the list of all the stores that sell that product online) 

Panels are updated as information changes on the web and the layout – depending on the product – might also include:

  • Personalized Ads (these are sponsored links that take you to the different online shops. I saw different behaviours here: ads that are separated from the panel – appearing before the panel or ads that are embedded in the “Stores” section) 
  • Product videos (these are introductory product videos mostly coming from YouTube that present the product)
  • Accessories (these are related products that can be purchased along with the main product – in the markup the property to use is isAccessoryOrSparePartFor, which lets you link accessories to a main product – see the sketch after this list) 
  • Critic reviews (these are direct links to an evaluation of the product written by an accredited author; it is important to use schema.org/Review or the review property nested inside schema.org/Product)
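
To make the accessories link concrete, here is a minimal sketch of what such markup could look like, written as a Python dictionary serialized to JSON-LD; the product names are made up for illustration:

```python
import json

# Hypothetical accessory markup: the isAccessoryOrSparePartFor property
# points from the accessory to the main product it belongs to.
accessory = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Premium Kettle Cover",          # made-up accessory
    "isAccessoryOrSparePartFor": {
        "@type": "Product",
        "name": "Weber Original Kettle"      # the main product
    }
}
print(json.dumps(accessory, indent=2))
```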

How to optimize your content for Product Knowledge Panels

A few things are very clear so far. 

1. Write product reviews and make sure to match the product that you are targeting

If you produce product reviews you have a new opportunity to gain qualified traffic, as long as you use structured data markup for Product or for Review (targeting a Product). Products can be reconciled using the GTIN code – a globally unique 14-digit number that identifies trade items, products, or services. You can also add a sameAs link to the entity in Google’s Knowledge Graph (e.g. the Weber Original Kettle 18” has its own machine-readable ID, kgmid=/g/11b6rg_zd5, that you can use in the schema markup). 
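
Here is a minimal sketch of how the two identifiers could sit together in the markup of a review article. The GTIN and author are placeholders, and referencing the kgmid via a Google search URL is one common convention, not an official requirement:

```python
import json

# A Product reviewed in an article, reconciled via GTIN and linked to
# the entity in Google's Knowledge Graph through sameAs.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Weber Original Kettle 18\"",
    "gtin14": "00000000000000",  # placeholder: use the real 14-digit GTIN
    "sameAs": "https://www.google.com/search?kgmid=/g/11b6rg_zd5",
    "review": {
        "@type": "Review",
        "author": {"@type": "Person", "name": "Jane Doe"},  # placeholder author
        "reviewRating": {"@type": "Rating", "ratingValue": "4", "bestRating": "5"}
    }
}
print(json.dumps(product, indent=2))
```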

Accredited reviews might get featured and have direct links

2. Embed YouTube videos in articles using structured data

You can also gain great visibility by embedding YouTube videos in your articles using the VideoObject markup. Articles with embedded videos can rank higher than the canonical video itself (as long as the content that you produce adds value to the video) 👇
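
Here is a hedged sketch of an article embedding a YouTube video through VideoObject; all URLs and dates are placeholders:

```python
import json

# An Article with an embedded video described via VideoObject. Google
# asks for name, description, thumbnailUrl and uploadDate on videos.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to choose your kettle grill",
    "video": {
        "@type": "VideoObject",
        "name": "Kettle grill overview",
        "description": "A short introduction to the product.",
        "thumbnailUrl": "https://example.com/thumbnail.jpg",
        "uploadDate": "2020-01-01",
        "embedUrl": "https://www.youtube.com/embed/VIDEO_ID"
    }
}
print(json.dumps(article, indent=2))
```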

3. Use structured data markup on all your product pages (and if you are the manufacturer…don’t forget to use the Google Manufacturer Center)

All the manufacturers will get a direct link to their product pages and, yes, all of this might come at the expense of retailers that – on some queries at least – will more likely be forced to spend money on advertising.

Manufacturers can also send product data using the Google Manufacturer Center, one of the most immediate and direct ways to send authoritative product information to Google. Bing has similar functionality but, at the time of writing, the website wasn’t available.

Adding structured data is essential and can help you earn the spot that you deserve. One example among many: if the panel features the “Accessories” tab, schema markup will help Google create the link between the main product and the articles that present the product’s accessories (hence driving more clicks to these websites).

Here is an example of links that you can earn by helping Google understand a product’s ecosystem.

4. While critic reviews help, customer reviews will not get you clicks

As of today, user-generated reviews are aggregated in the panel from multiple websites and do not link to the source. The content is very well organized (English only) around queries that Google automatically generates from all the reviews it acquires (the magical power of NLP). 

Conclusions and more insights

Product Panels are completely new and provide the searcher with an immersive experience on Google Search. They are designed as a one-stop shop for products that you can buy online. Google is most probably experimenting a lot with these new rich results (I see things changing frequently, especially in relation to product ads) and metrics are still volatile. 

That said, I did a quick analysis on a manufacturer website on a cluster of 100 queries on Google.com in the US. First a disclaimer: 100 queries from a single website is a small sample and, while it cannot be considered statistically significant, it does shed some light. 

Here is what I discovered: 

  • Panels tend to appear on the most trafficked queries; more than half of the queries that I analysed already had a panel! 
  • Impressions, when there is a panel, tend to jump (+110.85% more impressions on average when compared to queries where there is no panel) – remember this is a manufacturer, so they are always present in the panel.
  • CTR goes down (-10.7% in comparison to queries with no panel) as we receive more impressions.
Product Knowledge Panel Impact as seen on GSC
KGPP (there is a Panel) – n.a. (there is no panel)

Curious to know more about Product Panels and how to improve organic traffic on your eCommerce website?
Let's talk!

The most important lessons learned

  1. Impressions have a growing importance in evaluating your business performance: as Google drives more engagement on its SERP, your site might get fewer clicks but you can still close the sale;
  2. To close the sale and to leverage Knowledge Panels for Products you need to create (or aggregate) great content around products in the form of reviews, videos and top-notch articles that compel users to purchase. A smooth online experience is made of multiple steps and these need to be coherent and…well connected.

High-quality structured linked data along with a consistent content model can truly make a difference in helping Google understand your entities and their relationships – and in the end this will help you sell more!

Willing to review the schema markup on your eCommerce website? Get in contact with our SEO management service team now!

Google Entity Carousels: How to Earn Your Spot

Google Carousels for ranked entities appear when you make searches like “top 10 seo tools”. These are entities presented as cards in the Google SERP that can be used to further refine search results. 

Google Entity Carousel: a list of ranked entities belonging to the same category.

In this article I will share a use case and the lessons learned while managing to add an entity back into one of these carousels after it had been missing for a long time.

What is a Google carousel? 

A carousel is a list of entities that users can swipe through to discover entities that belong to the same category. It is widely used in local search results to help users find their favourite restaurant, a top-rated hotel with a pool or the best night club. 

What is a host carousel? 

When multiple cards are presented from the same site, it is called a host carousel. On Google, we can use structured data to group entities belonging to the following types: Course, Movie, Recipe and Restaurant. There are two options when grouping content: summary pages (these use the schema.org ItemList class with ListItem elements that point to separate URLs) and a single, all-in-one-page list where all the elements are on the same page.

A Google host carousel for recipes
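
As a quick illustration, here is a minimal sketch of the summary-page option: an ItemList whose ListItem elements point to separate detail pages (URLs are placeholders):

```python
import json

# A summary page grouping three recipes for a host carousel. For the
# summary-page variant, each ListItem only needs position and url.
item_list = {
    "@context": "https://schema.org",
    "@type": "ItemList",
    "itemListElement": [
        {"@type": "ListItem", "position": 1, "url": "https://example.com/recipe-one"},
        {"@type": "ListItem", "position": 2, "url": "https://example.com/recipe-two"},
        {"@type": "ListItem", "position": 3, "url": "https://example.com/recipe-three"}
    ]
}
print(json.dumps(item_list, indent=2))
```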

Resuscitating a long-gone entity in Google’s Graph

Let’s start at the beginning of the story. We were contacted by Jean-Francois Rousset, the SEO Manager for the Stamford American International School in Singapore. Jean-Francois runs his own consultancy business (Blumark SA) and since last August he had been tirelessly working on a branding issue that the school had developed long before involving him as an SEO. The Stamford American International School had lost a prominent spot in a Google Entity Carousel for one of their most important queries: “international school singapore”. I liked the case, and Jean-Francois liked the simplicity of WordLift, which could help him improve structured data while focusing on this particular issue. 

Here is what this post is all about – in the carousel we also see the refinement chips: International school and Singapore.

Here below is a quick glance at the keyword on Semrush; it was immediately clear to Jean-Francois and our team that losing the spot in that specific carousel had a clear business impact.

How do Google Carousels work? 

While their presence on the SERP is quite prominent, little is known about these list-like rich results. Luckily, Bill Slawski had written an article titled “Ranked Entities in Search Results at Google” full of insights that helped us understand how things might actually work. 

According to a patent granted in June 2020, we read in Bill’s article, the flow – when Google receives a query like “international schools singapore” – might be as follows: 

from the query to the carousel

Entities are extracted from documents that are relevant for the query and, most importantly, a category is also derived from those documents. 

The first important insight is that there is an interplay between the ranking pages (the blue links) and the entities in the carousel for that query. In our case both the school’s website and the related Wikipedia article were ranking high, but the entity was still missing from the carousel. 

Google Knowledge Graph Reconciliation is an SEO Swiss Army Knife 

In contexts like the one I’m describing here, where an entity originally categorized as “international school” ceased to exist, our role as SEOs and data publishers is to facilitate the challenging task of keeping information consistent and up to date in large-scale graphs like Google’s and Bing’s. The correctness of data is crucial for all the parties involved (from Google to the school) and is key to providing reliable information to consumers worldwide across different platforms (Google Search and Google Maps in this specific example). 

Here is what we did to investigate the problem

1. Checking the entity in GKG

We started by checking if the entity was present in the GKG using the GKG Search API (you can also use this Web App).

2. Getting the MREIDs for the entities involved 

As data gets extracted from multiple sources (unstructured text as well as schematized data) we might find multiple Machine-Readable Entity IDs (MREIDs) that refer to the same thing. In this case we found two MREIDs for the entity in the GKG: kg:/m/05t00r6 (from the Freebase era) and kg:/g/11fy24_ymv (a newly created entity).
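
Here is a sketch of how steps 1 and 2 can be automated with the GKG Search API (you need a Google API key with the Knowledge Graph Search API enabled):

```python
import requests

API_KEY = "YOUR_API_KEY"
ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"

# Step 1: search the Knowledge Graph by name.
resp = requests.get(ENDPOINT, params={
    "query": "Stamford American International School",
    "key": API_KEY,
    "limit": 5,
})
for element in resp.json().get("itemListElement", []):
    result = element["result"]
    print(result.get("@id"), "-", result.get("name"))

# Step 2: look up the two MREIDs found in this case directly.
resp = requests.get(ENDPOINT, params={
    "ids": ["/m/05t00r6", "/g/11fy24_ymv"],
    "key": API_KEY,
})
print(resp.json())
```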

3. Analyzing MREIDs 

By analyzing these MIDs we could find:

1. An old Knowledge Graph entry that was matched with an old Google My Business account that Google had associated with the entity (previously created from the school’s Wikipedia article) and that could no longer be displayed in the carousel since the address had changed (see “Moved to a new location” below). 

Here is how the SERP was when we tried to open kgmid=/m/05t00r6

2. A new Knowledge Graph entry, presumably related to the new Google My Business entity that the school had created when changing the address of the main campus.

Here is the information behind the new entity kgmid=/g/11fy24_ymv (with a KP still to be claimed)

The problem and the solution

A school is a LocalBusiness, and Google My Business is, for all local businesses, the entry point into Google’s Knowledge Graph. The previously existing entity had been reconciled with a GMB listing that was no longer active and that Google could not use in the carousel.

“For local businesses, there’s a separate process of claiming [entities] that operates through Google My Business.”

– Danny Sullivan Google Blog 

Entity linking and disambiguation 

We immediately began by helping Google connect:

  • the two entities in its graph, 
  • the article in Wikipedia (and the related entity in Wikidata)
  • the social media presence (Facebook in particular was used as a source for the logo of the entity) and 
  • the official website

We did this by publishing structured linked data (using WordLift) with sameAs links and by contributing the same data to Wikidata. It is important to remember that we optimized the homepage to improve its position on the target query “international school Singapore” before improving the structured data.
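
Here is a simplified sketch of the kind of sameAs block we published; the Wikidata and Facebook URLs are left as placeholders:

```python
import json

# Entity markup connecting the school's website to its Wikipedia
# article, Wikidata entity and social media presence via sameAs.
school = {
    "@context": "https://schema.org",
    "@type": "School",
    "name": "Stamford American International School",
    "url": "https://www.example-school.edu/",       # the official website
    "sameAs": [
        "https://en.wikipedia.org/wiki/Stamford_American_International_School",
        "https://www.wikidata.org/wiki/Q…",         # the matching Wikidata entity
        "https://www.facebook.com/…"                # the official Facebook page
    ]
}
print(json.dumps(school, indent=2))
```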

Unfortunately we didn’t manage to recover the old Google My Business account (access had been lost as different people got involved in its management). 

A few weeks later the entity was “magically” reintroduced in the carousel: it was a positive sign, but the battle wasn’t over yet. Unfortunately, when selecting the resuscitated entity from the carousel, the old GMB listing would appear with the wrong address 😨.

At this point we decided to contact Google My Business support in order to take the last step: replacing the old GMB entity with the new one while preserving the connection with the old entity (the one that was originally created from the Wikipedia article).

This was not a trivial task: GMB support does have the ability to merge duplicates but, in our case, we needed to merge the old GMB with the new one while preserving the link to the Wikipedia entity, as that entity was being used by Google Search in the carousel.

It was clear to us that search features like the ranked list of entities are beyond the control of the GMB team, so we emphasized the importance of improving data quality: they could not have a GMB entry for an institution without a link to its Wikipedia article when such an article exists. This was creating a messy experience for their users in the first place. 

The email we received from Google My Business Support

A few days later we got the response from GMB support that the mission was accomplished: the old entity had been merged with the new Google My Business entity.  

the final result

Lessons learned

  1. Google (via GMB support) can intervene on entities in the knowledge graph that are matched with entities created using Google My Business;
  2. The first-level support of GMB will need to escalate the request to a higher level; 
  3. The best way to get your request escalated is to make the first Googler understand the importance of data quality within your specific context; 
  4. Having a Wikipedia page might help you get into the right carousel, but it also makes it harder to manage your brand SERP (you have less control over it);
  5. Using structured linked data in combination with Wikipedia helps with entity disambiguation and interlinking;  
  6. The ten blue links are used, along with the entities extracted from them, to build carousels, so don’t forget to do on-page SEO for the target query.

Ready to earn your spot on Google’s Entity Carousel?

We have a VIP service that can resolve the most complex SEO issues. Contact our team of experts and learn how WordLift can help your rankings.

Credits 

As usual I am thankful to the wonderful community of partners, friends and users of our product and of our SEO management services. I am also very happy and thankful that – along the way – I could speculate and discuss the course of action with Jason Barnard, who has a terrific set of training courses on Brand SERP SEO. Last but not least, I would like to thank Bill Slawski who, once again, was able to shed light on this intricate matter and help us win this battle.

Schema Markup to Boost Local SEO

Is it really worth it? 

Let’s start with the end. In the experiment I am sharing today we measured the impact of a specific improvement to the structured data of a website that references 500+ Local Businesses (more specifically, the site promotes Lodging Businesses such as hotels and villas for rent). Before diving into the solution, let’s have a look at the results that we obtained using a Causal Impact analysis. If you are a marketing person or an SEO you constantly struggle to measure the impact of your actions in the most precise and irrefutable way; Causal Impact, a methodology originally introduced by Google, helps you with exactly this. It’s a statistical analysis that builds a Bayesian structural time-series model to isolate the impact of a single change made on a digital platform. 

Cumulative result achieved after the first week (click data exported from GSC).

In the week after improving the existing markup, we saw a positive increase of +5.09% in clicks coming from Google Search – this improvement is statistically significant: it is unlikely to be due to random fluctuations, and the probability of obtaining this effect by chance is very small 🔥🔥
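
For readers who want to try this at home, here is a minimal sketch of such an analysis using the pycausalimpact port; the file name, column and dates below are placeholders, and the actual study used its own GSC export:

```python
import pandas as pd
from causalimpact import CausalImpact  # pip install pycausalimpact

# Daily clicks exported from Google Search Console, indexed by date.
data = pd.read_csv("gsc_clicks.csv", index_col="date", parse_dates=True)

pre_period = ["2020-03-01", "2020-06-14"]   # before the markup change
post_period = ["2020-06-15", "2020-06-22"]  # the first week after

ci = CausalImpact(data[["clicks"]], pre_period, post_period)
print(ci.summary())  # reports the estimated lift and its significance
ci.plot()
```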

We made two major improvements to the markup of these local businesses: 

  1. Improving the quality of NAP (Name, Address and Phone number) data by reconciling the entities with entities in Google My Business (via the Google Maps APIs) and by making sure we had the same data Google has, or better;
  2. Adding, for all the reconciled entities, the hasMap property with a direct link to the Google CID number – an important identifier that business owners and webmasters should know, as it helps Google match entities found by crawling structured data with entities in GMB. 

Problem Statement

Google My Business is indeed the simplest and most effective way for a local business to enter the Google Knowledge Graph. If your site operates in the travel sector or provides users with immediate access to hundreds of local businesses, what should you do to market your pages using schema markup against fierce competition made up of the businesses themselves and large brands such as booking.com and tripadvisor.com?

How can you be more relevant for both travelers abroad searching for their dream holiday in another country and for locals trying to escape from large urban areas?

Approach

The approach, in most of our projects, is the same regardless of the vertical we work in: knowledge completion and entity reconciliation; these really are two essential building blocks of our SEO strategy. 

By providing more precise information in the form of structured linked data we are helping search engines find the searchers we’re looking for, at the best time in their customer journey. 

Another important aspect is that, while we’re keen on automating SEO (and data curation in general), we understand the importance of the continuous feedback loop between humans and machines: domain experts need to be able to validate the output and to correct any inaccurate predictions that the machine might produce. 

There is no way around it – tools like WordLift need to facilitate the process and scale it to the web, but they cannot replace human knowledge and human validation (not yet at least). 

Agentive SEO = Human-in-the-Loop 

The Solution

LocalBusiness markup works for different types of businesses, from a retail shop to a luxury hotel or a shopping center, and it comes with sub-types (here is the full list of the different variants from the schema.org website). 

All the sub-types, when it comes to SEO and Google in particular, should contain the following set of information: 

  1. Name, Address and Phone number (here consistency plays a big role: we want to ensure that the same entity on Yelp shows the same data on Apple Maps, Google, Bing and all the other directories that clients might use)
  2. A reference to the official website (this becomes particularly relevant if the publisher does not coincide with the business owner) 
  3. A reference to the Google My Business entity using the hasMap property (the 5% lift we have seen above is indeed related to this specific piece of information)
  4. Location data (and here, as you might imagine, we can do a lot more than just adding the address as a string of text)

The JSON-LD behind a Local Business 

Here is the gist.
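
The gist itself is not reproduced here; below is a minimal sketch of the kind of markup it contains, with placeholder values throughout:

```python
import json

# A LodgingBusiness with consistent NAP data and a hasMap link to the
# Google Maps listing via its CID (all values are placeholders).
lodging = {
    "@context": "https://schema.org",
    "@type": "LodgingBusiness",
    "name": "Villa Example",
    "url": "https://example.com/villa-example",
    "telephone": "+39 06 0000 0000",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "Via Example 1",
        "addressLocality": "Rome",
        "addressRegion": "Lazio",
        "addressCountry": "IT"
    },
    "hasMap": "https://maps.google.com/maps?cid=YOURCIDNUMBER"
}
print(json.dumps(lodging, indent=2))
```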

Google My Business reconciliation

In order to improve the markup and to add the hasMap property on hundreds of pages, we added a new functionality to WordLift’s WordPress plugin (which also works for non-WordPress websites) that helps editors: 

  • Trigger the reconciliation using Google Maps APIs
  • Review/Approve the suggestions 
  • Improve structured data markup for Local Business
Google My Business Reconciliation by WordLift

From the screen below the editor can either “Accept” or “Discard” the provided suggestions. 

WordLift reconciles an entity with a loose match on the name of the business, the address and/or the phone number. 

Improving the name of the local business by adding a new alias, adding the hasMap property and the international phone number
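
This is not WordLift’s actual implementation, but a rough sketch of what the reconciliation step could look like with the official googlemaps Python client; the business name and API key are placeholders:

```python
import googlemaps  # pip install googlemaps

gmaps = googlemaps.Client(key="YOUR_API_KEY")

def suggest_reconciliation(business_name):
    """Return candidate GMB/Maps data (name, address, phone, map URL)
    for an editor to accept or discard."""
    found = gmaps.find_place(
        input=business_name,
        input_type="textquery",
        fields=["place_id", "name", "formatted_address"],
    )
    candidates = found.get("candidates", [])
    if not candidates:
        return None
    # Place Details returns the phone number and the Google Maps URL.
    return gmaps.place(
        candidates[0]["place_id"],
        fields=["name", "formatted_address", "international_phone_number", "url"],
    )["result"]

print(suggest_reconciliation("Villa Example, Rome"))
```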

Adding location markup using containedInPlace/containsPlace and linked data

As seen in the JSON-LD above, we had added – in a previous iteration (and independently of the testing that was done this time) – two important properties:

  1. containedInPlace and 
  2. the inverse-property containsPlace (on the pages related to villages and regions) to help search engines clearly understand the location of the local businesses. 

This data is also very helpful for composing breadcrumbs, as it helps the searcher understand and confirm the location of a business. Most of us still make searches like “WordLift, Rome” to find a local business, and we are more likely to click on results where we can confirm that – yes, the WordLift office is indeed located in Italy > Lazio > Rome.

The administrative divisions in GeoNames for the rione Regola in Rome where our office is located

To extract this information, along with the sameAs links to Wikidata and GeoNames (one of the largest geographical databases, with more than 11 million locations), we used our linked data stack and an extension called WordLift Geo to automatically populate the knowledge graph and the JSON-LD with the containedInPlace and containsPlace properties. 
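
Here is a sketch of the two inverse properties at work; the villa is a placeholder, while Q220 and 3169070 are the Wikidata and GeoNames identifiers for Rome:

```python
import json

# The lodging page declares where it is contained...
villa = {
    "@context": "https://schema.org",
    "@type": "LodgingBusiness",
    "name": "Villa Example",
    "containedInPlace": {
        "@type": "City",
        "name": "Rome",
        "sameAs": [
            "https://www.wikidata.org/wiki/Q220",
            "https://sws.geonames.org/3169070/"
        ]
    }
}

# ...while the page of the place declares what it contains.
city = {
    "@context": "https://schema.org",
    "@type": "City",
    "name": "Rome",
    "containsPlace": {"@type": "LodgingBusiness", "name": "Villa Example"}
}

print(json.dumps([villa, city], indent=2))
```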

Are you dealing with geographical data on your website? Want to learn more about WordLift GEO and local SEO? Contact us.

Conclusions

  • We have seen a +5.09% increase in clicks (after only one week) on pages where we added the hasMap property and improved the consistency of NAP (business name, address and phone number) on a travel website listing 500+ local businesses
  • We did this by interfacing with the Google Maps Places APIs and by providing suggestions for the editor to validate or reject
  • Using containedInPlace/containsPlace is also a good way to improve the structured data of a local business, and you should do this by also adding sameAs links to Wikidata and/or GeoNames to facilitate disambiguation
    • As most of the searches for local businesses (at least in travel) are in the form of “[business name][location where the business is located]”, we have seen in the past an increase in CTR when the Breadcrumb schema uses this information from containedInPlace/containsPlace (see below 👇)
Breadcrumbs using the administrative divisions from GeoNames

FAQs on LocalBusiness markup

One key aspect of SEO, if you are a local business (or deal with local businesses), is to have the correct location listed in Google Maps and to link your website with Google My Business. The best way to do that is to properly mark up your Google Maps URL using schema markup. 

What is the hasMap property and how should we use it?
In 2014 (schema.org v1.7) the hasMap property was introduced to link the web page of a place with the URL of a map. In order to facilitate the link between a web page and the corresponding entity on Google Maps we can use the following snippet in the JSON-LD: “hasMap”: “https://maps.google.com/maps?cid=YOURCIDNUMBER”

What is the Google CID number? 
The Google CID (customer ID) is a unique number that identifies a business listing in Google Maps. This number can be used to link a website with the corresponding entity in Google My Business.

How can I find the Google CID number using Google Maps?
  1. Search for the business in Google Maps using the business name.
  2. View the source code (use view-source: followed by the URL in your browser).
  3. Press CTRL+F and search the source code for “ludocid”.
  4. The CID will be the string of numbers after “ludocid\\u003d” and before #lrd.

You can alternatively use this Chrome extension.
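
If you need to do this at scale, the manual steps above can be sketched in a few lines of Python. Google’s HTML changes often and may differ from what a browser renders, so treat this as illustrative only:

```python
import re
import requests

# Fetch the Maps result page for a business and look for the CID that
# follows "ludocid" in the page source.
html = requests.get(
    "https://www.google.com/maps/search/WordLift,+Rome",
    headers={"User-Agent": "Mozilla/5.0"},
).text

match = re.search(r'ludocid\\u003d(\d+)', html) or re.search(r'ludocid%3D(\d+)', html)
if match:
    print("CID:", match.group(1))
else:
    print("CID not found - inspect the page source manually.")
```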

SERP Analysis with the help of AI

SERP analysis is an essential step in the process of optimizing content to outrank the competition on Google. In this blog post I will share a new way to run SERP analysis using machine learning and a simple Python program that you can run on Google Colab. 

Jump directly to the code: Google SERP Analysis using Natural Language Processing

SERP (Search Engine Results Page) analysis is part of keyword research and helps you understand whether the query that you identified is relevant to your business goals. More importantly, by analyzing how results are organized we can understand how Google is interpreting a specific query. 

What is the intention of the user making that search?

What search intent is Google associating with that particular query?

The investigative work required to analyze the top results provides an answer to these questions and guides us in improving (or creating) the content that best fits the searcher. 

While there is an abundance of keyword research tools that provide SERP analysis functionality, my particular interest lies in understanding the semantic data layer that Google uses to rank results and what can be inferred, using natural language understanding, from the corpus of results behind a query. This might also shed some light on how Google does fact extraction and verification for its own knowledge graph, starting from the content we write on webpages. 

Falling down the rabbit hole

It all started when Jason Barnard and I began to chat about E-A-T and what techniques marketers could use to “read and visualize” Brand SERPs. Jason is a brilliant mind with a profound understanding of Google’s algorithms; he has been studying, tracking and analyzing Brand SERPs since 2013. While Brand SERPs are a category of their own, the process of interpreting search results remains the same whether you are comparing the personal brands of “Andrea Volpini” and “Jason Barnard” or analyzing the different shades of meaning between “making homemade pizza” and “make pizza at home”. 

Hands-on with SERP analysis

In this pytude (a simple Python program), as Peter Norvig would call it, the plan goes as follows:

  • we will crawl Google’s top (10-15-20) results and extract the text behind each webpage,
  • we will look at the terms and the concepts of the corpus of text resulting from the download, parsing, and scraping of the web page data (main body text) of all the results together, 
  • we will then compare two queries – “Jason Barnard” and “Andrea Volpini” in our example – and visualize the most frequent terms for each query within the same semantic space, 
  • after that we will focus on “Jason Barnard” in order to understand the terms that make the top 3 results unique from all the other results, 
  • then, using a sequence-to-sequence model, we will summarize all the top results for Jason in a featured-snippet-like text (this is indeed impressive),
  • at last we will build a question-answering model on top of the corpus of text related to “Jason Barnard” to see what facts we can extract from these pages that can extend or validate information in Google’s knowledge graph.

Text mining Google’s SERP

Our text data (the Web corpus) is the result of two queries made on Google.com (you can change this parameter in the Notebook) and of the extraction of all the text behind the resulting webpages. Depending on the website, we may or may not be able to collect the text. The two queries I worked with are “Jason Barnard” and “Andrea Volpini”, but you can of course query whatever you like.
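
The notebook has its own implementation; as an illustration, here is one way to sketch the extraction step with the trafilatura library, given the result URLs collected for a query (URLs are placeholders):

```python
import trafilatura  # pip install trafilatura

result_urls = [
    "https://example.com/result-1",
    "https://example.com/result-2",
]

corpus = []
for url in result_urls:
    downloaded = trafilatura.fetch_url(url)       # None if the site blocks us
    text = trafilatura.extract(downloaded) if downloaded else None
    if text:
        corpus.append({"url": url, "text": text})

print(len(corpus), "documents collected")
```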

One of the most crucial tasks in text mining, once the Web corpus has been created, is to present the data visually. Using natural language processing (NLP) we can explore these SERPs from different angles and levels of detail. Using Scattertext we’re immediately able to see what terms (from the combination of the two queries) differentiate our corpus from a general English corpus – in other words, the most characteristic keywords of the corpus. 

The most characteristic terms in the corpus.

Here, besides the names (volpini, jasonbarnard, cyberandy), you can see other relevant terms that characterize both Jason and myself. Boowa, a blue dog, and Kwala, a yellow koala, will guide us throughout this investigation, so let me first introduce them: they are two cartoon characters that Jason and his wife created back in the nineties. They are still prominent as they appear in Jason’s Wikipedia article as part of his career as a cartoon maker.

Boowa and Kwala

Visualizing term associations in two Brand SERPs

In the scatter plot below we have on the y-axis the category “Jason Barnard” (our first query), and on the x-axis the category “Andrea Volpini”. In the top right corner of the chart we can see the most frequent terms on both SERPs – the semantic junctions between Jason and myself according to Google.

Not surprisingly, there you will find terms like: Google, Knowledge, Twitter and SEO. On the top left side we can spot Boowa and Kwala for Jason, and in the bottom right corner AI, WordLift and knowledge graph for myself.  

To extract the entities we use spaCy, together with Scattertext, an extraordinary library by Jason Kessler.

Visualizing the terms related to “Jason Barnard” (y-axis) and “Andrea Volpini” (x-axis). The visualization is interactive and allows us to zoom in on a specific term like “seo”. Try it.
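
A sketch of how such a plot can be produced, assuming a DataFrame with one row per scraped result and columns “query” and “text” (the df variable is hypothetical):

```python
import spacy
import scattertext as st

nlp = spacy.load("en_core_web_sm")

# df is a hypothetical DataFrame with columns "query" and "text".
st_corpus = st.CorpusFromPandas(
    df, category_col="query", text_col="text", nlp=nlp
).build()

html = st.produce_scattertext_explorer(
    st_corpus,
    category="Jason Barnard",          # y-axis category
    category_name="Jason Barnard",
    not_category_name="Andrea Volpini",
    minimum_term_frequency=5,
)
open("serp_scattertext.html", "w").write(html)
```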

Comparing the terms that make the top 3 results unique

When analyzing the SERP, our goal is to understand how Google is interpreting the intent of the user and what terms Google considers relevant for that query. To do so, in the experiment, we split the corpus of the results related to Jason between the content that ranks in positions 1, 2 and 3 and everything else.

On top, the terms extracted from the top 3 results; below, everything else. Open the chart in a separate tab from here.

Summarizing Google’s Search Results

When creating well-optimized content, professional SEOs analyze the top results in order to understand the search intent and to get an overview of the competition. As Gianluca Fiorelli, whom I personally admire a lot, would say: it is vital to look at it directly.

Since we now have the web corpus of all the results, I decided to let the AI do the hard work of “reading” all the content related to Jason and creating an easy-to-read summary. I’ve experimented quite a lot lately with both extractive and abstractive summarization techniques, and I found that, when dealing with a heterogeneous multi-genre corpus like the one we get from scraping web results, BART (a sequence-to-sequence text model) does an excellent job of understanding the text and generating abstractive summaries (for English).
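
A sketch of the summarization step with the Hugging Face transformers pipeline; bart-large-cnn is a common checkpoint choice, and the notebook may use a different one. The corpus list comes from the earlier extraction sketch:

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# BART has a limited input window, so we truncate the combined text.
combined = " ".join(doc["text"] for doc in corpus)[:4000]
summary = summarizer(combined, max_length=120, min_length=40, do_sample=False)
print(summary[0]["summary_text"])
```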

Let’s see it in action on Jason’s results. Here is where the fun begins. Since I was working with Jason Barnard, a.k.a. the Brand SERP Guy, Jason was able to update his own Brand SERP as if Google were his own CMS 😜 and we could immediately see from the summary how these changes were impacting what Google was indexing.

Here below is the transition from Jason the marketer, musician and cartoon maker to Jason the full-time digital marketer.

Can we reverse-engineer Google’s answer box?

As Jason and I were progressing with the experiment, I also decided to see how close a Question Answering System running pre-trained BERT models could get to Google’s answer box for the Jason-related question below.

Quite impressively – as the web corpus was indeed the same that Google uses – I could get exactly the same result.

A fine-tuning task on SQuAD for the corpus of results for “Jason Barnard”

This is interesting as it tells us that we can use question-answering systems to validate if the content that we’re producing responds to the question that we’re targeting.
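
A sketch of such a question-answering check over the same corpus; the checkpoint name is an assumption, and any BERT model fine-tuned on SQuAD will do:

```python
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/bert-base-cased-squad2")

context = " ".join(doc["text"] for doc in corpus)[:4000]
answer = qa(question="Who is Jason Barnard?", context=context)
print(answer["answer"], f"(score: {answer['score']:.2f})")
```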

Ready to transform your marketing strategy with AI? Let's talk!

Lessons we learned

We can produce semantically organized knowledge from raw unstructured content much like a modern search engine would do. By reverse engineering the semantic extraction layer using NER from Google’s top results we can “see” the unique terms that make web documents stand out on a given query.

We can also analyze the evolution over time and space (the same query in a different region can have a different set of results) of these terms.

While keyword research tools always show us a ‘static’ representation of the SERP, by running our own analysis pipeline we realize that these results are constantly changing as new content surfaces in the index and as Google’s neural mind improves its understanding of the world and of the person making the query.

By comparing different queries we can find commonalities and unique traits that can help us inform the content strategy (and the content model behind the strategy). 

Are you ready to run your first SERP Analysis using Natural Language Processing?

Get in contact with our SEO management service team now!

Credits

All of this wouldn’t have happened without Jason’s challenge of “visualizing” E-A-T and Brand SERPs, and this work is dedicated to him and to the wonderful community of marketers, SEOs, clients and partners that are supporting WordLift. A big thank you also goes to the open-source technologies used in this experiment.

The Ultimate Checklist to Rank in the Google Top Stories Carousel

Google Top Stories is a powerful way to boost the mobile SEO and CTR of news content. In this article, we describe a real-world implementation, what it takes to be picked up by Google, and how to measure the traffic impact.

When Google first introduced the top stories carousel it had an immediate impact on the news and media industry, which started to embrace support for AMP pages. Top Stories is a modern, ultra-performing card-style design that presents searchers with featured news stories in the Google SERP.

Top Stories Carousel in Google Search

Getting featured is far from a straightforward process, as there are several requirements to fulfil, and they touch different aspects of modern SEO: from AMP support to Google News support (not required, but highly recommended), from structured data to content editing, image preparation and page speed optimisation.

We take on a small handful of client projects each year to help them boost their qualified traffic via our SEO Management Service.

Do you want to be part of it?

Yes, send me a quote!

Let’s dive in and start from the very basics by analyzing what we have done to bring this ultra-performing search feature to one of our SEO managed service clients. Before doing that, as usual, I’d like to show you the results of this work.

The effect of the top stories as seen from the Google Search Console.

The top stories news carousel is an ultra-performing SERP feature that strictly depends on your organic rankings.

Here is the checklist you need to follow to grab this mobile SEO opportunity.

1. Enable AMP

The top stories carousel is presented in the Google Developers Guide as a Search Feature that requires the implementation of AMP. So you need to support AMP on your website, either in native or paired mode. Unless you are starting a new project from scratch, you are going to use AMP in paired mode. This basically means that you are reusing the active theme’s templates to display AMP responses. With this configuration, AMP uses separate URLs, while the canonical URLs for your site will not have AMP. You can use the AMP Test Tool to make sure that your pages comply with Google Search requirements for AMP.

1a. Comply with AMP logo guidelines

You need to make sure that the logo used to represent the publisher in the structured data of your AMP pages fits in a 600x60px rectangle, and is either exactly 60px high (preferred) or exactly 600px wide. A 450x45px logo would not be acceptable, even though it fits within the 600x60px rectangle.

Remember also, when you have a logo with a solid background, to include at least 6px of padding around it. Wanna see an example? Here is WordLift’s publisher logo.

2. Use structured data to markup your articles

Google describes the news carousel as “a container that includes articles, live blogs, and videos” and what helps Google understand the content on the page is the required structured data. So the second step is to make sure that you are supporting one of the following schema types:

Article

You may also want to use one of the structured data testing tools to validate your markup.

2a. When in paired mode, make sure to have the same structured data on both canonical and AMP pages

Depending on how you generate your AMP pages you might end up, as happened to several of our clients, with different structured data markup on your canonical and AMP pages. This should be prevented: it is inconsistent and can prevent Google from showing your articles in the top stories carousel (we learned the lesson the hard way). The indication to use the same markup is provided in the Google AMP guide.

WordLift is fully compatible with the AMP Plugin (developed by Google, Automattic, and XWP) and AMP pages can inherit the schema.org markup of the canonical page and share the same JSON-LD. Read all about how to add structured data markup to AMP here.

3. Use multiple large images in your markup

Google, in the article schema guide for AMP articles, requires you to provide, in the structured data markup, images that are at least 1,200 pixels wide and contain at least 800,000 pixels in total. That is not all – the guide also specifies that for best results publishers should provide multiple high-resolution images with the following aspect ratios: 16x9, 4x3, and 1x1.

image requirements for AMP

This was first spotted by Aaron Bradley and it is indeed extremely important!
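
Putting sections 2 and 3 together, here is a hedged sketch of article markup providing the three recommended aspect ratios at high resolution; all URLs and values are placeholders, and 1920x1080, 1600x1200 and 1200x1200 all satisfy the 1,200px-width and 800,000-pixel guidance:

```python
import json

news_article = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example headline under 110 characters",
    "datePublished": "2019-09-13T08:00:00+00:00",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "publisher": {
        "@type": "Organization",
        "name": "Example News",
        "logo": {
            "@type": "ImageObject",
            "url": "https://example.com/logo-600x60.png",
            "width": 600,
            "height": 60
        }
    },
    "image": [
        "https://example.com/photo-16x9.jpg",  # e.g. 1920x1080
        "https://example.com/photo-4x3.jpg",   # e.g. 1600x1200
        "https://example.com/photo-1x1.jpg"    # e.g. 1200x1200
    ]
}
print(json.dumps(news_article, indent=2))
```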

3a. Keep the main content always at the center of your images

You can use a ‘universal image format’ that will work across multiple devices and in all social contexts with automatic cropping (square, 16:9, 4:3 or 16:10). To do so, it’s essential to always keep the core of your image at its very center. See how Jason Barnard does it for his podcast 👇

How to design your images to work with automatic cropping.

3b. Be specific when describing your images

Alt texts are important and should be as specific as possible in order to describe images to visitors who are unable to see them.

This is an essential aspect of accessible web design and it is also strategic for image SEO. Google strives to index and serve high-quality, accessible content to its users, and we should do our best to support this process.

We heard of a case in the US where a website did not appear in Top Stories until they improved the alt text on the featured images (the main image of the article). They were (as a lot of publishers do) re-using the title of the page as the alt text.

While this might work in some rare cases, it is not considered an accessible practice and should be avoided.

4. Remember that being part of Google News is not required but…it helps a lot!

Google can feature any article matching the above criteria in the top stories carousel, based on its organic algorithmic selection, but…the reality is slightly different. Let’s see why:

  1. The Top Stories Carousel is indeed a SERP feature that evolved from the Google News box and serves the same goal,
  2. While the main difference of the top stories carousel is that content is NOT restricted to Google News-approved outlets, in reality – as a result of the “fake news” scandal that exploded in November 2016 – less-than-reliable sources (and smaller sites that are not in Google News) have been removed from the top stories carousel (NewsDashboard published data showing that more than 99% of desktop news box results and 97% of mobile news box results come from Google News sites).

So unless you have the authority of Reddit, Yoast and the like, you have much better chances of landing in the news box if you are Google News approved. If you want to dig deeper into the relationship between Top Stories and Google News, follow this thread on the Google News Help Forum.

4a. Follow the editorial guidelines of Google News

Google provides news publishers with a set of content policies to ensure a positive experience for readers. It is not only about being newsworthy and keeping on writing fresh new content; it is also about limiting advertising and avoiding undisclosed sponsored content, malicious links or anything that can be considered hateful, offensive or dangerous.

Here you can find all the editorial criteria to follow, along with the recommendations that Danny Sullivan from Google provided in his post titled “Ways to Succeed in Google News”.

4b. Avoid article content errors

In order to be featured in Google News there are a few technical aspects to consider:

  1. Prevent article fragmentation. If you have isolated sentences that are not grouped together into paragraphs you might get an error and your article will be rejected from Google News.
  2. Write articles that are not too short and not too long. This basically means writing more than 80 words while preventing your pages from being too long to read. We usually see that 600-800 words is a good match for a Google News article.
  3. Make sure to write headlines of a maximum of 110 characters.

Review all the article content errors that you need to avoid to be eligible for Google News.

4c. Focus on original reporting 

Google has updated its news algorithm to focus more on originality, as reported on September 13th, 2019 by the New York Times: Google Says a Change in Its Algorithm Will Highlight ‘Original Reporting’. This has also been confirmed several times on the Google News Publisher Help Community.

“I would recommend doing more work, or more obvious work, on original reporting – fresh, new, original facts and information that isn’t published elsewhere.” Chris Andrews (Platinum Product Expert) answering a question related to Top Stories.  

5. Speed, Speed and again Speed

News readers want to be able to find fresh updates as fast as possible – and, especially on mobile, people care a lot about the speed of a page. Top stories is a mobile SERP feature that is purely organic-driven: if you get into the top 5 results on Google you can earn it, and it will be an extra boost for your traffic; if you are not top-ranking you will not get your spot in the news carousel (in most cases at least). Starting in July 2018, page speed became a ranking factor for all mobile searches, and this means that your website needs to be blazing fast.

How to track when you have been featured in the Top Stories

Tracking the traffic generated by Top Stories is not immediate and can only be done by looking at specific queries in the Google Search Console, using third-party tools like Semrush or RankRanger, or looking for specific patterns in Google Analytics.

The simplest way I found is to start from the Google Search Console by filtering results for Rich Results and AMP Articles.

Google Search Console configuration

When you see a spike, you can look up the related keyword from a mobile device and hopefully find the matching article. Remember also that a given result might only occur in a specific country. This article, for example, was only visible on Google in the US, so we could only detect it by changing the territory in the Google Search preferences and using incognito mode.

From Google Analytics we are also able to spot a top story by looking for a peak like the one below. As you can see, that traffic, in most cases, is only there for 48-72 hours at most.

Google Analytics for the article that entered the carousel.

Given the relationship between Google News and Top Stories, you might want to analyze these patterns by filtering top articles in Google News. This can easily be done in Google Analytics, knowing that incoming readers with referrers of ‘news.google.com’ or ‘news.url.google.com’ come from Google News.

Once again, there are plenty of SERP feature optimization opportunities that we can leverage by combining structured data with more traditional SEO factors, and they make an enormous difference for your audience reach.

Article updated September 13th, 2019 following the change in the Google News Algorithm.
