Cameos on Google in a Nutshell

What is Cameos on Google?

Cameos on Google is a new, invite-only Google Search experiment that lets people who already appear in Google’s Knowledge Graph record videos answering simple questions about their work and life experience.

Video answers appear in Google Search right below the knowledge panel; the same answers are also presented on Google Discover and can pop up on Google Assistant when a user asks something about that person.

Cameos on Google lets you be the authority on you. This is how Google explains it.

Here is how Google Cameos works

Quite simply, here is how it worked for me:

  1. I received an invitation by email
  2. I downloaded the Google Cameos app on my phone (you can find it for both Android and iOS)
  3. Upon starting (and here is where the fun begins), the app generates questions by looking at the information Google has in its Knowledge Graph; these questions are divided into two categories:
    For the fans (things that are more closely related to the information Google has about you)
    Trending topics (the most frequent questions on topics that relate to you)
  4. Simply by choosing a question you can start recording and, if you like the preview, you can send the video
  5. The video gets published in an hour or so on Google Search below the knowledge panel
  6. The app gives you a quick overview in terms of “Total Impressions” and “Watches”
Cameos on Google

What do you need to use Google Cameos?

Right now the experiment is limited and you will need an invite to participate but here is what I did before getting the invitation.

  1. Get your name/entity in Google’s Knowledge Graph (not trivial but these days not so difficult either)
  2. Get your entity verified: if your name appears with a Knowledge Panel you can start the verification process from there; alternatively you can start from Posts on Google
  3. Make sure the content about you is always fresh and up to date (you can suggest edits on the information available on the Knowledge Panel)
From Cameos to Google Discover

Is Google trying again to become a Social Network?

Well, yes, in a way: the medium is similar, and the user is enticed to invest in his or her own personal branding and to engage with an audience on Google’s channels. While we only see it happening for people in the Knowledge Graph, it is easy to expect that everyone is already in Google’s Knowledge Graph one way or another. Try asking Google to call a friend from Twitter and you will find yourself in the awkward position of accessing the phone number of a person who, yes, you know, but who is not in your phone’s contact list, all of this displayed on a nice-looking material card containing a photo of that person taken from… well, the web.

The 3 things I learnt from the Google Cameos experiment

  1. The SERP is getting richer and richer to let people interact with each other in all sorts of ways;
  2. Google’s interaction and activation channels are built on top of its ever-growing Knowledge Graph; the more data you provide, the easier it gets for Google to let you connect with your audience. This is valid for a small business, an individual and a brand. In this particular case the most exciting piece of technology is the machinery used to generate the questions by looking at the data in the Knowledge Graph. Let me give you an example: since Google knows that I have co-founded several companies and that I currently hold a CEO position, my questions all gravitate around being a CEO, starting up a company and acting as a founder. Generating novel questions from a knowledge graph is one of those tech challenges the ML/DL community is very excited about; for a marketer this means that the more I expand the data related to my entities, the more occasions I get to interact with my audience;
  3. Google is heavily investing in its own walled garden by providing an AI-driven communication platform where everyone can also buy ads. It is interesting to see how these experiments on the “organic” front tend to have their own “paid” counterpart (think for instance of the lead generation that has recently been introduced on Google Ads). This means that a lot is changing in the way organic search works, and the stronger your brand is, the more chances you have of capturing users’ attention.

Now it’s really time to get famous and start playing with Cameos!

How to write meta descriptions using BERT

If you are confused about meta descriptions in SEO, why they are important and how to nail them with the help of artificial intelligence, this article is for you. If you are eager to start experimenting with an AI writer, read the full article. At the end, I will give you a script to help you write meta descriptions at scale using BERT: Google’s pre-trained, unsupervised language model, which has recently gained great momentum in the SEO community after both Google and Bing announced that they use it to provide more useful results.

I used to underestimate the importance of meta descriptions myself: after all, Google will use them in only 35.9% of cases (according to a Moz analysis from last year by the illustrious @dr_pete). In reality, these brief snippets of text greatly help to entice more users to your website and, indirectly, might even influence your ranking thanks to a higher click-through rate (CTR). While Google can overrule the meta descriptions added in the HTML of your pages, there are many possibilities to improve the CTR on Google’s result pages if you properly align:
  1. the main intent of the user (the query you are targeting),
  2. the title of the page and
  3. the meta description.
In the course of this article we will investigate the following aspects and, since it’s a long article, feel free to jump to the section that interests you the most; the code is available at the end.

What are meta descriptions?

As usual, I like to “ask” the “experts” online for a definition to get started and, with a simple query on Google, we get this one from our friends at WooRank:

“Meta descriptions are HTML tags that appear in the head section of a web page. The content within the tag provides a description of what the page and its content are about. In the context of SEO, meta descriptions should be around 160 characters long.”

Here’s an example of what a meta description usually looks like (from that same article):

meta description example

How long should your meta description be?

We want to be, as with any other content on our site, authentic, conversational and user-friendly. Having said that, in 2020, you will want to stick to the 155-160 character limit (this corresponds to 920 pixels). We also want to keep in mind that the “optimal” length might change based on the query of the user. This means that you should really do your best in the first 120 characters and think in terms of creating a meaningful chain that links the query, the title tag and the meta description. In some cases, it is also very important to consider the role of the breadcrumbs within this chain. In the WooRank example above, I can quickly see that the definition comes from an educational page of their site: this fits my information request very well.
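The length guidance above can be turned into a quick check. The following is a minimal sketch, assuming the 160-character ceiling and the 120-character window mentioned above; the function name and the returned fields are hypothetical and not part of any SEO tool.

```python
def check_meta_description(text, max_chars=160, key_window=120):
    """Report whether a meta description fits the length guidance above."""
    cleaned = " ".join(text.split())  # collapse whitespace before counting
    return {
        "length": len(cleaned),
        "fits": len(cleaned) <= max_chars,
        # The opening part, where the most important message should live.
        "first_part": cleaned[:key_window],
        # How many characters would have to be cut to fit the limit.
        "overflow": max(0, len(cleaned) - max_chars),
    }
```

Counting characters is only a proxy: the pixel width Google actually uses depends on the letters in the text, so treat the result as a first filter rather than a guarantee.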

What meta descriptions should we focus on?

SEO is a process: we need to set our goals, analyze the data we’re starting with, improve our content, and measure the results. There is no point in looking at a large website and saying “I need to write a gazillion meta descriptions since they are all missing”. It would simply be a waste of time. Besides, in some cases we might decide not to add a meta description at all. For example, when a page covers different queries and the text is already well structured, we might leave it to Google to craft the best snippet for each query (they are super good at it). We need to look at the critical pages we have. Let’s not forget that writing a good meta description is just like writing ad copy: driving clicks is not a trivial game. As a rule of thumb I prefer to focus my attention on:
  • Pages that are already ranking on Google (position > 0); adding a meta description to a page that is not ranking will not make a difference.
  • Pages that are not in the top 3 positions: if they are already highly ranked, unless I can see some real opportunities, I prefer to leave them as they are.
  • Pages that have a business value: on the WordLift website (the company I work for), there is no point in adding meta descriptions to landing pages that have no organic potential. I would rather focus on content from our blog. This varies of course, but it is very important to understand what type of pages you want to focus on.
These criteria can be useful, especially if you plan to programmatically crawl your website and choose where to focus your attention using crawl data. Keep on reading and we’ll get there, I promise.
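The three criteria above translate into a simple filter over crawl data. This is a minimal sketch in plain Python (the linked script uses Pandas); the column names `url` and `position`, the sample rows and the example.com URLs are assumptions made for illustration, not the format of any specific crawler export.

```python
import csv
import io

def pages_to_focus_on(rows, url_prefix, top_positions=3):
    """Keep pages that rank, are not in the top positions, and sit in the chosen section."""
    selected = []
    for row in rows:
        position = float(row["position"] or 0)
        if position <= 0:               # not ranking at all
            continue
        if position <= top_positions:   # already in the top 3: leave as is
            continue
        if not row["url"].startswith(url_prefix):  # outside the section with business value
            continue
        selected.append(row["url"])
    return selected

# Hypothetical crawl export used only to show the filter at work.
sample = io.StringIO(
    "url,position\n"
    "https://example.com/blog/a,7.2\n"
    "https://example.com/blog/b,1.5\n"
    "https://example.com/blog/c,0\n"
    "https://example.com/landing/d,12\n"
)
rows = list(csv.DictReader(sample))
focus = pages_to_focus_on(rows, "https://example.com/blog/")
```

Here the URL prefix stands in for "business value"; in practice you would replace it with whatever rule identifies the pages that matter on your site.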

A quick introduction to single-document text summarization

Automatic text summarization is a challenging NLP task: providing a short and possibly accurate summary of a long text. With the growing amount of online content, the need for understanding and summarizing content is very high. In purely technological terms, the challenge of creating well-formed summaries is huge and results are, most of the time, still far from perfect (or human-level). The first research work on automatic text summarization goes back 50 years, and various techniques have been used since then to extract relevant content from unstructured text.

“The different dimensions of text summarization can be generally categorized based on its input type (single or multi document), purpose (generic, domain specific, or query-based) and output type (extractive or abstractive).” A Review on Automatic Text Summarization Approaches, 2016.

Extractive vs Abstractive

Let’s have a quick look at the different methods we have for compressing a web page.

Extractive and Abstractive Summarization

“Extractive summarization methods work by identifying important sections of the text and generating them verbatim; […] abstractive summarization methods aim at producing important material in a new way. In other words, they interpret and examine the text using advanced natural language techniques in order to generate a new shorter text that conveys the most critical information from the original text” Text Summarization Techniques: A Brief Survey, 2017.

In simple words, with extractive summarization we use an algorithm to select and combine the most relevant sentences in a document. With abstractive summarization methods, we use sophisticated NLP techniques (e.g. deep neural networks) to read and understand a document in order to generate novel sentences.

In extractive methods a document can be seen as a graph where each sentence is a node and the relationships between these sentences are weighted edges. These edges can be computed by analyzing the similarity between the word sets of each pair of sentences. We can then use an algorithm like PageRank (we will call it TextRank in this context) to extract the most central sentences in our document graph.

Text Rank algorithm
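To make the sentence-graph idea concrete, here is a small, dependency-free sketch. It is an illustration under stated assumptions and not the code linked to this article: sentences are split with a naive regular expression, edge weights use Jaccard overlap of word sets (a simplification swapped in for the similarity measure of the original TextRank paper), and the ranking is a plain PageRank-style iteration. All function names are hypothetical.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def similarity(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def text_rank(sentences, damping=0.85, iterations=50):
    """Score each sentence with a PageRank-style iteration over the similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour passes on a share of its score proportional to the edge weight.
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores

def summarize(text, max_sentences=2):
    """Return the most central sentences, kept in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    scores = text_rank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return " ".join(sentences[i] for i in sorted(top))
```

A sentence that shares no words with the rest of the document gets no incoming weight and therefore falls to the bottom of the ranking, which is exactly the behaviour we want from an extractive summary.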

The carbon footprint of NLP and why I prefer extractive methods to create meta descriptions

In a recent study, researchers at the University of Massachusetts, Amherst, performed a life cycle assessment for training several common large AI models, with a focus on language models and NLP tasks. They found that training a complex language model can emit five times the lifetime emissions of the average American car (including whatever is required to manufacture the car itself!).

While automation is key, we don’t want to contribute to the pollution of our planet by misusing the technology we have. In principle, using abstractive methods and deep learning techniques offers a higher degree of control when compressing articles into 30-60 word paragraphs but, considering our end goal (enticing more clicks from organic search), we can probably find a good compromise without spending too many computational (and environmental) resources. I know it sounds a bit naïve but… it is not: we want to be sustainable and efficient in everything we do.

What is BERT?

BERT: The Mighty Transformer 

Now, given that a significant amount of energy has already been spent to train BERT (1,507 kWh according to the paper mentioned above), I decided it was worth testing it for extractive summarization.

Bert from Sesame Street

I also have to admit that I have been entertaining myself with automatic text summarization of online content for quite some time, and I experimented with a lot of different methods before getting into BERT.

BERT is a pre-trained unsupervised natural language processing model created by Google and released as an open source program (yay!) that does magic on 11 of the most common NLP tasks. BERTSUM is a variant of BERT designed for extractive summarization that is now state-of-the-art (here you can find the paper behind it).

Derek Miller, building on this progress, has done terrific work in bringing this technology to the masses (myself included) by creating a super sleek and easy-to-use Python library that we can use to experiment with BERT-powered extractive text summarization at scale. A big thank you also goes to the HuggingFace team, since Derek’s tool uses their PyTorch transformers library.

Long live AI, let’s scale the generation of meta descriptions with our adorable robot [CODE IS HERE]

So here is how everything works in the code linked to this article.

Infograph of our AI
  1. We start with a CSV that I generated using WooRank’s crawler (here you can tweak the code and use any CSV that helps you detect where on the site meta descriptions are missing and where it can be useful to add them); the file provided in the code has been made available on Google Drive (this way we can always look at the data before running the script).
  2. We analyze the data from the crawler and build a dataframe using Pandas.
  3. We then choose which URLs are more critical: in the code provided I basically work on the analysis of the wordlift.io website and focus only on content from the English blog that already has a ranking position. Feel free to play with the Pandas filters and to infuse your own SEO knowledge and experience into the script.
  4. We then crawl each page (and here you might want to define the CSS class that the site uses in the HTML to detect the body of the article, so that you avoid analyzing menus and other unnecessary elements in the page).
  5. We ask BERT (with a vanilla configuration that you can fine-tune) to generate a summary for each page and write it to a CSV file.
  6. With the resulting CSV we can head back to our beloved CMS and find the best way to import the data (you might want to curate BERT’s suggestions before actually going live with them; once again, in most cases we can do better than the machine).
Super easy, not too intensive in computational terms and… environmentally friendly. Have fun playing with it! Always remember, it is a robot friend and not a real replacement for your precious work. BERT can do the heavy lifting of reading the page and highlighting what matters the most, but it might still fail to get the right length or to add the proper CTA (e.g. “read more to find …”).

Final thoughts and future work

The beauty of automation, and of agentive SEO in general, as I like to call it, is that you gain super powers while still remaining in full control of the process. AI is far from being magic or becoming (at least in this context) a replacement for content writers and SEOs; rather, AI is a smart assistant that can augment our work.

There are some clear limitations with extractive text summarization, related to the fact that we deal with whole sentences: if we have long sentences in our web page, we will end up with a snippet that is far too long to become a perfect meta description. I plan to keep working on fine-tuning the parameters to get the best possible results in terms of expressiveness and length but… so far only 10-15% of the summaries are good enough to not require any extra update from our natural intelligence. The vast majority of the summaries look good and substantial but still go beyond the 160-character limit.

There is, of course, a lot of potential in these summaries beyond the generation of meta descriptions for SEO: we can, for instance, create a “featured snippet” type of experience to provide relevant abstracts to the readers. Moreover, if the tone of the article is conversational enough, the summary might also become a speakable paragraph that we can use to introduce the content on voice-enabled devices (e.g. “what is the latest WordLift article about?”). So, while we can’t let the machine really run the show alone, there is concrete value in using BERT for summarization.
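One way to soften the length problem described above is to post-process each summary before a human reviews it. The sketch below is an assumption of mine and not part of the linked script: it keeps as many whole sentences as fit within 160 characters and, when even the first sentence is too long, cuts at a word boundary. The function name is hypothetical.

```python
import re

def fit_to_limit(summary, limit=160):
    """Trim an extractive summary so it fits the meta description limit."""
    summary = " ".join(summary.split())
    if len(summary) <= limit:
        return summary
    # First try: keep as many whole sentences as fit.
    kept = ""
    for sentence in re.split(r"(?<=[.!?])\s+", summary):
        candidate = (kept + " " + sentence).strip()
        if len(candidate) > limit:
            break
        kept = candidate
    if kept:
        return kept
    # Fallback: cut at the last word boundary and mark the truncation.
    cut = summary[: limit - 3].rsplit(" ", 1)[0]
    return cut.rstrip(",;:") + "..."
```

A trimmed snippet still deserves a human look, since cutting a sentence can remove exactly the part that would have enticed the click.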

Credits

Now that you have arrived at the end of this long article, it is time to remind us all that none of this would be possible without the work of many people and enlightened organizations that are committed to open source technologies and that are enabling and encouraging practitioners around the world to make (well, hopefully) the web a better place!

It is also thanks to mavericks and SEOs with a data-driven mindset, like Paul Shapiro and Hamlet, that I got interested in the topic and ready to experiment with new tools!

Give the code a spin on Google Colab and send me any comments or suggestions over Twitter or LinkedIn! Want to scale your marketing efforts with the WooRank and WordLift SEO management service? I can’t wait to learn more about your challenges!
The Ultimate Checklist to Optimize Content for Google Discover

The shift from keyword search to a queryless way to get information has arrived

Google Discover is an AI-driven content recommendation tool included with the Google Search app. Here is what we learned from the data available in the Google Search Console.

Google introduced Discover in 2017 and claims that there are already 800M active users consuming content through this new application. A few days back Google added statistical data on the traffic generated by Discover to the Google Search Console. This is meant to help webmasters, and publishers in general, understand what content is ranking best on this new platform and how it might differ from the content ranking on Google Search.

What was very shocking for me to see, on some of the large websites we work on with our SEO management service, is that between 25% and 42% of all organic clicks are already generated by this new recommendation tool. I did expect Discover to drive a significant amount of organic traffic, but I totally underestimated its true potential.

A snapshot from GSC on a news and media site

In Google’s AI-first approach, organic traffic is no longer solely dependent on queries typed by users in the search bar.

This has a tremendous impact on content publishers, business owners and the SEO industry as a whole.

Machine learning is working behind the scenes to harvest data about users’ behaviors, to learn from this data and to suggest what is relevant for them at a specific point in time and space.

Let’s have a look at how Google explains how Discover works.

From www.blog.google

[…] We’ve taken our existing Knowledge Graph—which understands connections between people, places, things and facts about them—and added a new layer, called the Topic Layer, engineered to deeply understand a topic space and how interests can develop over time as familiarity and expertise grow. The Topic Layer is built by analyzing all the content that exists on the web for a given topic and develops hundreds and thousands of subtopics. For these subtopics, we can identify the most relevant articles and videos—the ones that have shown themselves to be evergreen and continually useful, as well as fresh content on the topic. We then look at patterns to understand how these subtopics relate to each other, so we can more intelligently surface the type of content you might want to explore next.

Embrace Semantics and publish data that can help machines be trained.

Once again, the data that we produce sustains and nurtures this entire process. Here is an overview of the contextual data, besides the Knowledge Graph and the Topic Layer, that Google uses to train the system:

To learn more about Google’s work on query prediction, I would suggest you read an article by Bill Slawski titled “How Google Might Predict Query Intent Using Contextual Histories“.

What I learned by analyzing the data in GSC

This research is limited to the data gathered from three websites only; while the sample was small, a few patterns emerged:

      1. Google tends to distribute content between Google Search and Google Discover (the highest overlap I found was 13.5%; these are pages that, since Discover data started being collected in GSC, have received traffic from both channels)
      2. Pages in Discover do not have the highest engagement in terms of bounce rate or average time on page when compared to all other pages on a website. They are relevant for a specific intent and well curated, but I didn’t see any correlation with social metrics.
      3. Traffic seems to come in 48-hour or 72-hour spikes, as already seen for the top stories.
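The overlap figure in the first point can be reproduced from two page lists exported from GSC. This is a minimal sketch assuming you already have the URLs from the Discover report and from the Search performance report as plain lists; taking the Discover pages as the denominator is my reading of the figure, and the function name is hypothetical.

```python
def discover_search_overlap(discover_pages, search_pages):
    """Share of Discover pages that also received Google Search traffic."""
    discover, search = set(discover_pages), set(search_pages)
    if not discover:
        return 0.0
    return len(discover & search) / len(discover)
```

For example, if 4 pages appear in the Discover report and 1 of them also appears in the Search report, the function returns 0.25.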

To optimize your content for Google Discover, here is what you should do.

1. Make sure you have an entity in the Google Knowledge Graph or an account on Google My Business

Entities in the Google Knowledge Graph need to be created in order for Discover to be able to recognize them.

Results for WordLift

For business owners

Either your business, or product, is already in the Google Knowledge Graph or it is not. If it is not, there is no chance that the content you are writing for your company or product will appear in Discover (unless this content is bound to other broader topics). I am able to read articles about WordLift in my Discover stream since WordLift has an entity in the Google Knowledge Graph. From the screenshot above we can actually see there are indeed several entities when I search for “WordLift”:

      • one related to Google My Business (WordLift Software Company in Rome is the label we use on GMB),
      • one from the Google Knowledge Graph (WordLift Company)
      • one presumably about the product (without any tagline)
      • one about myself as CEO of the company

So, get into the graph and make sure to curate your presence on Google My Business. Very interestingly, we can see that the relationship between myself and WordLift is such that, when looking for WordLift, Google also shows Andrea Volpini as a potential topic of interest.

In these examples, we see that from Google Search I can start following people who are already in the Google Knowledge Graph, and what the user experience in Discover looks like for content related to the entity WordLift.

2. Focus on high-quality content and a great user experience

It is also good to remember that quality, in terms of both the content you write (alignment with Google’s content quality policies) and the user experience on your website, is essential. A website that takes 10 seconds or more to load on a mobile connection is not going to be featured in Discover. A clickbait article, with more ads than content, is not going to be featured in Discover. An article written by copying other websites and patently infringing copyright laws is not likely to be featured in Discover.

3. Be relevant and write content that truly helps people by responding to their specific information need

Recommendation tools like Discover only succeed when they are capable of enticing the user to click on the suggested content. To do so effectively they need to work with content designed to answer a specific request. Let’s see a few examples: “I am interested in SEO” (entity “Search Engine Optimization”), or “I want to learn more about business models” (entity “Business Model”).

The more we can match the intent of the user, in a specific context (or micro-moment if you like), the more we are likely to be chosen by a recommendation tool like Discover.

4. Always use an appealing hi-res image and a great title

Images play a very important role in Google’s card-based UI as well as in Discover. Whether you are presenting a cookie recipe or an article, the image you choose will be presented to the user and will play its role in enticing the click. Besides the editorial quality of the image, I also suggest you follow the AMP requirements for images (the smallest side of the featured image should be at least 1,200 px). You also want to make sure Google has the rights to display your high-quality images, and this can be done either by using AMP or by filling out this form to express your interest in Google’s opt-in program. Similarly, a good title, much like in the traditional SERP, is super helpful in driving clicks.

5. Organize your content semantically

Much like Google does, using tools like WordLift, you can organize content with semantic networks and entities. This allows you to: a) help Google (and other search engines) gather more data about “your” entities b) organize your content the same way Google does (and therefore measure its performance by looking at topics and not pages and keywords) c) train your own ML models to help you make better decisions for your business.

Let me give you a few examples. If I provide, let’s say, information about our company and the industry we work in using entities that Google can crawl, Google’s AI will be able to connect content related to our business with people interested in “startups”, “SEO” and “artificial intelligence”. Machine learning, as we usually say, is hungry for data, and semantically rich data is what platforms like Discover use to learn how to be relevant.

If I look at the traffic I generate on my website, not only in terms of pages and keywords but using entities (as we do with our new search rankings dashboard or the Google Analytics integration) I can quickly see what content is relevant for a given topic and improve it.

WordLift Dashboard

Use entities to analyze how your content is performing on organic search

Below is a list of pages we have annotated with the entity “Artificial Intelligence”. Are these pages relevant for someone interested in AI? Can we do a better job in helping these people learn more about this topic?

A detail of the WordLift dashboard

A few of the articles tagged with the entity “Artificial Intelligence” and their respective query

Learn more about Google Discover – Questions & Answers

Below is a list of questions that I have answered in the past few days, as data from Discover was made available in GSC. I hope you’ll find it useful too.

How does Discover work from the end-user perspective?

The suggestions in Discover are entity-based. Google groups content that it believes to be relevant using entities in its Knowledge Graph (e.g. “WordLift”, “Andrea Volpini”, “Business” or “Search Engine Optimization”). Entities are called topics. The content-based user filtering algorithm behind Discover can be configured from a menu in the application (“Customize Discover”) and fine-tuned over time by providing direct feedback on the recommended content in the form of “Yes, I want more of this” or “No, I am not interested”. Using Reinforcement Learning (a specific branch of Machine Learning) and Neural Matching (different ways of understanding what the content is about), the algorithm is capable of creating a personalized feed of information from the web. New topics can be followed by clicking on the “+” sign.

Topics are organized in a hierarchy of categories and subcategories (such as “Sport”, “Technology”). Read more here on how to customize Google Discover.

How can I access Discover?

On most Android devices, accessing Discover is as simple as swiping right from the home screen.

Is Google Discover available only in the US?

No, Google Discover is already available worldwide and in multiple languages, and it is part of the core search experience on all Android devices and on any iOS device with the Google Search app installed. Discover is also available in Google Chrome.

Do I have to be on Google News to be featured in Discover?

No, Google Discover also uses content that is not published on Google News. A news site is simply more likely to appear on Google Discover due to the amount of content it publishes every day and the different topics it usually covers.

Is evergreen content eligible for Discover, or only freshly updated articles?

Evergreen content that fits a specific information need is as important as newsworthy content. I spotted an article from FourWeekMBA.com (Gennaro’s blog on business administration and management), published 9 months ago, under the entity “business”.

FourWeekMBA on Discover

Does a page need to rank high on Google Search to be featured in Discover?

Quite interestingly, on a news website where I analyzed the GSC data, only 13.5% of the pages featured in Discover had received traffic on Google Search. Pages that received traffic on both channels had a position on Google Search <=8.

Correlation of Google Discover Clicks and Google Search Position

How can I measure the impact of Discover from Google Analytics?

A simple way is to download the .csv file containing all the pages listed in the Discover report in GSC and create an advanced filter in Google Analytics under Behaviour > Site Content > All pages with the following combination of parameters:

Filtering all pages that have received traffic from Discover in Google Analytics
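Since the filter is built from the list of pages in the Discover report, one possible way to prepare it is to turn the exported URLs into a single regular expression on the page path. This is only a sketch under assumptions: it does not reproduce the exact parameters shown in the screenshot, the function name is hypothetical, and long lists may need to be split if the analytics tool limits the length of a filter expression.

```python
import re
from urllib.parse import urlparse

def discover_pages_regex(urls):
    """Build a regular expression matching the page paths of the given URLs."""
    # Analytics reports usually show the path, not the full URL.
    paths = sorted({urlparse(u).path or "/" for u in urls})
    return "^(" + "|".join(re.escape(p) for p in paths) + ")$"
```

The resulting expression can then be pasted into the advanced filter so that the report only shows pages that received traffic from Discover.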

Discover is yet another important step in the evolution of search engines into answer and discovery machines that help us sift through today’s content multiverse.

Keep following us, and give WordLift a spin with our free trial!

The Ultimate Checklist to Rank in the Google Top Stories Carousel

Google Top Stories is a powerful way to boost mobile SEO and CTR of news content. In this article, we describe a real-world implementation, what it takes to be picked up by Google and how to measure the traffic impact.    

When Google first introduced the top stories carousel it had an immediate impact on the news and media industry, which started to embrace AMP. Top Stories is a modern, ultra-performing card-style design that presents searchers with featured news stories in the Google SERP.

Top Stories Carousel in Google Search

Getting featured is far from a straightforward process: there are several requirements to fulfill, and they belong to different aspects of modern SEO, from AMP support to Google News support (not required, but highly recommended), from structured data to content editing, image preparation and page speed optimisation.

Let’s dive in and look at the basics by analyzing what we have done to bring this ultra-performing search feature to one of our SEO managed service clients. Before doing that, as usual, I would like to show you the results of this work.

The effect of the top stories as seen from the Google Search Console.

The top stories news carousel is an ultra-performing SERP feature that depends strictly on your organic rankings.

Here is the checklist you need to follow to grab this mobile SEO opportunity.

1. Enable AMP

The top stories carousel is presented in the Google Developers Guide as a Search Feature that requires the implementation of AMP. So you need to support AMP on your website in either native or paired mode. Unless you are starting to develop a new project from scratch, you are going to use AMP in paired mode. This basically means that you are reusing the active theme’s templates to display AMP responses. With this configuration, AMP uses separate URLs, whereas the canonical URLs for your site will not have AMP. You can use the AMP Test Tool to make sure that your pages comply with Google Search requirements for AMP.
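
In paired mode the two versions of a page are expected to point at each other: the canonical page declares its AMP version with `<link rel="amphtml">`, and the AMP page points back with `<link rel="canonical">`. The following is a minimal sketch of how that pairing could be checked on two HTML documents; the function names and the example URLs are hypothetical, and it is not a replacement for the AMP Test Tool.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collects the first amphtml and canonical <link> targets in a document."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        href = attributes.get("href")
        if rel in ("amphtml", "canonical") and href:
            self.links.setdefault(rel, href)


def link_targets(html):
    parser = _LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def is_paired(canonical_url, canonical_html, amp_url, amp_html):
    """True when the canonical page and the AMP page reference each other."""
    canonical_links = link_targets(canonical_html)
    amp_links = link_targets(amp_html)
    return (canonical_links.get("amphtml") == amp_url
            and amp_links.get("canonical") == canonical_url)
```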

1a. Comply with AMP logo guidelines

You need to make sure that the publisher logo used in the AMP structured data fits in a 600x60px rectangle and is either exactly 60px high (preferred) or exactly 600px wide. A 450x45px logo would not be acceptable, even though it fits within the 600x60px rectangle.

Remember also when you have a logo with a solid background to include 6px minimum padding around it. Wanna see an example? Here is WordLift Publisher’s logo.
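
The size rule above is easy to get wrong, so here is a small sketch that encodes it. The function name is made up for illustration, and the check covers only the dimensions, not the padding.

```python
def logo_fits_amp_guidelines(width, height):
    """Check the logo size rule: inside 600x60px, and either 60px high or 600px wide."""
    fits_rectangle = width <= 600 and height <= 60
    touches_an_edge = height == 60 or width == 600
    return fits_rectangle and touches_an_edge
```

With this rule a 600x60px or a 300x60px logo passes, while the 450x45px logo from the example above is rejected.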

2. Use structured data to markup your articles

Google describes the news carousel as “a container that includes articles, live blogs, and videos” and what helps Google understand the content on the page is the required structured data. So the second step is to make sure that you are supporting one of the following schema types:

Article

You may also want to use one of these structured data testing tools to validate your markup.

2a. When in paired mode, make sure to have the same structured data on both canonical and AMP pages

Depending on how you are generating your AMP pages you might end up, as happened to several of our clients, with different structured data markup on your canonical and AMP pages. This should be avoided: it is inconsistent and can prevent Google from showing your articles in the top stories carousel (we learned the lesson the hard way). The recommendation to use the same markup is provided in the Google AMP guide.

WordLift is fully compatible with the AMP Plugin (developed by Google, Automattic, and XWP) and AMP pages can inherit the schema.org markup of the canonical page and share the same JSON-LD. Read all about how to add structured data markup to AMP here.
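
If you want to verify this yourself, one possible approach is to extract the JSON-LD from both versions of a page and compare the parsed objects. The sketch below assumes the markup is embedded in `<script type="application/ld+json">` tags and that you already have the HTML of both pages; the helper names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collects and parses every JSON-LD script block in a document."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_jsonld(html):
    parser = _JsonLdCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


def same_structured_data(canonical_html, amp_html):
    """Compare parsed JSON-LD, so key order and whitespace do not matter."""
    return extract_jsonld(canonical_html) == extract_jsonld(amp_html)
```

Comparing parsed objects rather than raw strings avoids false alarms when the two templates serialize the same markup differently.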

3. Use multiple large images in your markup

In the article schema guide for AMP articles, Google requires publishers to provide, in the structured data markup, images that are at least 1,200 pixels wide and have at least 800,000 pixels in total. This is not all: the guide also specifies that for best results publishers should provide multiple high-resolution images with the following aspect ratios: 16×9, 4×3, and 1×1.

image requirements for AMP

This was first spotted by Aaron Bradley and it is indeed extremely important!
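
The image requirements described in this section can be sketched as a quick check over the (width, height) pairs of the images you list in your markup. The function names and the tolerance used to match aspect ratios are assumptions made for illustration.

```python
TARGET_RATIOS = {"16x9": 16 / 9, "4x3": 4 / 3, "1x1": 1.0}


def is_large_enough(width, height):
    """At least 1,200px wide and at least 800,000 pixels in total."""
    return width >= 1200 and width * height >= 800_000


def missing_aspect_ratios(images, tolerance=0.01):
    """Return the recommended aspect ratios not covered by any large-enough image."""
    covered = set()
    for width, height in images:
        if not is_large_enough(width, height):
            continue
        ratio = width / height
        for name, target in TARGET_RATIOS.items():
            if abs(ratio - target) <= tolerance:
                covered.add(name)
    return [name for name in TARGET_RATIOS if name not in covered]
```

For example, providing only a 1200x675 and a 1200x900 image would leave the 1×1 ratio uncovered.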

3a. Be specific when describing your images

Alt texts are important and should be as specific as possible in order to describe images to visitors who are unable to see them.

This is an essential aspect of accessible web design and it is also strategic for image SEO. Google strives for indexing and serving high-quality and accessible content to its users and we shall do our best to support this process.

We heard of a case, in the US, where a website did not appear in Top Stories until they improved the alt text on the featured images (the main image of the article). They were (as a lot of publishers do) re-using the title of the page as alt text.

While this might work, in some rare cases, it is not considered an accessible practice and should be avoided.

4. Remember that being part of Google News is not required but…it helps a lot!

Google can feature any article matching the above criteria in the top stories carousel based on its organic algorithmic selection but…the reality is slightly different. Let’s see why:

  1. The Top Stories Carousel is indeed a SERP feature that evolved from the Google News box and serves the same goal,
  2. The main difference of the top stories carousel is that content is NOT restricted to outlets approved by Google News. In reality, however, as a result of the “fake news” scandal that exploded in November 2016, less-than-reliable sources (and smaller sites that are not in Google News) have been removed from the top stories carousel (NewsDashboard published data showing that more than 99% of desktop news box results and 97% of mobile news box results are from Google News sites).

So unless you have the authority of Reddit, Yoast and the like, you have a much better chance of landing in the news box if you are Google News approved. If you want to dig deeper into the relationship between Top Stories and Google News, follow this thread on the Google News Help Forum.

4a. Follow the editorial guidelines of Google News

Google provides news publishers with a set of content policies to ensure a positive experience for readers. It is not only about being newsworthy and continuing to write fresh new content; it is also about limiting advertising and preventing sponsored content, malicious links or anything that can be considered hateful, offensive or dangerous.

Here you can find all the editorial criteria to follow and the recommendations that Danny Sullivan from Google provided in this post titled “Ways to Succeed in Google News”

4b. Avoid article content errors

In order to be featured in Google News there are a few technical aspects to consider:

  1. Prevent article fragmentation. If you have isolated sentences that are not grouped together into paragraphs you might get an error and your article will be rejected from Google News.
  2. Write articles that are not too short and not too long. This basically means writing more than 80 words and preventing your pages from being too long to read. We usually see that between 600 and 800 words is a good fit for a Google News article.
  3. Make sure to write headlines of maximum 110 characters.

Review all the article content errors that you need to avoid to be eligible for Google News.
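
Two of the checks in the list above, the minimum length and the headline limit, are simple enough to automate before publishing. The sketch below is only an illustration with a hypothetical function name; it does not cover article fragmentation or any other error Google may report.

```python
def article_content_issues(headline, body, min_words=80, max_headline_chars=110):
    """Return a list of basic content issues based on the thresholds above."""
    issues = []
    if len(headline) > max_headline_chars:
        issues.append("headline too long")
    # The article should contain more than `min_words` words.
    if len(body.split()) <= min_words:
        issues.append("article too short")
    return issues
```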

4c. Focus on original reporting 

Google has updated its news algorithm to focus more on originality, as reported on September 13th, 2019 by the New York Times: Google Says a Change in Its Algorithm Will Highlight ‘Original Reporting’. This has also been confirmed several times on the Google News Publisher Help Community.

“I would recommend doing more work, or more obvious work, on original reporting – fresh, new, original facts and information that isn’t published elsewhere.” Chris Andrews (Platinum Product Expert) answering a question related to Top Stories.  

5. Speed, Speed and again Speed

News readers want to be able to find fresh updates as fast as possible and, especially on mobile, people care a lot about the speed of a page. A top story is a mobile SERP feature that is purely organic-driven. If you get into the top 5 results on Google you can get it, and it will be an extra boost for your traffic; if you are not top ranking you will not get your spot in the news carousel (in most cases at least). Since July 2018, page speed has been a ranking factor for all mobile searches, and this means that your website needs to be blazing fast.

How to track when you have been featured in the Top Stories

Tracking traffic generated from the Top Stories is not straightforward and can only be done by looking at specific queries in the Google Search Console, using third-party tools like Semrush or RankRanger, or looking for specific patterns in Google Analytics.

The simplest way I found is to start from the Google Search Console by filtering results for Rich Results and AMP Articles.

Google Search Console configuration

When you see a spike, you can search for the related keyword from a mobile device and hopefully find the matching article. Remember also that a given result might only occur in a specific country. This article here, for example, was only visible from Google in the US, so we could only detect it by changing the territory in the Google Search preferences and using the incognito mode.

From Google Analytics we are also able to spot a top story by looking for a peak like the one below. As you can see, that traffic is in most cases only there for 48-72 hours at most.

Google Analytics for the article that entered the carousel.
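
If you export the daily sessions of an article, a peak of this kind can also be flagged programmatically. The sketch below uses a deliberately simple rule, days that exceed a multiple of the median, which is an assumption chosen for illustration rather than an established method; the data shape and the `factor` value are hypothetical too.

```python
from statistics import median


def find_spikes(daily_sessions, factor=3.0):
    """Return the days whose sessions are at least `factor` times the median."""
    if not daily_sessions:
        return []
    baseline = median(daily_sessions.values())
    if baseline <= 0:
        return []
    return [day for day, sessions in daily_sessions.items()
            if sessions >= factor * baseline]
```

A short run of consecutive flagged days, followed by a return to the baseline, is the pattern to look for.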

Given the relationship between Google News and Top Stories, you might want to analyze these patterns by filtering top articles in Google News. This can be easily done in Google Analytics, knowing that incoming readers with referrers of ‘news.google.com’ or ‘news.url.google.com’ are from Google News.
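
When working on exported data rather than in the Google Analytics interface, the same rule can be applied with a few lines of code. This is a minimal sketch; the function name is hypothetical and the referrer strings are assumed to be either full URLs or bare hostnames.

```python
from urllib.parse import urlparse

GOOGLE_NEWS_HOSTS = {"news.google.com", "news.url.google.com"}


def is_google_news_referrer(referrer):
    """True when the referrer's hostname is one of the Google News hosts."""
    if "://" not in referrer:
        # Let urlparse treat a bare hostname as a network location.
        referrer = "//" + referrer
    return urlparse(referrer).hostname in GOOGLE_NEWS_HOSTS
```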

Once again, there are plenty of SERP feature optimization opportunities that we can leverage when combining structured data with more traditional SEO factors, and they can make an enormous difference for your audience reach.

Article updated September 13th, 2019 following the change in the Google News Algorithm.

Introducing Semantic Web Analytics

In this article we are going to help you create a Web Analytics Dashboard using Google Data Studio, traffic data from Google Analytics and WordLift.

We constantly work for content-rich websites where sometimes hundreds of new articles are published on a daily basis. Analyzing traffic trends on these large properties and creating actionable reports is still time-consuming and inefficient. This is also very true for businesses investing in content marketing that need to dissect their traffic and evaluate their marketing efforts against concrete business goals (i.e. increasing subscriptions, improving e-commerce sales and so on).

As a result of this experience, I am happy to share with you a Google Data Studio report that you can copy and personalize for your own needs.

Jump directly to the dashboard for Google Data Studio: Semantic Analytics by WordLift

Data is meant to help transform organizations by providing them with answers to pressing business questions and uncovering previously unseen trends. This is particularly true when your biggest asset is the content that you produce.

With the ongoing growth of digitized data and the explosion of web metrics, organizations usually face two challenges:

  1. Finding what is truly relevant to tap into a new business opportunity.
  2. Making it simpler for the business user to prepare and share the data, without being a data scientist.

Semantic Web Analytics is about delivering on these promises: empowering business users and letting them uncover new insights from the analysis of their website traffic.

We are super lucky to have a community of fantastic clients that help us shape our product and keep pushing us ahead of the curve.

Before enabling this feature, both the team at Salzburgerland Tourismus and the team at TheNextWeb had already improved their Google Analytics tracking code to store entity data as events. This allowed us to experiment, ahead of time, with this functionality before making it available to all other subscribers.

What is Semantic Web Analytics?

Semantic Web Analytics is the use of named entities and linked vocabularies such as schema.org to analyze the traffic of a website.

The natural language processing that WordLift uses to mark up the content with linked entities enables us to classify articles and pages in Google Analytics with real-world objects, events, situations or even abstract concepts.

How to activate Semantic Web Analytics?

Starting with WordLift 3.20, entities annotated in webpages can also be sent to Google Analytics by enabling the feature in WordLift’s Settings panel.

WordLift Settings

Here is how this feature can be enabled.

You can also define the dimensions in Google Analytics to store entity data; this is particularly useful if you are already using custom dimensions.

As soon as the data starts flowing you will see a new category under Behaviour > Events in your Google Analytics.

Events in Google Analytics

Events in Google Analytics about named entities.

WordLift will trigger an event labeled with the title of the entity every time a page containing an annotation with that entity is opened.

Using these new events we can look at how content is consumed not only in terms of URLs and site categories but also in terms of entities. Moreover, we can investigate how articles are connected with entities and how entities are connected with articles.
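
To make the idea more concrete, here is a small sketch of how exported events could be indexed in both directions. It assumes each event has already been reduced to a (page, entity) pair; that data shape, the sample values and the function name are hypothetical and do not reflect how WordLift or Google Analytics store the data.

```python
from collections import Counter, defaultdict


def index_entity_events(events):
    """Index (page, entity) events by entity and by page."""
    views_per_entity = Counter()
    pages_by_entity = defaultdict(set)
    entities_by_page = defaultdict(set)
    for page, entity in events:
        views_per_entity[entity] += 1
        pages_by_entity[entity].add(page)
        entities_by_page[page].add(entity)
    return views_per_entity, pages_by_entity, entities_by_page


# Hypothetical sample: one event per page view and annotated entity.
events = [
    ("/blog/structured-data/", "Structured data"),
    ("/blog/structured-data/", "Search engine optimization"),
    ("/blog/top-stories/", "Search engine optimization"),
]
views, pages_by_entity, entities_by_page = index_entity_events(events)
```

Here `views` tells you which entities attract the most page views, while the two mappings answer the questions "which articles mention this entity?" and "which entities appear in this article?".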

Show me how this can impact my business

Making sense of data for a business user is about unlocking its power with interactive dashboards and beautiful reports. To inspire our clients, and once again with the help of online marketing ninjas like Martin Reichhart and Rainer Edlinger from Salzburgerland, we have built a dashboard using Google Data Studio – a free tool that helps you create comprehensive reports using data from different sources.

Using this dashboard we can immediately see, for each section of the website, which concepts are driving the traffic, which articles are associated with these concepts and where the traffic is coming from.

An overview of the entities that drive the traffic on our website.

We can also see which entities are associated with a given article. Below you can see the entities mentioned in the article: Implementing Structured Data for SEO with Bill Slawski.

Entities associated with an article

Entities associated with an article about structured data.

This helps publishers and business owners analyze the value behind a given topic. It can be valuable for analyzing the behaviors and interests of a specific user group. For example, on travel websites, we can immediately see which topics are the most relevant for, let’s say, Italian-speaking and German-speaking travelers.

WordLift’s clients in the news and media sector are also using this data to build new relationships with advertisers and affiliated businesses. They can finally bring to meetings the exact traffic volumes they have for, let’s say, content that mentions a specific product or a category of products. This helps them calculate in advance how this traffic can be monetized.

Are you ready to make sense of your Google Analytics data? Contact us and let’s get started!

Here is the recipe for a Semantic Web Analytics dashboard in Google Data Studio 

With unlimited, free reports, it’s time to start playing immediately with Data Studio and entity data and see if and how it meets your organization’s needs.

To help with that, you can use the report I have just created as a starting point. Create your own interactive report and share it with colleagues and partners (even if they don’t have direct access to your Google Analytics).

Simply take this report, make a copy, and replace with your own data!

Instructions

1. Make a Copy of this file

Go to the File menu and click to make a copy of the report. If you have never used Data Studio before, click to accept the terms and conditions, and then redo this step.

2. Do Not Request Access

Click “Maybe Later” when Data Studio warns you that data sources are not attached. If you click “Resolve” by mistake, do not click to request access – instead, click “Done”.

3. Switch Edit Toggle On

Make sure the “Edit” toggle is switched on. Click the text link to view the current page settings. The GA Demo Account data will appear as an “Unknown” data source there.

4. Create A New Data Source

If you have not created any data sources yet, you’ll see only sample data under “Available Data Sources” – in that case, scroll down and click “Create New Data Source” to add your own GA data to the available list.

5. Select Your Google Analytics View

Choose the Google Analytics connector, and authorize access if you aren’t signed in to GA already. Then select your desired GA account, property, and the view from each column.

6. Connect to Your GA Data

Name your data source (at the top left), or let it default to the name of the GA view. Click the blue “Connect” button at the top right.

Are you ready to build your first Semantic Dashboard? Add me on LinkedIn and let’s get started!

Read more about WordLift’s new Content Dashboard that combines entities with search rankings.
