Some Schema.org types are beneficial for most businesses. If you have a website, you want to help search engines index its content in the simplest and most effective way, and to do that you can start from, well, the most important page: your homepage. Technical SEO experts like Cindy Krum describe schema markup (as well as XML feeds like the one you can provide to Google Shopping via the Google Merchant Center) as your new sitemap. And it is true: when crawling a website (whether you are Google or any other automated crawler you can think of), getting the right information about it is a goldmine.
Let’s get started with our homepage. From our homepage, we want to let Google know the following:
The organization behind the website (Publisher)
The logo of this organization
The URL of the organization
The contact information of the organization
The name of the website
The tagline of the website
The URL of the website
How to use the internal search engine of the website
The Sitelinks (the main links of the website)
We can do all of this by implementing the WebSite structured data type on the homepage of our website. A few more indications from Google on this front:
Add this markup only to the homepage, not to any other pages
🚨 Very important 🚨: unfortunately, on a lot of websites you still find this markup on every single page. That should not happen; it is unnecessary.
Always add one SearchAction for the website, and optionally another if supporting app search (if you have a mobile app – this will help users searching from a mobile device to continue their journey on the mobile app).
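Putting the pieces together, here is a minimal sketch of the WebSite markup described above, built in Python so the JSON-LD can be generated and checked programmatically. Every name, URL, and phone number below is a placeholder, not a real recommendation:

```python
import json

# A minimal WebSite + Organization JSON-LD sketch for the homepage.
# All names and URLs ("Acme Publishing", example.org) are placeholders.
markup = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "name": "Acme Publishing",                      # the name of the website
    "alternateName": "Acme, your daily tech digest",  # the tagline
    "url": "https://www.example.org/",
    "publisher": {
        "@type": "Organization",                    # the organization behind the site
        "name": "Acme Publishing Ltd",
        "url": "https://www.example.org/",
        "logo": {"@type": "ImageObject", "url": "https://www.example.org/logo.png"},
        "contactPoint": {
            "@type": "ContactPoint",
            "telephone": "+1-555-0100",
            "contactType": "customer support",
        },
    },
    # Tells Google how the internal search engine of the website works.
    "potentialAction": {
        "@type": "SearchAction",
        "target": "https://www.example.org/?s={search_term_string}",
        "query-input": "required name=search_term_string",
    },
}

json_ld = json.dumps(markup, indent=2)
print(json_ld)
```

The printed JSON-LD string is what you would embed in a `<script type="application/ld+json">` tag, on the homepage only.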
If you are a publisher, you might have noticed that in 2019 Google introduced a new report called Discover. With the WordLift team, we’ve been looking into it to understand the dynamics of Google Discover for our VIP clients.
In this article, I’ll show you the latest findings: what Google Discover is, and why it matters so much if you are in the publishing business (or if content marketing is your primary acquisition channel).
While it might have gone unnoticed by many, Google Discover has become a critical source of traffic for many others.
Let’s start by understanding what Google Discover is and why it matters so much.
Discover is a popular way for users to stay up-to-date on all their favorite topics, even when they’re not searching. To give publishers and sites visibility into their Discover traffic, Google also released a dedicated report in Search Console.
As highlighted in the launch of Google Discover there are two key elements to take into account when analyzing Google Discover:
Google wants users to stay up-to-date. It also wants to provide recommendations when they are not searching.
Therefore, Discover is a mechanism that enables users to find the most relevant content, in a specific timeframe, on their mobile device.
This is a revolution, as it enables Google to move from pure search (where users need to type in keywords to retrieve information from the web) to a discovery mechanism where people can effortlessly find what they are looking for.
Why didn’t Google offer Discover way before?
For Google to make this step in a scalable manner, it needed to develop two things: a database capable of holding a massive amount of information, and a powerful AI able to query this database to provide relevant information to users.
In 2019, both of those technologies were finally available (at least to Google). The massive database is called a Knowledge Graph (Google has been building it since 2012).
And the AI able to query that database is now at the core of Google’s strategy (Google declared itself an AI-first company back in 2017).
In other words, if you master Google Discover you can enable your publishing business to double its reach in a relatively short timeframe (we’ve seen publishers double their Discover traffic in a few months).
So how do you do that? In the checklist below, Andrea Volpini has highlighted the key elements to take into account when optimizing for Google Discover:
Now I want you to focus on the mindset you need to grasp the opportunities around Google Discover.
Google Discover is also a new analytical tool which can help you unlock new insights
Alongside the launch of Discover, Google also made a new report (with the same name) visible to website owners, enabling them to look at the traffic coming from the Discover platform.
As Google specified:
We’re adding a new report in Google Search Console to share relevant statistics and help answer questions such as:
How often is my site shown in users’ Discover? How large is my traffic?
Which pieces of content perform well in Discover?
How does my content perform differently in Discover compared to traditional search results?
By looking at Discover you’ll find out a few important aspects:
Discover works more in a boom-and-bust cycle. That makes sense, as Discover moves away from traditional search: Google will push your content on Discover if it is relevant at that moment for users who have expressed interest in those topics.
You will also notice evergreen content entering Discover when it becomes interesting in the short term, as more people search for that topic.
If you speed up the creation of shorter-form content as a companion for longer, potentially evergreen content (I’ll explain that in a second), you will improve your chances of being featured in Discover.
Understanding Discover: beyond search
Google Discover delivers a mixture of information based on users’ interests, coupled with what is trending and what Google believes is relevant for these users.
In short, Google wants to offer the most relevant content available for that user at that moment. This requires an even better understanding of your audience, which goes well beyond keywords alone.
For instance, with the WordLift team we put together a Semantic Dashboard which pulls up information based on the topics that your audience finds most interesting:
As you will notice from the video, there are no keywords in this dashboard: most of the analysis is done on clusters of content that, in Semantic Web jargon, are called entities. The Semantic Analytics Dashboard will therefore tell you which cluster of content is actually providing traction for a broader concept.
It is important to highlight that entities go well beyond keywords, because they represent concepts that are taken from a context and disambiguated (clarified) for search engines through structured data in the form of schema markup.
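To make this concrete, here is a hypothetical schema markup sketch (again generated with Python) showing how an ambiguous word becomes a disambiguated entity: the `sameAs` links tell search engines which "Mercury" the article is about. The headline and URLs are illustrative:

```python
import json

# Hypothetical example: an Article annotated with the entity "Mercury",
# disambiguated as the planet (not the chemical element) via sameAs links.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Observing Mercury at dawn",
    "about": {
        "@type": "Thing",
        "name": "Mercury",
        # Links to authoritative identifiers clarify which "Mercury" we mean.
        "sameAs": [
            "https://en.wikipedia.org/wiki/Mercury_(planet)",
            "https://www.wikidata.org/wiki/Q308",
        ],
    },
}
print(json.dumps(article, indent=2))
```

The same pattern (a `Thing` with `sameAs` links under `about` or `mentions`) is what turns a plain keyword into an entity a machine can reason about.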
Both when you optimize your content and when you research what kind of content to write about, you want to look at all the key topics your audience has been searching for.
Thus, rather than just optimizing for a single keyword, you can structure your content by looking at the concepts to cover. This is important because you will stop seeing optimization as a one-time thing.
And this mindset will push you to create formats that are interesting for your audience, and that can be systematized for more efficient content creation.
But how do you understand what content goes into Discover?
Shorter-form vs. long-form content?
In the SEO community, there is a lot of discussion around short vs. longer-form content. The reality is that the discussion doesn’t matter: it completely misses the point.
Whether shorter or longer-form content works better depends on the search intent (another buzzword in the SEO world). In simple terms, if I’m asking “what is Google Discover,” my intent might be purely informational.
That might make me want to understand the topic at a more in-depth level, so a longer piece might work. However, among people searching for such a query, a good chunk might just want a simple, straight answer; they might look for the definition and then leave the article.
Thus, there isn’t a universal answer (it’s like asking an electrician what the perfect size of a screwdriver is!), and it doesn’t even matter.
Having clarified this point, let’s move forward.
Short-term format as a companion for longer, evergreen content
Going back to Google Discover: if your site primarily publishes in-depth pieces suited to an audience looking for evergreen content that stays relevant for years to come, you might be able to capture ongoing organic traffic, as there will always be an audience looking for it.
On the other hand, it also makes sense to create a short-term format that provides valuable information to your audience.
Thus, if you’re publishing “the ultimate guide to selling on Amazon,” you might also want a format where each week you present a case study, structured around what your audience has been searching for over the last month or so.
In this way, you can improve your chances of being featured in Google Discover.
Update your content frequently
As you know Google likes fresh content.
Oops sorry, I fell into the expert fallacy again!
It’s not Google that likes fresh content, people like it.
If Google were to serve a 2017 article about the best restaurant in town, it might be very disappointing for the user to find out that the restaurant no longer exists.
Therefore, if you have pieces that might become less relevant because they provide outdated information, make sure to update them; change the publication date to signal to Google that you made important changes to the article and that it should be indexed again.
In this way, we make Google’s work easier, and it might serve the piece again, or keep serving it on Discover, if it sees that the content is still relevant for an audience.
With the WordLift team, we noticed that Google might also offer the same piece several times to the same Discover users. But to keep that content interesting for the Discover platform, you need to make sure to update it with fresh information over time.
Ask your audience to add you as a source on Discover
Within Discover, users can decide what topics to follow, and what sources to include. Let your audience know they can add your publication as a source, so it becomes easier to be notified when relevant pieces are out.
This will give you more traction with your existing audience, and presumably also improve your authority within Discover. I don’t have any data to back that up, but it makes sense to me: the more people add you as a source within Discover, the more that might signal to Google that you are relevant in that vertical.
Build your knowledge graph
A Knowledge Graph is a semantic representation of your website. I like to call it a “modern sitemap or a sitemap of meanings.”
We’ve been helping companies of any size build their knowledge graphs since 2017. With tools like WordLift that becomes extremely easy, as we use natural language processing to match your content with what can potentially become an entity, giving you the chance to semantically enrich your content. That makes it easier for Google to discover it and offer it in several formats (be it a regular search result, an advanced feature like position zero, or Google Discover).
Google Discover is a new content distribution platform for publishers, and a new way to find relevant content for users.
It is slightly different from the way Google has always worked, for two primary reasons. First, Discover enables users to find up-to-date, relevant information in a feed rather than on a search results page. Second, Google Discover moves away from the search logic, as it pushes content to users without them even looking for it.
In addition, Google Discover is also a new report within the Google Search Console which gives insights to publishers about the kind of content that Discover users consume.
That makes a massive new distribution platform available to publishers, and it creates untapped opportunities if you take the time to implement the framework and checklist provided in this article.
Since none of us is able to make sense of the algorithm changes happening with Google’s latest core update, yet we have dozens of clients asking us about it, we decided to ask the AI what’s happening with this update. If humans aren’t able to keep up with the machine, let the machine tell us what’s going on.
Surprisingly we got some interesting answers (questions are from humans – answers come from a neural network we trained)!
When was the update released?
AI: Monday, June 3, here is a summary of what we know so far. Sources say the update was pre-announced and rolled out on June 3rd. It is still very early, but Google is tweaking its algorithm and is processing the data.
What are the core updates?
AI: Core Updates are Google Updates that do not have a clear specific focus on a certain kind of search query or particular website characteristics. Instead, they make more subtle changes “under the hood”. While other major, well-known Google Updates like Panda or Penguin specifically target things like content quality or backlinks, Core Updates tend to impact websites in a wide range of areas. These updates to Google’s core algorithm are released several times a year. The first Google Core Update was recorded by Searchmetrics in January 2016.
What can I do if my rankings collapse after a Google Update?
AI: At the end of 2018, Google provided some explanation of its Core Updates, stating that there aren’t any specific errors that webmasters can correct to recover lost rankings after a Core Update. Instead, Google recommends offering users the best-possible content – this is the best way of improving rankings. Google’s recommended first step towards achieving this is to study the Google Search Quality Rater Guidelines. This document is designed to help quality raters understand, in not inconsiderable detail, how to classify and evaluate content.
Can you tell us more about the DailyMail?
AI: The SEO director of the website even seems to have taken to Google’s help forums to report a huge drop, as spotted by the blog Search Engine Roundtable. He added that they saw their “Discover traffic drop by 90%” and it “has not improved.”
What sites have been impacted the most?
AI: As of now, the update’s made some major waves. As reported by Search Engine Land, early data from Sistrix, RankRanger, SearchMetrics, and Moz suggests there have been some noticeable fluctuations in the rankings. Sistrex reports changes from 05.06. on 06.06 in the Google UK Index. They also note that the range of websites affected seems to be wider than previous updates. While the most recent updates were thought to have the most effect on YMYL sites, these ones appear to have widened the net to news sites, retail sites, etc.
Is Google evil?
AI: Google says the new core update is “definitely noticeable.”
And since beginning its official rollout, that certainly seems to be the case.
How we did this
We trained a neural network using TensorFlow and GPT-2, a recently released large transformer-based language model from the OpenAI team.
Are you ready to let the AI drive the traffic of your website?
SEO is a fantastic field to work in. There is always a new challenge to cope with and new things we can learn to keep our traffic steady and to find the right audience. Google’s core updates are the kind of events that shake the entire publishing and SEO industry, as they can have a tectonic impact on traffic and search rankings; yet the dynamics of these updates remain obscure and can only be decoded after several weeks, on a case-by-case basis.
If you own a blog or a website, you have probably heard of the debate between bite-sized content and longer content. The former used to be popular in the early days of SEO, but Google’s demands started changing that.
How much does the length of the content affect the rankings of a page? Is it really that important?
How to find a balance
For one, you need to understand that creating either long or short content is okay. As long as it provides interesting and engaging information, there is not much else to it.
Both long-form content and short content have their own place on your website, and you can find good uses for each. Longer content is often connected with pillar posts, case studies, research articles and the like, as well as ebooks or other additional content you offer on your site. Short content can be used as a quick blog update, an event update, or even a regular blog post, but not on a very extensive topic.
Quality vs length
“Both short and long content can be quality content, but this is often not the case with short ones. They are often keyword-stuffed and miss the point of the topic,” says Heath Aberdeen, marketing manager at Writemyx and Australia2write.
However, longer articles are usually the ones we all turn to when we need help: the extensive guides, the how-to articles, or the listicles. So, is length important? In a sense, yes.
“Longer content is better because going in depth on a topic gives the readers more information, valuable knowledge and useful tips. It often contains statistics, comments by industry experts, graphs and tables that illustrate the point and so on” says Amy Shaw, content manager at 1Day2Write and Britstudent.
While Google definitely doesn’t use length as a ranking factor, some things that inherently come with longer content are ranking factors.
Google and length
Google never specifically stated that length is something they consider. They look at other factors, like the quality of backlinks or the keywords in your article. Still, it’s well known that articles with more content and information get more backlinks from quality sources.
Therefore, longer content gets more backlinks, more shares on social media – and readers also spend a lot more time on a page with long content, something Google is especially fond of.
Considering that more information means more content, Google could indirectly rank longer content better. When you look at almost any search results page, you will notice that the first few results (or sometimes even the whole first page) are longer than 1,500 words.
Other studies have only confirmed the rule: longer content does rank better in search engines.
Appropriate content length
There is no single answer to this – some content is better left short, some deserves more space for discussion. However, the best advice marketers and writers can give you is to make your content as long as it needs to be. Strive for value rather than for length.
Google values content that users find useful. If you want to rank well, always focus just on providing that value and usefulness to your readers. While it’s statistically better for your content to be long, anywhere from 800 to 2500 words will do just fine.
Martha Jameson works as a content editor and proofreader. Before she became a writer at AcademicBrits.com and Originwritings.com she worked as a web designer and a manager. Her main goals are to motivate people to pursue their dreams.
We constantly work for content-rich websites where sometimes hundreds of new articles are published on a daily basis. Analyzing traffic trends on these large properties and creating actionable reports is still time-consuming and inefficient. This is also very true for businesses investing in content marketing that need to dissect their traffic and evaluate their marketing efforts against concrete business goals (i.e. increasing subscriptions, improving e-commerce sales and so on).
As a result of this experience, I am happy to share with you a Google Data Studio report that you can copy and personalize for your own needs.
Data is meant to help transform organizations by providing them with answers to pressing business questions and uncovering previously unseen trends. This is particularly true when your biggest asset is the content that you produce.
With the ongoing growth of digitized data and the explosion of web metrics, organizations usually face two challenges:
Finding what is truly relevant to unlock a new business opportunity.
Making it simpler for business users to prepare and share the data without being data scientists.
Semantic Web Analytics is about delivering on these promises: empowering business users and letting them uncover new insights from the analysis of their website’s traffic.
We are super lucky to have a community of fantastic clients that help us shape our product and keep pushing us ahead of the curve.
Before enabling this feature, both the team at Salzburgerland Tourismus and the team at TheNextWeb had already improved their Google Analytics tracking code to store entity data as events. This allowed us to experiment, ahead of time, with this functionality before making it available to all other subscribers.
What is Semantic Web Analytics?
Semantic Web Analytics is the use of named entities and linked vocabularies such as schema.org to analyze the traffic of a website.
The natural language processing that WordLift uses to mark up content with linked entities enables us to classify articles and pages in Google Analytics with real-world objects, events, situations, or even abstract concepts.
How to activate Semantic Web Analytics?
Starting with WordLift 3.20, entities annotated in webpages can also be sent to Google Analytics by enabling the feature in the WordLift’s Settings panel.
Here is how this feature can be enabled.
You can also define the dimensions in Google Analytics used to store entity data; this is particularly useful if you are already using custom dimensions.
As soon as the data starts flowing you will see a new category under Behaviour > Events in your Google Analytics.
Events in Google Analytics about named entities.
WordLift will trigger an event labeled with the title of the entity every time a page containing an annotation with that entity is opened.
Using these new events we can look at how content is consumed not only in terms of URLs and site categories but also in terms of entities. Moreover, we can investigate how articles are connected with entities and how entities are connected with articles.
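As a rough sketch of what such an entity event could look like on the wire, here is how a hit might be built with Google Analytics’ classic Measurement Protocol (v1). The event category and action names below are assumptions for illustration, not WordLift’s actual values:

```python
from urllib.parse import urlencode

def entity_event_payload(tracking_id: str, client_id: str, entity_title: str) -> str:
    """Build a classic Google Analytics Measurement Protocol (v1) event hit.

    The event category/action below are illustrative placeholders; only
    the event label carries the entity's title, as described above.
    """
    params = {
        "v": "1",                 # protocol version
        "tid": tracking_id,       # GA property ID, e.g. "UA-XXXXX-Y"
        "cid": client_id,         # anonymous client identifier
        "t": "event",             # hit type
        "ec": "Entity",           # event category (assumed name)
        "ea": "Annotation View",  # event action (assumed name)
        "el": entity_title,       # event label: the annotated entity's title
    }
    return urlencode(params)

payload = entity_event_payload("UA-XXXXX-Y", "555", "Google Discover")
print(payload)
# In a real integration this payload would be POSTed to
# https://www.google-analytics.com/collect
```

Once such hits flow in, the entity titles show up as event labels, which is what makes it possible to slice traffic by entity instead of by URL.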
Show me how this can impact my business
Making sense of data for a business user is about unlocking its power with interactive dashboards and beautiful reports. To inspire our clients, and once again with the help of online marketing ninjas like Martin Reichhart and Rainer Edlinger from Salzburgerland, we have built a dashboard using Google Data Studio – a free tool that helps you create comprehensive reports using data from different sources.
Using this dashboard we can immediately see, for each section of the website, which concepts drive the traffic, which articles are associated with these concepts, and where the traffic comes from.
An overview of the entities that drive the traffic on our website.
Entities associated with an article about structured data.
This helps publishers and business owners analyze the value behind a given topic. It can be precious for analyzing the behaviors and interests of a specific user group. For example, on travel websites, we can immediately see the most relevant topics for, say, Italian-speaking and German-speaking travelers.
WordLift’s clients in the news and media sector are also using this data to build new relationships with advertisers and affiliated businesses. They can finally bring to meetings the exact traffic volumes they have for, say, content that mentions a specific product or a category of products. This helps them calculate in advance how that traffic can be monetized.
Are you ready to make sense of your Google Analytics data? Contact us and let’s get started!
Here is the recipe for a Semantic Web Analytics dashboard in Google Data Studio
With unlimited, free reports, it’s time to start playing immediately with Data Studio and entity data and see if and how it meets your organization’s needs.
To help with that, you can use the report I have just created as a starting point. Create your own interactive report and share it with colleagues and partners (even if they don’t have direct access to your Google Analytics).
Simply take this report, make a copy, and replace with your own data!
1. Make a Copy of this file
Go to the File menu and click to make a copy of the report. If you have never used Data Studio before, click to accept the terms and conditions, and then redo this step.
2. Do Not Request Access
Click “Maybe Later” when Data Studio warns you that data sources are not attached. If you click “Resolve” by mistake, do not click to request access – instead, click “Done”.
3. Switch Edit Toggle On
Make sure the “Edit” toggle is switched on. Click the text link to view the current page settings. The GA Demo Account data will appear as an “Unknown” data source there.
4. Create A New Data Source
If you have not created any data sources yet, you’ll see only sample data under “Available Data Sources” – in that case, scroll down and click “Create New Data Source” to add your own GA data to the available list.
5. Select Your Google Analytics View
Choose the Google Analytics connector, and authorize access if you aren’t signed in to GA already. Then select your desired GA account, property, and the view from each column.
6. Connect to Your GA Data
Name your data source (at the top left), or let it default to the name of the GA view. Click the blue “Connect” button at the top right.
Are you ready to build your first Semantic Dashboard? Add me on LinkedIn and let’s get started!
We had the opportunity to interview Bill Slawski, Director of SEO Research at Go Fish Digital, Creator and Author of SEO by the Sea. Bill Slawski is among the most authoritative people in the SEO community, a hybrid between an academic researcher and a practitioner. He has been looking at how search engines work since 1996. With Andrea Volpini we took the chance to ask Bill a few questions to understand how SEO is evolving and why you should understand the current picture, to keep implementing a successful SEO strategy!
When did you start with SEO?
Bill Slawski: I started doing SEO in 1996. I also made my first site in 1996. The sister of one of the people I worked with on that site (she was selling computers for Digital Equipment Corp at the time) sent us an email saying, “Hey, we just started this new website. You guys might like it.” It was the time when AltaVista was a primary search engine. This was my first chance to see a search engine in action. My client said, “We need to be in this.” I tried to figure out how, and that was my first attempt at doing SEO!
After the launch of Google Discover, it seems that we live in a query-less world. How has SEO changed?
Bill Slawski: It has changed, but it hasn’t changed that much. I remember in 2007 giving a presentation in an SEO meetup on named entities. Things have been in the atmosphere. We just haven’t really brought them to the forefront and talked about them too much. Query-less searches example? You’re driving down the road 50 miles an hour, you wave your phone around in the air and it’s a signal to your phone asking you where you’re going. “Give me navigation, what’s ahead of us? What’s the traffic like? Are there detours?” And your phone can tell you that. It can say there’s a five-minute delay up ahead. You really don’t need a query for that.
What do you do then, if you don’t need a query?
Bill Slawski: Well, for the Google Now, for it to show you search suggestions, it needs to have some idea of what your search history is like, what you’re interested in. In Google Now, you can feed it information about your interests, but it can also look at what you’ve searched for in the past, what you look like you have an interest in. If you want to see certain information about a certain sports team or a movie or a TV series, you search for those things and it knows you have an interest in them.
Andrea Volpini: It’s a context that gets built around the user. In one analysis that we ran for one of our VIP customers, looking at the data from the Google Search Console, I found it extremely interesting that it had reached 42%! You can actually see that this big bump is due to the fact that Google started to account for this data. This fact might scare a lot of people in the SEO industry: if we live in a query-less world, how do you optimize for it?
Can we do SEO in a query-less world?
Bill Slawski: They (SEO practitioners) should be happy about it. They should be excited about it.
Andrea Volpini: I was super excited. When I saw it, for me, it was like a revelation, because I have always put a lot of effort into creating data and metadata. Before we arrived at structured data, it had always been a very important aspect of the websites that we build. I used to build CMSs, so I was really into creating data. But I underestimated the impact of content recommendation through Google Discover on the traffic of a new website. Did you expect something like this?
Bill Slawski: If you watch how Google is tracking trends and entity search, you can identify which things are entities by their having an entity type associated with them, something other than just a search term. So you search for a baseball team or a football team and you see “search term” as one category associated with it, and the other category might be “professional Chicago baseball team.” The professional Chicago baseball team is the entity. Google is tracking entities. What this means is that when they identify interests you may have, they may do that somewhat broadly, and they may show you, as a searcher, in Google Now and in Discover, things related to that. If you write about some things with some level of generalization that might fit some of the broader categories that match a lot, you’re gonna show up in some of those discovery things.
It’s like when Google used to show headers in search results, “Search news now” or “Top news now,” and identified your site, or something you wrote as a blog post, as something that fits the top news now category; you didn’t apply for that. You were a beneficiary of Google’s recommendation.
Andrea Volpini: Yes. When I saw this, I started to look a little bit at the data in the Google Search Console of this client, and then another client, and then another client again. What I found out by comparing these first sites is that Google tends not to create an overlap between Google Search and Discover, meaning that if a page is bringing traffic on Google Search, it might not be featured on Discover. The pages that are featured on Discover are not necessarily the ones that also rank high on Google Search. And I found extremely interesting the fact that pages that didn’t receive any organic traffic had been picked up by Google Discover, as if Google were trying to differentiate these channels.
Is this two-level search effect widening?
Bill Slawski: I think they’re trying to broaden, we might say, broaden our experience. Give us things that we’re not necessarily searching for, but that are related. There’s at least one AI program I’ve worked with where it looks at my Twitter stream and recommends stories for me based upon what I’ve been tweeting. I see Google taking a role like that: “These are some other things they might be interested in that they haven’t been searching for. Let me show them to them.”
There’s a brilliant Google contributor video about the Semantic Search Engine. The first few minutes, he starts off saying, “Okay, I had trouble deciding what to name this video. I thought about The Discover Search Engine. Then I thought about A Decision Search Engine and realized Bing had already taken that. A Smart Search Engine. Well, that’s obvious.”
But capturing what we’re interested in is something Google seems to be trying to do more of with the related “People also ask” questions. We’re seeing Google trying to keep us on search results pages, clicking through, question after question, seeing related things that we’re interested in. Probably tracking every click we make as to what we might have some interest in. With one-box results, the same type of thing. They’ll keep showing us one-box results if we keep clicking on them. If we stop clicking on them, they’ll change those.
Andrea Volpini: Where are we going with all of this? How do you see the role of SEO changing? What would you recommend to someone starting in SEO today? What should they become? You told us how you started in '96 with someone asking you to be on AltaVista, and I remember AltaVista quite well. I also worked with AltaVista myself, and we started to use AltaVista for intranet search.
What would you recommend to someone that starts SEO today?
Bill Slawski: I'm gonna go back to 2005, to a project I worked on then. It was for Baltimore.org, the visitors' center of Baltimore, the conference center. They wanted people to visit the city and see everything it had to offer. They were trying to rank well for terms like Baltimore bars and Baltimore sports. They got it in their heads that they wanted to rank well for Baltimore black history. We tried to optimize a page for Baltimore black history. We put the words "Baltimore Black History" on the page a few times. There were too many other good sites talking about Baltimore's black history. We were failing miserably to rank well for that phrase. I turned to a copywriter and I said, "There are great places in Baltimore to see that have something to do with this history. Let's write about those. Let's create a walking tour of the city. Let's show people the famous black churches and black colleges, the nine-foot-tall statue of Billie Holiday, the six townhomes that Frederick Douglass bought in his 60s.
“He was an escaped slave at one point in time, came back to Baltimore as he got older and a lot richer and started buying properties and became a businessman. Let’s show people those places. Let’s tell them how to get there.”
We created a page that was a walking tour of Baltimore. After three months, it was the sixth most visited page on that site, a site of about 300 pages or so. That was really good. That was successful. It got people to actually visit the city of Baltimore. They wanted to see those things.
Aaron Bradley ran this series of tweets the other day where one of the things he said was, “Don’t get worried about the switch in search engines to entities. Entities are all around us. They surround us. They’re everywhere. They’re everything you can write about. They’re web pages. They’re people. They’re places.”
It's true. If we don't switch from a search based on matching words in documents to words in queries, we're missing the opportunity to write about things, to identify the attributes and properties associated with those things, to tell people about what's in the world around us. And people are gonna search for those things. That's the movement the search engines are making: being able to understand that you're talking about something in particular and return information about that thing.
Andrea Volpini: The new SEO should basically become a contextual writer, someone who intercepts intents and can create good content around them.
Is there something else to the profession of SEO in 2020?
Bill Slawski: One of the things I read about recently was something called entity extraction: a search engine being able to read a page and identify all the things on that page that are being written about, and all the contexts that surround those things, all the classes. The example in the post I wrote was a baseball player, Bryce Harper. Bryce Harper was a Washington National. Bryce Harper hits home runs. That's the context. He's hit so many home runs over his career. Having a search engine able to take facts on a page, understand them, make a collection of those facts, and compare them to what's said on other pages about the same entities, so it can fact-check. It can do the fact-checking itself. It doesn't need some news organization to do that.
Andrea Volpini: Well, this is the reason why, when we started our project, my initial idea was to create a semantic editor to let people create linked data. I didn't look at SEO as a potential market, but then I realized that, immediately, all the interest was coming from, indeed, the SEO community. For instance, we created your entity on the WordLift website. This means that when we annotate content with our tool, we have this permanent linked data ID. In the beginning, I thought it was natural to have permanent linked data IDs, because this was the way the semantic web worked. But then I suddenly realized there is a very strong SEO effect in doing that, because Google is also crawling this RDF that I'm publishing.
I saw a few months back that Google actually uses a different class of IPs for crawling this data.
Do you think it still makes sense to publish your own linked data IDs, or is it okay to use other IDs? Do you see value in publishing data with your own systems?
Bill Slawski: It's something I haven't really thought about too much, but it's worth considering. I've seen people publishing those. I've tried to put one of those together, and I asked myself, "Why am I doing this? Is there gonna be value to it? Is it gonna be worthwhile?" But when I put together my homepage, a page about me, I wanted to try it, to see what it was capable of, to see what it might show in search engines. Some of it showed; some of it didn't. It was interesting to experiment with, to try and see what the rest of the world is catching onto when you create that stuff.
Andrea Volpini: But this is actually how the entity of Gennaro Cuofano was born in the Knowledge Graph. We started to add a lot of references, telling Google, "Here is Gennaro; he's also the author of these books." As soon as we injected this information into our knowledge graph and into the pages, it was easier for Google to collect the data, fact-check, and say, "Okay, this is the guy that wrote the book and now works for this company," and so on and so forth.
Gennaro Cuofano: And Google provided a Knowledge Panel with a complete description. It was something that, before, was not showing up in search, or at least it was just partial information. It felt like, by providing this kind of information, we allowed the search engine, Google in this case, to have better context and to fact-check the information, which gave authority to the information that I provided.
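As an illustration of the kind of markup being discussed, here is a minimal Person entity in JSON-LD, built in Python. The `@id`, profile URLs, and property values are hypothetical placeholders, not the actual markup published on the WordLift site.

```python
import json

# A sketch of Person markup with a permanent linked data ID and sameAs
# references, so a search engine can cross-check the entity elsewhere.
# All URLs below are illustrative placeholders.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://data.example.com/entity/gennaro-cuofano",  # hypothetical linked data ID
    "name": "Gennaro Cuofano",
    "sameAs": [
        "https://example.com/about",            # placeholder profile pages
        "https://www.linkedin.com/in/example/",
    ],
    "worksFor": {"@type": "Organization", "name": "WordLift"},
}

json_ld = json.dumps(person, indent=2)
```

Embedding this as a `<script type="application/ld+json">` block on the page is what gives the search engine the extra facts to corroborate.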
Bill Slawski: Have you looked at Microsoft’s Concept Graph?
Andrea Volpini: Yes! I found it even more advanced in a way. It's also very quick at getting the information in. We have a much easier experience when working with someone who wants to be in Bing, because as soon as we publish such data, Bing gets it into the panel.
Bill Slawski: It surprised me because, for a while, the stuff that Microsoft Research in Asia was doing was disappearing. They put together Probase and then it stopped. Nothing happened for a couple of years. It's been revived as the Microsoft Concept Graph, which is good to see. It's good to see they did something with all that work.
Gennaro Cuofano: Plus, we don't know how much integration there is between Bing and the LinkedIn APIs.
Andrea Volpini: It's pretty strong! Probably the quickest entry into Satori, the knowledge graph of Microsoft, is now for a person to be on LinkedIn, because it looks like they're using this information.
What other ways can we currently use structured data for SEO?
Bill Slawski: One of the things I would say to that is augmentation queries. I mentioned those in the presentation. Google will not only look at queries associated with pages about a particular person, place, or thing, but it will also look at query log information and at the structured data associated with the page, and it will run queries based upon those. It's doing some machine learning to try to understand what else might be interesting about your pages. If these augmentation queries, the test queries that it runs about your page, tend to do as well as the original queries for your page in terms of people selecting things, clicking on things, it might combine the augmentation query results with the original query results when it shows them to people.
One of the new additions in the latest version of Schema, 3.5, is the "knowsAbout" attribute. As I mentioned, with the knowsAbout attribute, you could be a plumber and declare that you know about drain repair. Someone searching for Los Angeles plumbers, expecting to see information just about plumbers, may see a result from a Los Angeles plumber that talks about drain repair. That may be exactly what they're looking for. That may expand search results, surfacing something relevant on your site that you've identified as an area of expertise, which I think is interesting. I like that structured data is capable of a result like that.
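To make this concrete, here is a minimal sketch of what knowsAbout markup could look like for the plumber example, serialized as JSON-LD from Python (the business details are invented):

```python
import json

# Hypothetical local business declaring an adjacent area of expertise
# via the knowsAbout property (introduced around Schema.org 3.5).
plumber = {
    "@context": "https://schema.org",
    "@type": "Plumber",
    "name": "Example Plumbing Co.",   # placeholder business name
    "areaServed": "Los Angeles",
    "knowsAbout": ["drain repair", "pipe relining"],
}

markup = json.dumps(plumber, indent=2)
```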
What is your favorite new addition to Schema 3.5?
Bill Slawski: FAQ page!
On Schema.org there’s such a wide range. They’re gonna update that every month now. But just having things like bed type is good.
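For reference, a minimal sketch of the FAQPage markup mentioned above, with placeholder questions and answers: each question is a Question entity whose acceptedAnswer is an Answer entity.

```python
import json

# Skeleton of an FAQPage; question and answer text are placeholders.
faq_page = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Do you offer free shipping?",
            "acceptedAnswer": {"@type": "Answer", "text": "Yes, on orders over $50."},
        },
        {
            "@type": "Question",
            "name": "What is your return policy?",
            "acceptedAnswer": {"@type": "Answer", "text": "Returns are accepted within 30 days."},
        },
    ],
}

markup = json.dumps(faq_page, indent=2)
```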
When I add structured data to my pages, what do you think is the right balance between over-complicated data structuring and simplicity?
Bill Slawski: I did SEO for a site a few years ago that was an apartment complex. It was having trouble renting units. It was a four-page site, and it showed off its dog park really well. It didn't show off things like the fact that if you took the elevator to the basement, you got let out at the DC Metro, where you could travel all throughout Washington DC, northern Virginia, and southern Maryland and visit all 31 Smithsonian museums and a lot of other things that are underground, underneath that part of Virginia. It was right next to what's called Pentagon City, which is the largest shopping mall in Virginia. It's four stories tall, all underground. You can't see it from the street. Adding structured data to your page to identify those things is something you can do. It's probably something you should include on the page itself.
Maybe you want to include more information on your pages about entities, and include it in structured data too, in a way that is really precise. You're using the vocabulary defined in Schema, which subject matter experts have identified as describing the things people might want to know. It defines it well. It defines it easily.
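A sketch of how the apartment-complex story might translate into markup: surfacing the location selling points (Metro access, not just the dog park) as typed amenity features rather than leaving them implicit. All names and details below are hypothetical.

```python
import json

# Hypothetical ApartmentComplex markup that exposes location selling
# points as LocationFeatureSpecification entries.
complex_markup = {
    "@context": "https://schema.org",
    "@type": "ApartmentComplex",
    "name": "Example Apartments",   # placeholder
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Arlington",
        "addressRegion": "VA",
    },
    "amenityFeature": [
        {"@type": "LocationFeatureSpecification",
         "name": "Direct elevator access to DC Metro", "value": True},
        {"@type": "LocationFeatureSpecification",
         "name": "Dog park", "value": True},
    ],
}

markup = json.dumps(complex_markup, indent=2)
```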
What you're saying is: do with your data what you do with your content. If you put emphasis on an aspect content-wise, then you should also do the proper markup for it?
Bill Slawski: Right! With the apartment complex I was talking about, location sells. It gets people to decide, "This is where I want to live." Tell them about the area around them. Put that on your page and put that in your data. Don't show pictures of the dog park if you want to tell them what the area schools are like, what the community's like, what businesses are around, what opportunities there are. You can go to the basement of this apartment complex and ride to the local baseball stadium or the local football stadium. You're blocks away. DC traffic is a nightmare. If you ride the Metro everywhere, you're much better off…
Andrea Volpini: That's big. In real estate we also say that a metro station close by always increases the value of a property by 30%. It's definitely relevant. Something that is relevant for the business should also be taken into consideration when structuring the page.
Is it also worth exploring Schema that is not yet officially used by Google?
Bill Slawski: You can anticipate things that never happen. That's possible. But sometimes anticipating things correctly can be a competitive advantage if it comes to fruition. You mentioned real estate. Have you seen things like walkability scores being used on realty sites? The idea that somebody can give you a metric that lets you easily compare one location to another based on what you can do without a car is a nice feature. Being able to find out data about a location could be really useful.
Andrea Volpini: This is why, getting back to the linked data ID, having a linked data ID for the articles and the entities that describe the article becomes relevant: you can then query the data yourself. You can analyze, say, which neighborhood gets the least amount of traffic and ask, "Okay, did I write about this neighborhood or not?" One of the experiments we run these days is bringing the entity data from the page into Google Analytics, to help the editorial team think about what traffic entities are generating across multiple pages. Entities can also be used internally for organizing things and for saying, "In this neighborhood, for instance, we have the least amount of crime," or things like that. You can start cross-checking data yourself, not only waiting for Google to use it.
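The internal entity analysis Andrea describes can be sketched in a few lines: join per-page entity annotations with per-page traffic (both hypothetical here) to see which entities drive traffic across multiple pages.

```python
from collections import defaultdict

# Hypothetical entity annotations and pageview counts per URL.
page_entities = {
    "/article-1": ["Downtown", "Crime Rates"],
    "/article-2": ["Downtown", "Metro Access"],
    "/article-3": ["Riverside"],
}
page_views = {"/article-1": 500, "/article-2": 300, "/article-3": 50}

# Aggregate views by entity: an entity mentioned on several pages
# accumulates the traffic of all of them.
entity_traffic = defaultdict(int)
for page, entities in page_entities.items():
    for entity in entities:
        entity_traffic[entity] += page_views[page]
```

Sorting `entity_traffic` then shows which topics (and, say, which neighborhoods) the site has under-covered relative to the traffic they generate.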
Is there any other aspect worth mentioning about how to use structured data for SEO?
Bill Slawski: Mike Blumenthal wrote an article based upon something I wrote about, the thing about entity extraction. He said, "Hotels are entities, and if you put information about hotels, about bookings, about locations, about amenities onto your pages so that people can find them, so people can identify those things, you're making their experience searching for things richer and more …"
Andrea Volpini: We had a case where we did exactly this for a lodging business. We saw that as soon as we started to add amenities as structured data and, most importantly, as soon as we started to add geographic references to the places these locations were in, we saw an increase, and not just in pure traffic terms. The traffic went up, but we also saw an interesting phenomenon of queries becoming broader. Before adding structured data to the hotels and the lodging business, the site received traffic from very few keywords. As soon as we started to add the structured data, typing the amenities and services, and also adding the Schema action for booking, we saw Google bringing a lot more traffic on long-tail keywords for many different locations where this business had hotels but was not yet visible on Google.
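A minimal sketch of the kind of lodging markup described: typed amenities plus explicit geographic coordinates, which is what lets the pages match broader location-based queries. The hotel name and coordinates are invented.

```python
import json

# Hypothetical Hotel markup with typed amenities and explicit geo data.
hotel = {
    "@context": "https://schema.org",
    "@type": "Hotel",
    "name": "Example Seaside Hotel",   # placeholder
    "geo": {"@type": "GeoCoordinates", "latitude": 40.6, "longitude": 14.3},
    "amenityFeature": [
        {"@type": "LocationFeatureSpecification", "name": "Free Wi-Fi", "value": True},
        {"@type": "LocationFeatureSpecification", "name": "Swimming pool", "value": True},
    ],
}

markup = json.dumps(hotel, indent=2)
```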
Bill Slawski: It wasn't just matching names of locations on your pages to names of locations in queries; it was Google understanding where you were located…
What do you think Schema Actions are useful for?
Bill Slawski: There was a patent that came out a couple of years ago where Google said, "You can circle an entity on a mobile device and you can register actions associated with those entities." Somebody got the idea right and the concept wrong. They were thinking about touchscreens instead of voice. They never really rewrote it so that it was voice-activated, so you could register actions with spoken queries instead of these touch queries. But I like the idea. Alexa has its skills; being able to register actions with your entities is not too different from what existed in Google before. Think about how you would optimize a local search page: you would make sure your address was in a postal format so that it was more likely to be found and used. Of course, if you wanted people to drive to a location, you'd want to give them driving directions, and that's something you can register an action for now, but it's already in there. It feels like you're helping Google implement things that it should be implementing anyway, or is likely to.
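The booking action mentioned earlier can be sketched as a potentialAction attached to the hotel entity: an entry point that tells machines how a reservation can be started (the URL template below is hypothetical).

```python
import json

# Hypothetical Hotel entity registering a booking action via
# potentialAction / ReserveAction with an EntryPoint target.
hotel_with_action = {
    "@context": "https://schema.org",
    "@type": "Hotel",
    "name": "Example Seaside Hotel",   # placeholder
    "potentialAction": {
        "@type": "ReserveAction",
        "target": {
            "@type": "EntryPoint",
            "urlTemplate": "https://example.com/book?room={room_id}",  # placeholder endpoint
            "actionPlatform": "https://schema.org/DesktopWebPlatform",
        },
    },
}

markup = json.dumps(hotel_with_action, indent=2)
```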
Andrea Volpini: Of course. I think that’s a very beautiful point, that we’re doing something that we should do. We’re now doing it for Google, but that’s the way it should be done. I like it. I like it a lot.
How much do you think structured data’s gonna help for voice search?
Bill Slawski: I can see Schema not being strictly necessary because of other things going on, like entity extraction, where Google is trying to identify entities on its own. But Google tends to do things in a redundant way. They tend to have two different channels to get the same thing done. If one gets something correct and the other fails to, they still have it covered. I think Schema gives them that chance. It gives site owners a chance to include things that Google might have missed. And Google has an organization like Schema.org behind it, which isn't the search engine; it's a bunch of volunteers who are subject matter experts in a lot of areas, or play those on TV. Some are really good at that. Some of them miss some things. If you are a member of the Schema community mailing list, there are conversations that take place where people call each other out on things, like, "Wouldn't you do this for this? Wouldn't you do that? Why aren't you doing this?" It's interesting to read those conversations.
Andrea Volpini: Absolutely. I always enjoy the Schema mailing list because, as you said, you get different perspectives from different subject matter experts who, of course, need to declare what their content is about. I see Schema as a sitemap for data. Even though Google can crawl the information, it always values the fact that there is someone behind it curating the data, who might add something Google has missed, as you say, but who also gives Google a chance to cross-check and say, "Okay, is this true or not?"
Bill Slawski: You want a scalable web. It does make sense to have editors curating what gets listed, but that potentially is an issue for Wikipedia at some point in the future. There's only so much human-edited knowledge it can handle. When some event changes the world overnight and some facts about important things change, you don't want human editors trying to catch up as quickly as they can to get it correct. You want some automated way of having that information updated. Will we see that? We have organizations like DeepMind mining sites like the Daily Mail and CNN. They chose those not necessarily because they're the best sources of news, but because they're structured in a way that makes the information easy to extract.
What should SEOs be looking at as of now? What do they need to be very careful about?
Bill Slawski: It would be not to be intimidated by the search engine grabbing content from web pages and publishing it in knowledge panels. Look for the opportunities when they're there. Google is a business, and as a business, they base what they do on advertising. But they're not trying to steal your business. They may take advantage of business models that maybe need to be a little more sophisticated than "How tall is Abraham Lincoln?" You could probably build something a little bit more robust than that as a business model. But if Google is taking your business model from you in what they publish in knowledge panels, you should work around it and not be intimidated by it. Consider how much of an opportunity it potentially is to have a channel where you're being focused upon, located easily, by people who might value your services.