In this article, I will share some of the ways natural language processing and the combination of semantic web technologies and machine-learning can help you outsmart your competitors and gain a true SEO advantage.
We hear a lot about AI these days and what it can do to help businesses, social networks and large organizations improve their competitiveness. In this article, I will focus on how AI-powered SEO can help publishers increase the engagement of their readership and boost the findability of their content.
How are search engines using Natural Language Processing?
Search engines are becoming capable of better understanding the intents of the searchers thanks to the continuous advancements of their linguistic AI capabilities.
Natural language processing, the research field that focuses on transforming natural language into machine-computable information, is playing a central role in the way commercial search engines like Google and Bing, as well as personal digital assistants, process our requests, index websites and find relevant content on the Web. Its techniques range from detecting synonyms to disambiguating previously unseen queries with the help of named entity recognition, part-of-speech tagging, named entity disambiguation and sentiment analysis.
While in the past search engines like Google worked with statistical models built around keywords and links, we’re now seeing semantic graphs and machine learning algorithms deeply influencing the quality of the results as well as the way these results are presented to the end-user: from voice responses to featured snippets, and from interactive widgets like the news carousel to zero-result SERPs.
Try asking the Google Assistant “What is Semantic SEO?” and you will see the implication of an AI-first ecosystem where machines are trained with semantically-rich data to be able to answer using natural language rather than with a set of blue links.
Now that we’ve looked briefly at the evolution of search engines let’s move our attention to the other side of the coin: the content being published.
How can NLP help improve SEO?
NLP and semantic annotations help machines understand content. Adding semantic processing to a publishing workflow means using natural language processing to add a layer of semantically structured information that describes your content.
There are a number of ways that NLP is used today to improve SEO and user engagement. I will walk you through a few use cases and some leading examples of advanced SEO strategies.
1. Structured Data Markup Automation
NLP can be used to classify your content and to publish structured data markup that describes it. This helps search engines index your website more effectively.
Here at WordLift, we call this method structured data markup automation. You can use tools like the Redlink Semantic Platform, Alchemy from IBM or the APIs of Bing to extract entities. These entities, and their unique identifiers, can be used to describe your content to search engines. Yes, this is exactly what WordLift does, and you can see it in action on websites with millions of visitors such as thenextweb.com, windowsreport.com as well as established publishers like Reuters or the BBC.
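As a rough illustration of the idea, the sketch below turns a list of extracted entities into schema.org markup serialized as JSON-LD. It is not WordLift's implementation: the function name `build_jsonld`, the (name, URI) pair format and the example URL are assumptions made for this example, and the entity list stands in for whatever an entity extraction API would return.

```python
import json

def build_jsonld(headline, url, entities):
    """Return a schema.org Article description as a JSON-LD string.

    `entities` is a list of (name, uri) pairs; this pair format is an
    assumption for the sketch, not the output format of a specific API.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        # Each entity becomes a Thing linked to its public identifier.
        "about": [
            {"@type": "Thing", "name": name, "sameAs": uri}
            for name, uri in entities
        ],
    }
    return json.dumps(data, indent=2)

# Hypothetical extractor output for an article about SEO.
entities = [
    ("Search engine optimization",
     "http://dbpedia.org/resource/Search_engine_optimization"),
    ("Natural language processing",
     "http://dbpedia.org/resource/Natural_language_processing"),
]
markup = build_jsonld("How NLP helps SEO", "https://example.com/nlp-seo", entities)
print(markup)
```

The resulting string would typically be embedded in the page inside a script element of type application/ld+json.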
Is structured data really helping website traffic grow?
Just recently Google shared three business cases to promote the usage of structured data and to educate webmasters in improving the quality of their schema.org markup. Results can be astonishing. See Eventbrite’s case study: their website experienced a 100% growth in organic traffic from Google Search to event listing pages.
“Within two or three weeks we started seeing a visual difference in our event search results on Google,” Allen Jilo, an Eventbrite product manager, says. “The Google Search experience definitely helps drive more eyeballs to event pages. And when those people convert, it translates to incremental ticket sales for our event creators.”
2. Internal link building and content discovery
In-links help users discover content from your website and they also help search engines evaluate what your content is about and how effective the user experience can be for a user that arrives for the first time on a particular webpage. A strong logical internal linking structure helps your SEO significantly.
Wanna learn more? I've written a guide on how to conduct an effective internal linking strategy. Have a look! (Boris Demaria)
With NLP and entity extraction algorithms, you can see what concepts can be detected by a machine. These algorithms are trained using machine learning techniques on large semantic databases extracted from Wikipedia or other openly available corpora of text. By looking at the list of extracted entities you, as a writer, might decide that your article deserves a contextual background and an introduction to some of the concepts that the NLP detected; this will help the reader (as well as search crawlers) understand “things” that they might not otherwise understand in-depth.
Just to give you an idea, 10% of the searches that we make daily are meant to help us better understand things that we don’t know well. These searches are usually directed to Wikipedia or can be answered directly by the search engine with its knowledge graph panels. With NLP you can provide this information to readers immediately, without having them jump somewhere else.
We see a lot of these examples where NLP is used to create in-links that matter to the reader. Look for example at how The Guardian is using it in its articles to connect articles around “Russia” and “Vladimir Putin”.
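One simple way to turn extracted entities into internal link suggestions is to index articles by entity and rank the other articles by how many entities they share. The sketch below assumes the entities have already been extracted; the URLs, entity sets and function names are hypothetical, and this is not a description of how The Guardian builds its links.

```python
from collections import defaultdict

def build_entity_index(articles):
    """Map each entity to the set of article URLs that mention it."""
    index = defaultdict(set)
    for url, entities in articles.items():
        for entity in entities:
            index[entity].add(url)
    return index

def suggest_links(url, articles, index):
    """Rank other articles by the number of entities shared with `url`."""
    shared = defaultdict(set)
    for entity in articles[url]:
        for other in index[entity]:
            if other != url:
                shared[other].add(entity)
    # Most shared entities first; ties are broken alphabetically by URL.
    return sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))

# Hypothetical articles with the entities an NLP service detected in them.
articles = {
    "/russia-election": {"Russia", "Vladimir Putin", "Election"},
    "/putin-profile": {"Vladimir Putin", "Russia"},
    "/eu-summit": {"European Union", "Russia"},
    "/seo-guide": {"Search engine optimization"},
}
index = build_entity_index(articles)
suggestions = suggest_links("/russia-election", articles, index)
for target, entities in suggestions:
    print(target, sorted(entities))
```

The shared entity names can double as anchor text candidates for the suggested links.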
3. Content recommendation
When content is annotated using natural language processing, the metadata is stored in a machine-readable format like JSON-LD, Microdata or RDF. Machine learning is good at classifying information and predicting, for instance, what the user would like to read next.
Content recommendations greatly improve what SEOs call dwell time, the time that users spend on a website between the click on a search result and the return to the SERP.
The better the recommendations, the longer readers remain engaged with the content.
Adding a semantic layer of metadata to the content greatly improves the machine learning models that we can build to help the user jump from one article to another.
Great work in this area has been done by PoolParty, and here you can find an interesting presentation on their Semantic Classifier and how it can help you create content recommendations that combine semantic enrichments produced by NLP with neural networks. The intersection between NLP, Semantic Graphs, and Machine Learning is also referred to as Semantic AI.
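To make the role of the semantic layer concrete, here is a minimal sketch of a recommender that compares articles by the entities they are annotated with. It uses plain Jaccard similarity rather than a neural network, so it is a much simpler stand-in for the kind of classifier mentioned above; the catalog, URLs and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a ∩ b| / |a ∪ b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(current, catalog, top_n=2):
    """Return up to `top_n` articles whose entity sets best match `current`."""
    scores = [
        (jaccard(catalog[current], entities), url)
        for url, entities in catalog.items()
        if url != current
    ]
    # Highest similarity first; ties are broken alphabetically by URL.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    # Articles with no entity in common are never recommended.
    return [url for score, url in scores[:top_n] if score > 0]

# Hypothetical catalog: each article mapped to its annotated entities.
catalog = {
    "/what-is-json-ld": {"JSON-LD", "Schema.org", "Structured data"},
    "/schema-markup-guide": {"Schema.org", "Structured data", "Google Search"},
    "/rdf-basics": {"RDF", "JSON-LD"},
    "/holiday-recipes": {"Cooking"},
}
recommendations = recommend("/what-is-json-ld", catalog)
print(recommendations)
```

A production system would typically combine such entity overlap with behavioural signals, but the entity sets alone already exclude the unrelated article.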
4. Smart redirections and 404s handling
This is a fairly narrow and yet very powerful mechanism that allows a website, like Quora, that is built around topics to route the user to the right topic by intercepting all the alternative names a concept might have.
You can see it in action by directing your browser to a topic page like this:
You will notice that your browser automatically redirects the request to the topic page for Search Engine Optimization located at the URL:
The web server of Quora has been configured to understand that “SEO” is equivalent to “Search Engine Optimization” and this is done by de-referencing the entity Search Engine Optimization in public knowledge graphs like DBpedia where all the synonyms for a given concept are described.
In other words, for every topic page, by de-referencing each concept with the equivalent entity in large linguistic graphs, Quora is able to configure multiple 301 redirects to intercept requests without having to worry about how each user refers to a specific concept. Yes, this configuration can also be easily implemented via WordLift: when entities are detected by the NLP, WordLift de-references them using Wikidata, DBpedia, Yago and other large semantic graphs.
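The mechanism can be sketched as a lookup table that maps every alias of a topic to its canonical page. In the example below the alias list is hard-coded, where a real setup would obtain it by de-referencing the entity in a knowledge graph; the `/topic/` path prefix, the slug format and the function names are assumptions for illustration and do not reflect Quora's actual URL scheme or server configuration.

```python
import re

def slugify(label):
    """Turn a label such as 'Search Engine Optimization' into a URL slug."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")

def build_redirects(topics):
    """Map every alias path to the canonical topic path (for 301 redirects)."""
    redirects = {}
    for canonical, aliases in topics.items():
        target = "/topic/" + slugify(canonical)  # hypothetical URL scheme
        for alias in aliases:
            source = "/topic/" + slugify(alias)
            if source != target:
                redirects[source] = target
    return redirects

def resolve(path, redirects):
    """Return (status, location): 301 for a known alias, otherwise 200."""
    if path in redirects:
        return 301, redirects[path]
    return 200, path

# Hypothetical alias list; in practice it would come from a knowledge graph.
topics = {"Search Engine Optimization": ["SEO", "Search optimization"]}
redirects = build_redirects(topics)
print(resolve("/topic/seo", redirects))
```

The same table can also back a 404 handler: when an unknown path matches an alias, the server answers with a redirect instead of an error page.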
5. Topic targeting
In recent years the attention has moved, at least for some SEO experts, from targeting keywords to targeting topic clusters. As search engines become more capable of understanding the world around us and disambiguation comes into play, the same results can be presented to the user across multiple searches that share the same intent. The competition is no longer about targeting a specific keyword but rather about being relevant for a specific topic.
Relevancy, practically speaking, is achieved by expanding a topic in all the directions that might be of interest to our user.
With linguistic AI and word-vectors, we can start exploring a concept to see how it is semantically related to other concepts. This can guide us in building the proper context around it – there is a very interesting experiment we did to optimize Title tag SEO using TensorFlow that you should read to learn more about this technique.
If you want to start playing with word vectors right away, I also suggest you spend some time with Google Semantris. You will see what machine learning can do when applied to semantics.
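For readers who have not worked with word vectors before, the sketch below shows the basic operation behind this kind of exploration: ranking terms by the cosine similarity of their vectors. The three-dimensional vectors are made-up toy values chosen for illustration; real embeddings are learned from large text corpora and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(term, vectors, top_n=2):
    """Return the `top_n` terms whose vectors are closest to `term`."""
    scored = [
        (cosine(vectors[term], vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]

# Toy vectors invented for this example, not real embeddings.
vectors = {
    "seo": [0.9, 0.8, 0.1],
    "backlinks": [0.8, 0.9, 0.2],
    "keywords": [0.9, 0.6, 0.1],
    "baking": [0.1, 0.0, 0.9],
}
related = nearest("seo", vectors)
print(related)
```

Terms that land close to your target concept are candidates for subtopics worth covering when you build a topic cluster.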
6. SERP Analysis with NLP
When you start analyzing multiple keywords and how they behave over time, you basically look at the top 10 or 20 results for each keyword and how Google ranks the content behind each website. As the number of keywords to track increases, it becomes extremely complex to understand the trends behind all these web pages.
Back in 2013, as we were doing agency work for a Fortune 500 company, we started to use natural language processing across SERPs to get an immediate overview of what “entities” were driving these rankings and how the content was evolving as Google was updating its results on the target keywords.
I was very pleased to find a great presentation by Stephan Solomonidis that describes exactly this same process.
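A simplified version of this kind of analysis can be sketched as follows: count how many results in a SERP snapshot mention each entity, then compare two snapshots to see which entities gained or lost coverage. The entity lists are hypothetical and stand in for the output of an NLP service run on each ranking page; this is an illustration of the idea, not the process we used in the agency work or the one in the presentation.

```python
from collections import Counter

def entity_counts(serp):
    """Count in how many results of one SERP snapshot each entity appears."""
    counts = Counter()
    for entities in serp:
        counts.update(set(entities))  # count each entity once per result
    return counts

def entity_shift(before, after):
    """Return entities whose result coverage changed between two snapshots."""
    old, new = entity_counts(before), entity_counts(after)
    delta = {entity: new[entity] - old[entity] for entity in set(old) | set(new)}
    return {entity: change for entity, change in delta.items() if change != 0}

# Hypothetical entities detected in the top three results at two dates.
serp_before = [["Schema.org", "Google"], ["Google", "Backlink"], ["Backlink"]]
serp_after = [["Schema.org", "Google"], ["Schema.org", "Knowledge Graph"], ["Google"]]
shift = entity_shift(serp_before, serp_after)
print(shift)
```

Positive values mark entities that appear in more of the ranking pages than before, negative values mark entities that are losing ground.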
Conclusions and takeaways
Advanced SEO strategies powered by Semantic AI
NLP and entity extraction, as well as Semantic AI (the use of knowledge graphs and machine learning), are heavily used today by large online properties like TheGuardian or TNW, as well as social networks like Quora. With tools like WordLift, these technologies can be immediately used on personal blogs, e-commerce websites and mid-sized content magazines to improve SEO and to boost user engagement.
APIs provided by Microsoft, IBM, Google or open source technology providers like Redlink can help SEOs read content more quickly and effectively, with the help of an AI that can scan pages at the speed of light.
- Use NLP and large public graphs like Wikidata and DBpedia to improve the structured data markup on your pages
- Create relevant links and describe the topic that is relevant to your target audience
- Exploit semantically-rich metadata to improve the quality of the content recommendation on your site
- Configure smart redirections and 301s by de-referencing entities and expanding the synonyms of a given topic (so that users can always find the page they want on your website)
- Play with word vectors to find inspiration for concepts that you might want to cover in order to become relevant for a specific topic
- Analyse your competitors using NLP to quickly track which concepts are driving the changes in the SERPs of Google