Everywhere we look, Artificial Intelligence is changing the rules of the game: from personal assistants that are changing the way we interact with machines, to self-driving cars, to diagnostic systems that can detect certain diseases more accurately than doctors. That isn't only due to media buzz (though that contributes). It is because AI is a vast area that touches several disciplines.
In this article, I want to show you how a field of Artificial Intelligence called Natural Language Processing (NLP) helped Quora become one of the most popular Q&A sites in the world. In fact, NLP has become so critical to Quora that the company keeps open vacancies for NLP engineers.
As WordLift was born as a university project out of Natural Language Processing research, we always look for best practices to see how the industry is evolving and helping the web become smarter.
Natural Language Processing applications
The main aim of NLP is to help computer programs process large amounts of natural language data by making sense of it. On a platform like Quora, with hundreds of millions of users, keeping the quality of its content high is critical.
Hundreds of millions of people use Quora to discover high-quality answers to questions important to them. The quality of our content and the civility of our community are two important factors that make Quora special. We want to maintain that quality even as billions of people start using Quora.
The most effective way to keep high-quality standards while growing the user base is to process that data in a way that makes it more valuable to users. In fact, as explained further:
Such a rich dataset puts us in a unique position to use various Natural Language Processing (NLP) techniques to solve exciting problems critical to our success.
How’s Quora applying NLP? Here are 13 interesting ways.
Quality is critical for any platform to survive. For a Q&A platform like Quora, this is even more important. Since users contribute all of Quora's content, how does the platform keep its quality high? First, we must define quality. Quora looks at things like writing style, readability, completeness, and trustworthiness.
This is a ranking problem. Quora has to look at many variables and score each answer on relevance and helpfulness so that the most helpful answers show at the top.
According to Quora, a relevant and helpful answer has five properties:
Answers the question that was asked.
Provides knowledge that is reusable by anyone interested in the question.
Answers that are supported by rationale.
Demonstrates credibility and is factually correct.
Is clear and easy to read.
Quora also gives an example of how it uses Natural Language Processing to extract relevant data to assess and rank answers:
As you can see, Quora looks at various things; most of them are in the form of text. However, that text needs to be converted into data. This is where NLP helps: it turns text into machine-readable structured data that its algorithms can easily process.
This gives Quora the opportunity to be more sophisticated in creating ranking systems by considering things such as author credibility, formatting, upvotes, and many other variables.
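To make the idea of turning text into machine-readable data concrete, here is a minimal Python sketch of how answer text might be converted into numeric features a ranker could consume. The feature names and the features themselves are purely illustrative assumptions, not Quora's actual system:

```python
import re

def answer_features(text: str) -> dict:
    """Turn raw answer text into simple numeric signals a ranker could use.
    These features (length, sentence count, readability proxy) are
    illustrative only -- not Quora's actual feature set."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        # crude readability proxy: average words per sentence
        "avg_sentence_length": len(words) / max(len(sentences), 1),
    }

features = answer_features("NLP turns text into data. Rankers can then score it.")
```

A real system would combine hundreds of such signals (plus non-textual ones like upvotes and author credibility) inside a learned ranking model; the point here is only that free text must first become numbers.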
The flip side of answers' quality is questions' quality. If you know Quora, chances are you found it through Google search. In fact, if we look at its marketing mix, more than 80% of its traffic comes from search:
That’s because Quora is well positioned on so-called long-tail keywords, which allow it to take over the SERP. Of course, this is also thanks to the fact that Quora can provide quality content for those queries. Yet if Quora didn’t use a process driven by AI and machine learning, it would have been impossible to leverage such a mass of natural language data.
That is why Quora uses the same ranking system we saw above to assess the relevance of questions as well.
When you type something into Quora’s search box, that box serves several functions:
In fact, this isn’t only a Q&A tool that allows anyone to ask something, but also a way to search for anything on the platform. You might think that the retrieval of information from that search box is mainly based on keyword matching. However, that is not the case, as specified by Quora’s engineering team:
We use NLP techniques in this information retrieval problem space to help us better understand user queries and questions, as well as better rank content in the form of questions, answers, topics and user biographies. Unlike regular search engines with simple keyword matching, we can also support searches done with longer queries that are in the form of questions well.
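To see why question-form queries defeat simple keyword matching, here is a toy Python sketch of a token-overlap score. It is purely illustrative (not Quora's retrieval system): a question-form query rarely matches a document exactly, but it can still share enough tokens to be retrieved:

```python
def overlap_score(query: str, doc: str) -> float:
    """Toy retrieval score: the fraction of query tokens that appear in
    the document. Real search systems use far richer NLP signals; this
    only shows why longer, question-form queries need more than
    exact keyword matching."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    return len(query_tokens & doc_tokens) / len(query_tokens)

# The question shares only some tokens with the answer text, so an
# exact-phrase match would fail while token overlap still scores it.
score = overlap_score("how do rockets work", "rockets work by expelling mass")
```

Real engines go further, using query understanding and semantic ranking rather than surface token overlap, but the contrast with exact matching is the same.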
Other NLP applications that make Quora smarter
When you transform text into structured data, knowledge that before was only accessible to humans suddenly becomes accessible to machines. Natural Language Processing makes that transition possible: it translates human text into machine-readable data that can be fed to a system to make it more relevant for its users.
In this article, we saw how Quora uses NLP in three key areas. However, that is just the beginning. There are other areas in which NLP is crucial for Quora’s success:
Automatic Grammar Correction
Duplicate Question Detection
Related Question Generation
Topic Biography Quality
Automatic Answer Wikis
Hate Speech/Harassment Detection
Question Edit Quality
At WordLift we also use NLP to automate an important part of the digital marketing strategy: SEO. This article aimed to show you, beyond the buzz and hype created by the media, the practical ways in which AI is helping startups build smarter systems that become more useful to their users.
If you want to try NLP on your website, book a demo and let’s talk about your project.
Schema markup is significant for anyone who wants to contribute to today’s web.
Many think of structured data as the future. Yet it is the present. Many big players understood the importance of structured data a few years ago already. Now small players have the chance to take advantage of those technologies to win the SEO game.
When you go to schema.org, you’ll see the initiative was launched and founded by Google, Microsoft, Yahoo, and Yandex. It is no accident that those big players contributed to the foundation of the Semantic Web. It was the most logical step in the evolution of search engines.
Facebook, Google, and Amazon spent the last years building their knowledge graphs. You might think that is irrelevant for a small player. Yet it’s not. In a world where big players control the web, it is crucial to guarantee that each of us can contribute to the development of the internet. Structured data and Knowledge graphs are the tools that allow anyone to build today’s web.
Now those technologies are available to anyone with tools like WordLift.
It doesn’t matter how well-written your content is. If machines can’t read it, humans can’t find it. If people can’t find it, your content will get buried in organic search. Thus, no one will know you ever wrote it.
Using Schema markup is like translating your content for search engines, so they can find it more easily. Through Schema markup, search engines can also better read and interpret the user’s question to locate the content he or she is looking for. That is how your content goes from nada to armada in a few clicks.
In this way, rather than having your piece of content buried in the organic search, you’ll make it easier to find and more prone to be suggested to a user.
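As a concrete illustration of what that “translation” looks like, here is a small Python sketch that builds minimal schema.org Article markup as JSON-LD. The headline, author name, and topic below are placeholder assumptions; tools like WordLift generate this markup for you automatically:

```python
import json

# Minimal schema.org Article description as a Python dict.
# All field values here are placeholders for illustration.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Why Schema Markup Matters",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "about": [{"@type": "Thing", "name": "Structured Data"}],
}

# Serialized as JSON-LD, this would be embedded in the page inside a
# <script type="application/ld+json"> tag for crawlers to read.
markup = json.dumps(article, indent=2)
```

Once the markup is in the page, a crawler no longer has to guess what the content is about: the type, headline, author, and topics are stated explicitly.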
2. Who finds the right related article finds a treasure
Slow content doesn’t ride news waves, but it can ride the longer, smoother industry waves to create stellar, valuable content that is quoted and cited for years to come.
One huge issue that any website or online magazine faces is the lack of essential interconnectedness between one piece of content and another. Often you land on a cool website and read an interesting article, only to find at the end of it a “related articles” widget. That widget too often points toward a plethora of junk items (from how to get tanned, to which razor to use when shaving!). In exchange for dismal ad earnings, many websites and magazines lose the opportunity to build the quality traffic that would make them successful in the long run.
With Schema markup you can connect each piece of content to the next semantically, meaningfully, and consciously, making it easier for the user to navigate.
3. Let me hang out with your website
If you want your prospects and customers to think of your brand as exceptional, you have to deliver exceptional experiences with content. Every time. At every touch point. Period.
We often think of a website as a static place. The user clicks in, scrolls a bit, then clicks out. Our perspective is quickly changing. With the rise of voice search, websites become interactive. More and more people will want to hang out with a website. Rather than just click, tap and scroll; users want to ask, argue and talk. In this scenario Schema markup, and Knowledge Graphs are the tools that allow machines to read, and manipulate enough connected information to extrapolate knowledge that can answer the reader’s questions.
For instance, when I meet new people, they often ask me “what’s your job?” and I say “I’m BizDev at WordLift.” The next thing they ask is “what is WordLift?”
What if your website could interact with a user in the same way? It turns out it can, and our website already does!
Wrapping up and Conclusions
Structured data is what can bring your business to the next level. Through it you can achieve several goals:
Make your content machine-friendly. Therefore, easier to find for your tribe
Let users navigate within the website by finding meaningful content. Thus, allowing them to create a long-term relationship with your website
Have your website ready to talk. In this way, it won’t be a static page but a place your tribe would love to hang out
In conclusion, people in the future won’t ask how many clicks your website is getting but rather if that is a place people would love to hang out. Is your website that place?
If you want to know more about these topics…
Join our webinar on machine-friendly content with Scott Abel
September 14th at 5:00 pm Central European Time (8:00 am PT / 11:00 am ET)
When I started to write on the web, I realized I was supposed to change my writing style if I wanted search engines to understand what I wrote. Yet semantic technologies now allow our stories to be understood by search engines even when their authors write them for humans only. How is that possible? Read on…
Once upon a time a crow was sitting on a branch of a tree with a piece of cheese in her beak when a Fox observed her and set his wits to work to discover some way of getting the cheese. Coming and standing under the tree he looked up and said,
“What a noble bird I see above me! Her beauty is without equal, the hue of her plumage exquisite. If only her voice is as sweet as her looks are fair, she ought without doubt to be Queen of the birds.”
The Crow was hugely flattered by this, and just to show the Fox that she could sing she gave a loud caw. Down came the cheese, of course, and the Fox, snatching it up, said
“You have a voice, madam, I see: what you want is wits.”
The fable was written around the sixth century B.C., presumably by Aesop, even though we don’t know for sure whether Aesop ever existed. Those fables are a powerful narrative device. If you ever heard any of those stories, chances are you never forgot them. Why?
Human Memory in a Nutshell
Human memory is quite different from a computer’s memory. In short, we know there are certain parts of the human brain that play a key role in forming memories. Yet those regions work in unison with other parts of our brains to form those memories.
Like a symphony, our memories get shaped by the interactions between billions of neurons, in a wild dance in which those nerve cells make synaptic connections. Those connections become memories. The more neurons fire together, the stickier those thoughts become. Psychologists like to summarize this process as “neurons that fire together wire together.”
If we looked at your brain with an MRI scanner – think of the MRI as a camera that takes pictures of your brain in action – while listening to a story, you would see several regions of your brain lighting up.
Although ancient people didn’t have MRI scanners, they understood what happened behind the scenes of the human brain. They already grasped the importance of metaphors as devices to create memories. Those memories could last for decades and spread for millennia to come. Jumping from brain to brain, they got tied together by metaphors. Yet it seems we have forgotten this lesson in the last two decades, when search engines took over the web. Ever since, keywords have ruled over content.
Aesop the Blogger
Imagine Aesop was a literary author in today’s world. He has this great book in his hands, yet no big publishing house willing to sponsor it. What to do?
Aesop could self-publish that literary work, yet he wants to test and see if there is an audience for it. Thus, Aesop creates a WordPress blog. That is the most logical choice, as he doesn’t have any coding skills – coming from Ancient Greece. Opening his internet browser for the first time, he finds himself in a weird place. That place’s name is Google. From gurus to marketers, Aesop looks for inspiration. He dives into an article that tells him how to start a blog in ten steps, three somersaults, and a loop-de-loop. He comes out very confused.
Finally, Aesop finds an article from a bold and bald man. That article has a magic formula inside: the formula for becoming the best-known author in the Googlesphere. Taken aback by his own ignorance, Aesop finally learns about keywords!
Thus, after a ten-page how-to article on making his fables more “machine-friendly”, Aesop’s work goes to ashes.
What happened? To get his content ranked, Aesop decided to convert all the metaphors and fables into keywords!
Thank god this is only a thought experiment and Aesop never made it to our times. Otherwise, we would have lost one of the most important literary works ever created. Yet if keywords were the rule of thumb in the last decades, can we expect to see a paradigm shift anytime soon?
When Keywords Become Metaphors
In the last two decades, search engines took over the web and created a net for their own sake. This net revolved around the use of keywords. How could it not be so? To find what we were looking for, we had to adjust our thinking to fit that little search box.
That is how we went from thinking outside the box to thinking within that narrow search box. That is also how we moved away from the why and toward the how of things. The era of tutorials, how-to articles, and “how to become a magician in ten steps” arose. Is that era close to an end? In part, it might be.
Since 2013, when Google launched Hummingbird, search engines have been catching up. In fact, through NLP (natural language processing), machines now understand human language. Thus, we are finally going back to thinking outside that little search box and into our human consciousness.
For instance, through our digital assistants, we ask more and more questions. In short, rather than using keywords, we now ask questions and want answers. The revolution, though, is that we now want answers not only from Google but also from the websites that populate the web. Would you ever talk to a website that speaks in keywords? Of course not! While it made sense a few years ago, it doesn’t anymore.
Search engines are becoming better and better at “interpreting” human language. They can extract the information needed to answer our questions, and thus make blogs and websites speak to us.
Rather than making our text fit machines’ requirements (as has happened so far), authors, bloggers, and content creators can finally focus on writing stories. Metaphors and anecdotes, in this era, should be the rule. Rather than focusing all our efforts on keywords, we can finally go back to writing for people.
Google meets Aesop
While writing an article you can now focus on inspiring others while software like WordLift does the rest for you. For instance, within this article, I created a set of entities within my vocabulary, which explained to search engines what my article is about. By adding schema markup to my article, I enabled search engines to understand the content. Thus, without placing any keyword within the text, I connected the fable to other concepts such as Google and Technology. You can see it from the schema markup I created through WordLift:
Linked metadata describing this article
In the meanwhile, WordLift is also putting things into context by creating a knowledge graph.
The Metadata visualized using LOD View
In other words, in a few seconds and without writing a line of code, I achieved these results:
Passed the definitions of the concepts within the article to search engines through schema markup
Contextualized the content of the article through the knowledge graph
In other words, search engines are enabled to realize that The Fox and The Crow is a fable, but that it was told in the context of SEO, thus making this article unique.
In the last two decades, a new style of writing was born: a style based on 200-page-long how-to articles. Most of us got caught up in this game. We also thought we determined it. The truth is we never chose it; it was imposed on us. That style was (in part) born from the necessity of making search engines understand our content. Either we did that or we were out of the web. The price to pay was too high. Therefore, we started to write dry content, mainly based on keywords, to make sure it ranked. In the meanwhile, we lost track of ancient wisdom.
Over two thousand years ago, people already knew the importance of fictional stories. In the last decade, neuroscience confirmed that our brains like those stories. In short, the human mind does not want the truth handed over on a silver platter. Our mind wants to dig, interpret, wonder, and visualize before getting to the truth.
Now semantic technologies allow us to tell stories, fables, and anecdotes again. The web doesn’t have to be a dry land; it can finally become a place of wonder.
Content Marketing is an ever-evolving arena, and it depends upon findability. Think of the great classical works, such as those of Plato and Aristotle. If no one had found them, no one would have known of their existence. In other words, when it comes to content, findability is a primary issue. In the past, it was up to humans to make content findable. In the last two decades, things swiftly changed: findability is no longer in human hands. Machines took over. Where are we heading? Read on…
Back in the 90s, the web was still a shapeless creature. Almost like a Hydra with its many heads, it seemed untameable: made of millions of pages, all disjointed and disconnected. Either you knew the exact name of the website you were looking for, or the game was over.
Things didn’t seem to improve a lot. From 1990 to 1994, a wave of search engines tried to tame the multi-headed monster. From Archie to AltaVista, the future didn’t look bright.
Then two young fellows from Stanford conceived an algorithm, named after its creator: PageRank. It was the birth of another mythological creature called Google. Finally, the web seemed tamed. All it took was a simple yet powerful principle: classify content based on link popularity.
In short, web pages got ranked and classified by a militia of little crawlers. Wriggling through those pages, they accounted for about two hundred factors to determine what was relevant and what to throw away. The consequence was that those crawlers became the sentinels of the web. The net they created is the web as we know it.
Thus, Google became the web. If Google did not know you existed, de facto you didn’t. That is how the SEO (Search Engine Optimization) industry was born.
Content Marketing in the PageRank Era
Once Google managed to index the web, it effectively became the web. What was the web according to Google, though? The answer lay in the ability of crawlers to capture the signals contained in each living page. Thus, you had to adjust and optimize your pages so crawlers could “perceive” those signals. In this way, you got better chances of having those pages ranked. In short, it didn’t matter if you wrote Dante’s Divine Comedy: your content would not rank if you didn’t follow a simple and linear formula.
What was that formula? It was a mix of keywords, tags, meta descriptions, internal links, and backlinks. Although this formula still works today, it has lost efficacy over time, and chances are it’ll become less and less effective until it stops working. Why? A new algorithm came out that changed it all.
RankBrain Changed It All
When PageRank came out, it all made sense. It was a time in which Artificial Intelligence (AI) was not powerful enough. Yet things changed swiftly when, in 2013, Google launched a new algorithm: Hummingbird. That algorithm used AI to analyze and understand human language.
In this scenario, back in 2015, RankBrain became one of the most powerful ranking factors. De facto, RankBrain is shaping the web as PageRank did back in the late 90s.
How’s the web going to look in the RankBrain era?
No one knows for sure. Yet one thing we do know: keywords, links, and backlinks are (slowly) getting replaced. Now semantics, context, and user experience are shaping the net.
Thus, SEO changed its meaning. It went from Search Engine Optimization to Superb Experience Of User.
When SEO Becomes People’s Engagement
In this scenario, using a set of disjointed tactics like meta tags and keywords may not be enough. Why? RankBrain looks at how users experience each piece of content. Thus, it isn’t anymore about how many keywords or links a web page contains, but rather about how easy to navigate, discoverable, and engaging a piece of content is.
In other words, we shifted from link building to information architecture.
How to Become an Architect and Win in the RankBrain Era
With this new paradigm shift, there are a few questions to answer before deeming any piece of content relevant to users. Make sure you answer “Yes” to these five questions:
Are users dwelling on the page?
Are they navigating the page?
Am I giving answers to their questions?
Am I giving them a clear context on which my website sits?
Is my content unambiguous?
If the answer is Yes, then you’re good to go. Otherwise, I’ll tell you how to get there…
Five main suggestions to thrive in this era:
First, add a schema markup to the content on your website
Why? Imagine you are talking about Steve McQueen. Of course, that may be the actor as well as the Oscar-winning director. Although the reader may understand based on the context of the article, search engines won’t, unless you make that content unambiguous. How? By adding the schema.org definition of Steve McQueen. By doing so you will allow search engines to understand which Steve McQueen you are referring to.
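A hedged sketch of what that disambiguation looks like in practice: two schema.org `Person` descriptions, built here as Python dicts, sharing the same name but pointed at different entities via the `sameAs` property. The job titles are illustrative, and in a real page this would be serialized as JSON-LD:

```python
# Two people named "Steve McQueen"; `sameAs` links each mention to a
# distinct reference URL, which is what resolves the ambiguity for
# search engines.
actor = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Steve McQueen",
    "jobTitle": "Actor",
    "sameAs": "https://en.wikipedia.org/wiki/Steve_McQueen",
}
director = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Steve McQueen",
    "jobTitle": "Film director",
    "sameAs": "https://en.wikipedia.org/wiki/Steve_McQueen_(director)",
}
```

The names collide, but the `sameAs` URLs do not, so a crawler reading the markup knows exactly which person each page is about.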
Second, answer questions as much as you can
Why? We often tend to forget why the web exists after all: to answer our questions. Thus, it may be a good idea to set up an internal vocabulary on your website that gives the definitions of the main terms setting the context of your blog. That will make the content on your website ready to be indexed by Google’s crawlers and thus shown to users.
Third, create a powerful context by building an internal knowledge graph
Why? That will improve the user experience.
Fourth, write captivating stories
Why? To make your readers stick to them. Keep in mind that each word is a powerful sword you can use to keep the reader engaged.
Fifth, set an editorial strategy around a few strategic concepts
Why? Content marketing is about conversion rates rather than vanity metrics. Stop looking at likes and shares and start looking at how your content affects the bottom line. Thus, if you want to succeed, create focused, niched-down content.
In conclusion, ever since search engines took over the web, they created a net that put machines in charge of it. Either you spoke their language or your content got deemed unworthy. Yet a revolution happened when Google launched RankBrain within Hummingbird. Finally, machines were able to read human language. Thus, you should focus on becoming a better writer and inspiring people through great stories.
“What we should do is insist on optimizing our content using semantic SEO practices (emphasis mine), in order to help Google understand the context of our content and the meaning behind the concepts and entities we are writing about.”
To make your content relevant, a paradigm shift is crucial.
Keywords are not enough. What can you do? Let’s find out!
How to make machines understand the classics
Athens, 399 B.C.: a chubby man with a long white beard stood in front of a jury. We are in Athens, the most developed city of the time. Yet that man was about to be sentenced to death.
He was not afraid, and although he was in a risky situation he spoke his mind until the last instant. That man was Socrates, and that trial is portrayed in Plato’s “Apology of Socrates”.
Even though this is the most moving trial ever told, Google wasn’t able to understand it before Hummingbird unleashed RankBrain!
Indeed, if you were to use the classic approach to SEO, you would stuff your article with keywords (like you would stuff a Thanksgiving turkey), hoping that one day Google would understand it!
As crazy as that sounds, it is what many experts would do! But isn’t there a better way? Yes, there is!
For instance, this is what Wikipedia says about the “Apology of Socrates”:
“The Apology of Socrates, by Plato, is the Socratic dialogue that presents the speech of legal self-defence, which Socrates presented at his trial for impiety and corruption, in 399 BC.
Specifically the Apology of Socrates is a defense against the charges of “corrupting the young” and “not believing in the gods in whom the city believes, but in other daimonia that are novel” to Athens.”
To a human, this text is pretty straightforward. Yet to make it comprehensible to machines, we have to take an additional step.
Google finally meets Socrates
I took Wikipedia’s text and edited it with WordLift, and this is what I got:
First, as soon as I placed the text in my WordPress editor and saved it as a draft, WordLift started to analyze it semantically. In short, WordLift understood what I wrote thanks to NLP.
Second, on the right side, WordLift classified the content of my post under “What, where, when and who” and extracted the relevant entities. What is an entity? An entity is a page that is structured semantically and thus understood not only by humans but also by search engines.
Third, WordLift suggested a set of entities (such as Apology, Classical Athens, Daemon, Plato, Socrates and Socratic Dialogue) that would help me to tell the story both to humans and machines.
With a click, I selected the entities suggested by WordLift and saved the article. WordLift marked up my content through schema.org and made it readable to machines!
How do I know?
Within my editor, WordLift makes available a box that says “View Linked Data”.
Once I click on it and take an additional step, I can see how the information I placed in my editor is reshaped until it becomes organized knowledge.
In short, the information I wrote in the article was reshaped and organized into a set of nodes and edges, where the nodes are the articles and entities, and the edges are the relationships between those articles and entities.
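A toy Python sketch can make the nodes-and-edges idea concrete. The entity names below come from this article, while the edge labels ("mentions", "subjectOf") are illustrative assumptions, not WordLift's actual vocabulary:

```python
# A tiny knowledge graph as a list of (subject, relation, object)
# triples: nodes are articles and entities, edges are typed
# relationships between them.
edges = [
    ("Apology of Socrates", "mentions", "Socrates"),
    ("Apology of Socrates", "mentions", "Plato"),
    ("Socrates", "subjectOf", "Socratic dialogue"),
]

def neighbors(graph, node):
    """Return every node directly connected from `node`, sorted."""
    return sorted({target for source, _, target in graph if source == node})

linked = neighbors(edges, "Apology of Socrates")
```

The triple shape mirrors how linked data (such as the schema.org markup WordLift produces) actually stores knowledge: each fact is one subject-predicate-object statement, and following the edges is what lets a machine navigate from an article to the concepts around it.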
Why is that relevant?
That knowledge is now accessible to both humans (in the form of text) and machines (in the form of schema.org markup).
In other words, without placing a single keyword in my post I managed to explain the “Apology of Socrates” to my new friend, Google!
The only caveat is to structure your content by creating Entities rather than keywords!
The Evolution of SEO: from keyword to Entity
Throughout this article, we saw a few very interesting points.
First, humans use questions to communicate, and we expect answers that are clear and straightforward. Paradoxically, though, that is not the way the web worked until recently.
Second, machines didn’t understand human language. Yet a revolution happened in 2013, when Hummingbird unleashed RankBrain.
Third, now, thanks to the semantic web, humans and machines are on the same page. Yet to take advantage of this revolution, you have to stop thinking about keywords and start creating Entities!
Do you want to create your first entity? Get in touch with me!