


What is an entity in the Semantic Web?

In the Semantic Web an entity is the “thing” described in a document. An entity helps computers understand everything you know about a person, an organization or a place mentioned in a document. All these facts are organized in statements known as triples that are expressed in the form of subject, predicate, and object.

Take, for example, an article on the Chinese invasion of Tibet that refers to the prophecies that [Thubten Gyatso] made twenty years before the invasion. As a string, [Thubten Gyatso] is composed of two separate words, but as an entity it has a much richer meaning. [Thubten Gyatso] is a person; more specifically, he is the 13th Dalai Lama, who was born in the Ü-Tsang province of Tibet on the 12th of February 1876. In the context of the Semantic Web, an article annotated with the entity [Thubten Gyatso] conveys all this information about the 13th Dalai Lama in a way that computers can understand. And, as you might suspect, computers are not otherwise too familiar with the history of Tibetan Buddhism.

Every document on the Web is about many different kinds of “things”. Entities describe content using knowledge models known as graphs that help computers “think” the way we do and that help us, in return, find information more efficiently.

Entities are linked to one another. Each entity holds the information required to provide direct answers to questions about itself (e.g. “When was Thubten Gyatso born?”) and to questions that can be answered by looking at its relationships with other entities (e.g. “Was Trinley Gyatso his predecessor?”).
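To make the idea of triples more concrete, here is a minimal sketch in Python of how the two questions above could be answered from a handful of statements. The predicate names (`birthDate`, `predecessor` and so on) and the helper functions are made up for this illustration; a real graph would use a vocabulary such as schema.org and a proper triple store.

```python
# Each fact is a (subject, predicate, object) triple.
# Predicate names are invented for this sketch, not taken from a real vocabulary.
TRIPLES = [
    ("Thubten Gyatso", "type", "Person"),
    ("Thubten Gyatso", "title", "13th Dalai Lama"),
    ("Thubten Gyatso", "birthDate", "1876-02-12"),
    ("Thubten Gyatso", "predecessor", "Trinley Gyatso"),
    ("Trinley Gyatso", "type", "Person"),
]


def objects(subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def holds(subject, predicate, obj):
    """Check whether a specific statement is present in the graph."""
    return (subject, predicate, obj) in TRIPLES


# "When was Thubten Gyatso born?" is answered by the entity itself.
birth_dates = objects("Thubten Gyatso", "birthDate")

# "Was Trinley Gyatso his predecessor?" is answered by a link to another entity.
was_predecessor = holds("Thubten Gyatso", "predecessor", "Trinley Gyatso")

print(birth_dates, was_predecessor)
```

The first question only needs a property of the entity, while the second one follows a relationship that points at another entity in the same graph.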

What is an Entity in WordLift?

Entities in WordLift are web pages that describe the “things” that we talk about the most on our website. All the entities are organised in a vocabulary within WordPress. Each entity is a web page and corresponds to a data point that WordLift creates in the web of data.

WordLift publishes entities and their properties in an intelligent model, technically called a “graph”, designed to help computers understand real-world “things” and their relationships to one another. The graph is published as linked data and WordLift uses it to enrich all the content published on a website.

Structured Data

Let’s take this page as an example. This is the entity page to describe what entities are. You are reading content from this webpage, a hypertext document that is connected to the World Wide Web. At any time a crawler, a smart agent or a chatbot can read this same information by looking at the structured data that WordLift has created for this entity.

While humans can read a web document, for a computer it is far easier to read semantically rich data that is linked to other data published in openly available datasets.
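As a rough illustration of what “reading the structured data” can mean for a crawler, the sketch below pulls a JSON-LD block out of a page using only the Python standard library. The page, its URL and its description are placeholders, and the markup is an assumption about what structured data for an entity page might look like, not a copy of what WordLift actually outputs.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: the URL and the description are placeholders.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org",
 "@type": "Thing",
 "@id": "https://example.org/vocabulary/entity",
 "name": "Entity",
 "description": "The thing described in a document."}
</script>
</head><body><h1>Entity</h1></body></html>
"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every JSON-LD script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


extractor = JsonLdExtractor()
extractor.feed(PAGE)
entity = extractor.items[0]
print(entity["@type"], "-", entity["name"])
```

The agent never has to interpret the visible text of the page: the type, name and description are available as plain key-value pairs.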

How is an entity different from an article or a web page?

WordLift uses Entities in three ways:

  • Entities describe the “things” that you talk about in your articles using 5-star linked data, so that search engines can unambiguously understand what you’re writing about.
  • Entities help organize the content that you’re writing. As you annotate an article with an entity, WordLift creates a relationship between the article and the entity in a form that a computer can understand. These relationships are stored in the graph of the website and are used to provide meaningful recommendations to your readers.
  • Entities provide contextual information to the audience. Take, for example, Linked Data: this is a concept that I used in this article and that you might not be familiar with. In this case, WordLift helps me create a link so that you can find out more about what linked data is without having to jump to another website to get the same information.
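The second point, relationships between articles and entities driving recommendations, can be sketched in a few lines of Python. Everything here is hypothetical: the article titles, the annotations and the scoring rule (count the shared entities) are assumptions chosen for illustration and do not describe how WordLift ranks its recommendations.

```python
# Hypothetical annotations: article title -> set of entities it mentions.
ANNOTATIONS = {
    "What is an entity?": {"Linked Data", "Semantic Web"},
    "Getting started with structured data": {"Linked Data", "Schema.org"},
    "A short history of Tibet": {"Thubten Gyatso"},
}


def recommend(article):
    """Rank other articles by the number of entities they share with `article`."""
    entities = ANNOTATIONS[article]
    scored = [
        (len(entities & other_entities), title)
        for title, other_entities in ANNOTATIONS.items()
        if title != article and entities & other_entities
    ]
    # Most shared entities first; ties are broken alphabetically by title.
    return [title for score, title in sorted(scored, key=lambda x: (-x[0], x[1]))]


related = recommend("What is an entity?")
print(related)
```

Because the link between article and entity is stored as data, finding related content becomes a simple lookup rather than a guess based on keywords.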

Entities have to be relevant to the content that you’re writing and, in a way, they define your content strategy and the knowledge domain you’re addressing with your website.

What are the guidelines for creating new entities to annotate a blog post or a page?

A basic guideline for adding a new entity is:

I should create entities that a librarian would plausibly use to classify the content I am writing as if it were a book.

In some cases, key concepts that are important for our audience are not automatically detected by WordLift. In this case, we can create them and teach them to WordLift, as well as to search engines, so that they will be able to recognise them in the future.

Let me give you an example. When a new concept was introduced to describe PASO, an acronym for Personal Assistant Search Optimisation, I created a new entity on this website and described it using structured data with WordLift.

As you can see in the video below, after a few weeks the entity became a featured snippet. As a result, Google Home was able to provide a simple definition of PASO using the content from this same website.

I already have several articles that could be used to organize the content on my website. Can I turn them into entities?

Yes, you can now convert your existing articles or pages into entities with a simple click. This helps you reuse your cornerstone articles to reorganize the content on your website and improve the search rankings of these pages.

Cornerstone articles are usually meant to describe the “things” you care about the most and are a perfect match for becoming entities.

Search engine optimization

Search engine optimization (SEO) is a branch of web marketing that aims to improve the visibility of a website or a web page in a search engine’s organic (that is, unpaid) search results, which are presented on a search engine results page (SERP).

What kind of tactics does SEO use?

SEO activities are divided into two main categories:

  1. On-page SEO: all the tactics used to gain better SEO results working on a website’s content and on its code.
  2. Off-page SEO: all the link-building tactics used to gain valuable links from other authoritative websites.

What kind of improvements are expected from SEO?

As a result of SEO you can expect three kinds of results:

  1. Organic results: SEO helps your website gain better positions on strategically relevant SERPs.
  2. Quantity of traffic: as a result, organic traffic will grow and you will get many more visits to your website.
  3. Quality of traffic: as long as your SEO strategy is coherent with your content and with your business, the users who visit your website will be truly interested in the services or products you offer.

To learn more, watch our webinar

Here at WordLift, we have studied, written about and experimented with this subject a lot. If you need more tips, don’t miss our webinar on machine-friendly content with Scott Abel. It’s free and you can watch it anytime. Just grab a pen and a scratch pad. 📝

Natural language processing

What is natural language processing?

Natural language processing (or NLP) is a field of computer science, artificial intelligence, and linguistics that has to do with the interactions between computers and humans using natural languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve natural language understanding — that is, enabling computers to derive meaning from human or natural language input.

How does WordLift use NLP?

WordLift analyses the content being written (either pages or posts) and suggests relevant fact-based information, images and links to content editors.

WordLift uses Named Entity Recognition (NER) and Named Entity Disambiguation (NED) to extract named entities from text content.

Editors can reconcile entities extracted from their posts and pages with equivalent entities available in other sources (e.g. DBpedia or Wikidata). By automatically linking entities, WordLift helps machines unambiguously interpret the context of the content being written.

This information, derived from large open graphs such as DBpedia, is also used by WordLift to add semantic markup using the schema.org vocabulary.
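The sketch below gives a feel for what extracting entities and reconciling them with an external source produces. It is deliberately naive: a plain dictionary lookup stands in for real NER and NED, and it is not how WordLift works internally. The gazetteer, the `annotate` function and the URIs are hypothetical placeholders, not real DBpedia or Wikidata identifiers.

```python
import re

# Hypothetical gazetteer: surface form -> (schema.org type, URI in an external dataset).
# The URIs are placeholders, not real DBpedia or Wikidata identifiers.
GAZETTEER = {
    "Thubten Gyatso": ("Person", "https://example.org/resource/Thubten_Gyatso"),
    "Tibet": ("Place", "https://example.org/resource/Tibet"),
}


def annotate(text):
    """Find known entity labels in `text` and link each one to its external URI."""
    annotations = []
    for label, (entity_type, uri) in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(label) + r"\b", text):
            annotations.append(
                {"label": label, "type": entity_type, "sameAs": uri, "start": match.start()}
            )
    # Return the annotations in the order they appear in the text.
    return sorted(annotations, key=lambda a: a["start"])


text = "Thubten Gyatso made his prophecies twenty years before the invasion of Tibet."
found = annotate(text)
for item in found:
    print(item["start"], item["label"], item["type"], item["sameAs"])
```

A dictionary lookup cannot tell two different things with the same name apart; that is exactly the problem disambiguation is meant to solve, which is why a real system needs context-aware models rather than string matching.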



What is Schema.org markup?

Schema.org is an initiative launched in 2011 by the world’s biggest search engines (Bing, Google and Yahoo!) to implement a shared data schema for describing web pages. On 1 November 2011, Russia’s largest search engine, Yandex, also joined the initiative.

Schema.org is the first shared vocabulary that webmasters can use to structure metadata on their websites and to help search engines understand the content being published.

You can think of the schema.org vocabulary as the “lingua franca” for search engines: a universal way to describe web pages with structured data. Search engines use these pieces of information to enrich the user experience of their search results and to generate rich snippets (small data-driven widgets).

Metadata written using schema.org can be extracted and processed from web pages by search engines, web crawlers, and smart agents to provide a richer browsing experience for users. 

In December 2016, Apple also began recommending that web developers mark up web content using schema.org to help the Applebot web crawler index their content and make it available to all iOS users in Spotlight and Safari search results.

How can I add Schema.org markup to my website?

Schema.org markup can be added using various tools available online, including Google’s Structured Data Markup Helper, or by adding the code directly to your web pages. Several formats can be used to add the schema.org vocabulary to your web content, such as Microdata, RDFa, and JSON-LD.
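If you go the manual route with JSON-LD, the markup is a JSON object wrapped in a `<script type="application/ld+json">` element. The sketch below builds one in Python; the headline, author name and URL are placeholders, and `to_script_tag` is a helper written for this example rather than part of any library.

```python
import json

# Hypothetical article data; the headline, author name and URL are placeholders.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "What is an entity in the Semantic Web?",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "about": {"@type": "Thing", "name": "Linked Data"},
    "url": "https://example.org/blog/what-is-an-entity",
}


def to_script_tag(data):
    """Serialize a dictionary as a JSON-LD <script> element."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


snippet = to_script_tag(article)
print(snippet)
```

The printed element can be pasted into the page’s HTML. It is worth checking the result with a structured data testing tool before publishing.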

If you are using WordPress, you can use WordLift: a plugin for WordPress that helps content editors and website owners mark up their content with schema.org, without requiring any technical skills. In fact, thanks to natural language processing, WordLift does it automatically.

Schema.org is a vocabulary of concepts, widely used all over the web and made up of more than 1,200 attributes organized under primary types: Thing, Action, Creative Work, Event, Intangible, Medical Entity, Organization, Person, Place, Product.

WordLift allows you to structure your website around Things, Creative Works, Events, Organizations, Local Businesses, People and Places. The plugin groups these types into the 4Ws (WHAT, WHERE, WHEN and WHO) as follows:

  • Who: Person, Organization, Local Business
  • Where: Place
  • When: Event
  • What: Creative Work, Thing
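The grouping above is essentially a lookup table. As a small sketch, assuming the schema.org spellings of the type names (`LocalBusiness`, `CreativeWork`), it could be represented like this in Python; `FOUR_WS` and `category_of` are names invented for this example.

```python
# The 4Ws grouping described above, using schema.org type names.
FOUR_WS = {
    "Who": ["Person", "Organization", "LocalBusiness"],
    "Where": ["Place"],
    "When": ["Event"],
    "What": ["CreativeWork", "Thing"],
}


def category_of(schema_type):
    """Return the W that a schema.org type is grouped under, or None if it is not listed."""
    for w, types in FOUR_WS.items():
        if schema_type in types:
            return w
    return None


print(category_of("Event"))
print(category_of("LocalBusiness"))
print(category_of("Product"))
```

Types that are not part of the grouping, such as Product in this sketch, simply return `None`.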