WordLiftNG – Consortium workshop in Salzburg

On July 5th and 6th, the WordLift team traveled to Salzburg to meet the Consortium partners of the WordLift Next Generation project. As you may know, WordLift received a grant from the EU to bring its tools to any website and to improve anyone’s semantic SEO strategy.

The WordLift Next Generation consortium is made up of WordLift, Redlink GmbH, SalzburgerLand Tourismus GmbH and the Department of Computer Science of the University of Innsbruck. The aim is to develop a new platform that delivers Agentive SEO technology to any website, regardless of its CMS, and to concretely extend the possibilities of semantic SEO in a digital strategy. The work started in January 2019 and is planned over a 3-year timeframe.

The project started with a clear definition of the roles and responsibilities of each partner, with the aim of improving WordLift’s backend so that it can enrich RDF/XML graphs with semantic similarity indices, full-text search and conversational UIs. The main goals of the project are to add semantic search capabilities to WordLift, to improve content recommendations on users’ websites and to integrate with personal digital assistants.


Here is an example of the initial testing done with Google’s team on a mini-app inside Google Search and the Google Assistant that we developed to experiment with conversational user interfaces.

Artificial Intelligence is driving big companies’ investments, and WordLift NG aims to bring these technologies to small and mid-sized business owners worldwide.

In order to accomplish these ambitious goals, the consortium is working on various fronts:

  • Development of new functionalities and prototypes
  • Research activities 
  • Community engagement

What is WordLift NG?

The study for the new platform for WordLift’s clients started with a whole new user journey, including an improved UX. New features, developed with content analysis, include automatic content detection, specific web page markup, and knowledge graph curation.

The main benefit of this development is the ability to target long-tail queries and to mark content as speakable (voice) for search assistants. The new backend lets users create dedicated landing pages and applications with direct queries in GraphQL.

The new platform also provides users with useful information about the data shared in the Knowledge Graph, including traffic stats and Structured Data reports.

The development of NG includes features that bring WordLift’s technology to any CMS; the new on-boarding web app is part of that.

Envisioning a brand new customer journey, built to help publishers get their SEO done with a few easy steps, the new on-boarding web app guides the setup for users not running their websites on WordPress.

The workshop held various demos for applications developed by partners.

WordLift showcased a demo for semantic search, a custom search engine that uses deep learning models to provide relevant results when queried.
With semantic search, one can immediately understand the intent behind customers’ queries and provide significantly improved search results that can drive deeper customer engagement.

This feature would have a major impact on e-commerce websites.

WordLift applied semantic search to the tourism sector and to e-commerce.

Find out more about Semantic Search here


Application of Semantic Search to e-commerce

Austria In Pictures

The backend of the new platform

The backend of the platform is being developed by Redlink on state-of-the-art technologies.

The services are delivered within a cloud microservice architecture, which makes the platform scalable, and the data published with WordLift today are hosted on Microsoft Azure.

It will provide its users with specific services and endpoints for text analysis (Named Entity Recognition and Linking), data management (public and private knowledge graphs and schema, data publication), search (full-text indexing) and conversation (natural language understanding, question answering, voice conversation).

GraphQL was selected to facilitate access to data in the Knowledge Graph by developers. Tools for interoperability between GraphQL and SPARQL were also tested and developed.
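To give an idea of what a direct query against the Knowledge Graph could look like for a developer, here is a minimal sketch in Python. The type and field names (`entities`, `name`, `description`, `url`) are assumptions made up for illustration and are not taken from the platform's actual schema; the only convention relied upon is the usual GraphQL request body with a `query` string and a `variables` object.

```python
import json

# Hypothetical query: the field and argument names are illustrative only
# and do not come from the platform's actual GraphQL schema.
QUERY = """
query EntitiesByType($type: String!, $limit: Int!) {
  entities(type: $type, limit: $limit) {
    name
    description
    url
  }
}
"""


def build_graphql_request(query: str, variables: dict) -> str:
    """Serialize a GraphQL request body, ready to be sent with an HTTP POST."""
    return json.dumps({"query": query, "variables": variables})


# Ask for up to five entities of a given (assumed) type.
body = build_graphql_request(QUERY, {"type": "Place", "limit": 5})
```

The resulting `body` string would then be posted to the GraphQL endpoint with a `Content-Type: application/json` header; the endpoint address is left out on purpose because it is not given here.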

Data curation and Validation

An essential phase of the project is the collection of requirements and a state-of-the-art analysis, in order to guarantee a reliable and functional product.

The requirements for WordLift NG were developed by SLT (SalzburgerLand Tourismus), the partner network and STI.

The results of the research activities are aimed at:

  • enhancement of content analysis – read more about this here
  • analysis of the methodologies for the identification of similar entities (in order to allow, on the basis of the analysis of the pages in the SERP, the creation or optimization of relevant content already present in the Knowledge Graph)
  • the algorithms for data enrichment in the Knowledge Graph and reconciliation with data from different sources

On this front, the Consortium is currently working on the definition of structured data (SD) verification and the validation of Schema markup.

This phase will ensure all statements are:

  • Semantically correct
  • Corresponding to real world entities (annotations must comply with content present and validated on the web)

and that annotations are compliant with given Schema definitions (correct usage of schema.org vocabulary).
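The compliance part of these checks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `REQUIRED` rule set below is hypothetical and much smaller than any real set of guidelines, and the sketch does not cover the verification against real-world content.

```python
# Hypothetical rule set: properties we assume must be present for each type.
REQUIRED = {
    "Article": {"headline", "author", "datePublished"},
    "Place": {"name", "address"},
}


def validate_markup(markup: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # The annotation must declare the schema.org vocabulary.
    if "schema.org" not in str(markup.get("@context", "")):
        problems.append("@context does not reference schema.org")

    # The type must be one that the rule set knows about.
    schema_type = markup.get("@type")
    if schema_type not in REQUIRED:
        problems.append(f"unsupported or missing @type: {schema_type!r}")
        return problems

    # Every required property for that type must be present.
    for prop in sorted(REQUIRED[schema_type] - set(markup)):
        problems.append(f"missing required property: {prop}")
    return problems


example = {"@context": "https://schema.org", "@type": "Place", "name": "Salzburg"}
print(validate_markup(example))
```

Returning a list of problems instead of a single true/false value makes it easier to report every issue of an annotation in one pass.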

Community Engagement

Part of the project is the beta testing and the presentation of the platform to WordLift’s existing customers.

We’d be thrilled to share our findings on structured data with the SEO community, and we are very happy to support the open sharing of best practices, including those from Google and Bing (both seem willing to share their guidelines for structured data markup in a machine-readable format).

WordLift is an official member of the DBpedia community and will actively contribute to DBpedia Global. Find out more.

If you’d like to know more about it and to be included in the testing of the new features, contact us or subscribe to our newsletter!

What is Semantic Search, and how does it work?

If you work in SEO, you have been reading about Google and Bing becoming semantic search engines. But what does Semantic Search really mean for users, and how do things work under the hood?

Semantic Search helps you surface the most relevant results for your users based on search intent and not just keywords.

Semantic (or Neural) Search uses state-of-the-art deep learning models to provide contextual and relevant results to user queries. When we use semantic search, we can immediately understand the intent behind our customers’ queries and provide significantly improved search results that can drive deeper customer engagement. This can be essential in many different sectors but, here at WordLift, we are particularly interested in applying these technologies to travel brands, e-commerce and online publishers.

Information is often unstructured and available in different silos. With semantic search, our goal is to use machine learning techniques to make sense of content and to create a context. When moving from syntax (for example, how often a term appears on a webpage) to semantics, we have to create a layer of metadata that can help machines grasp the concepts behind each word. Google defines this ability to connect words to concepts as “Neural Matching” or *super synonyms* that help better match user queries with web content. Technically speaking, this is achieved by using neural embeddings that transform words (or other types of content like images, video or audio clips) into fuzzier representations of the underlying concepts.
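To make the idea of embeddings more concrete, here is a minimal sketch in Python. The three-dimensional vectors are invented for illustration (real models produce hundreds of dimensions), and cosine similarity is used as one common way to compare embeddings, not necessarily the measure used by any specific search engine.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, invented for this example.
EMBEDDINGS = {
    "hotel": [0.9, 0.1, 0.0],
    "accommodation": [0.8, 0.2, 0.1],
    "glacier": [0.0, 0.2, 0.9],
}


def rank(query_term):
    """Rank all other terms by their similarity to the query term."""
    query = EMBEDDINGS[query_term]
    scored = [
        (term, cosine_similarity(query, vector))
        for term, vector in EMBEDDINGS.items()
        if term != query_term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank("hotel"))
```

Although “hotel” and “accommodation” share no characters that a keyword match could exploit, their vectors point in nearly the same direction, so “accommodation” ranks well above “glacier”.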

As part of the R&D work that we’re doing in the context of the EU co-funded project called WordLift Next Generation, I have built a prototype using an emerging open-source framework called Jina AI and the beautiful photographic material published by SalzburgerLand Tourismus (also a partner in the Eurostars research project) and Österreich Werbung 🇦🇹 (Austrian National Tourist Office).

I have created this first prototype:

  • ☝️ to understand how modern search engines work;
  • ✌️ to re-use the same #SEO data that @wordliftit publishes as structured *linked* data for internal search.

How does Semantic Search work?

Bringing structure to information is what WordLift does, by analyzing text with NLP and named entity recognition and, now, images with deep learning models.

With semantic search, these capabilities are combined to let users find exactly what they need naturally.

In Jina, Flows are high-level concepts that define a sequence of steps to accomplish a task. Indexing and querying are two separate Flows; inside each flow, we run parallel Pods to analyze the content. A Pod is a basic processing unit in a Flow that can run as a dockerized application.

This is strategic as it allows us to distribute the load efficiently. In this demo, Pods are programmed to create neural embeddings: one Pod processes text and one processes images. Pods can also run in parallel, and the results (embeddings from the caption and embeddings from the image) are combined into one single document.
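The pattern described above (two encoders running side by side, with their outputs merged into one document) can be sketched in plain Python. This sketch deliberately does not use Jina's API: the two encoder functions are stand-ins that return made-up numbers, and a thread pool plays the role of the parallel Pods.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_text(caption: str) -> list:
    """Stand-in text encoder; a real Pod would call a language model."""
    return [len(caption) / 100.0, caption.count(" ") / 10.0]


def encode_image(pixels: list) -> list:
    """Stand-in image encoder; a real Pod would call a vision model."""
    return [sum(pixels) / (255.0 * len(pixels)), max(pixels) / 255.0]


def index_document(caption: str, pixels: list) -> dict:
    """Run both encoders in parallel and merge the results into one document."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        text_future = pool.submit(encode_text, caption)
        image_future = pool.submit(encode_image, pixels)
        return {
            "caption": caption,
            "text_embedding": text_future.result(),
            "image_embedding": image_future.result(),
        }


doc = index_document("Lake view in Salzburg", [0, 128, 255])
```

Keeping the two embeddings as separate fields of the same document is what later allows a query to be matched against either the caption or the image.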

This ability to work with different content types is called multi-modality.

The user can use text in the query to retrieve an image or, vice versa, use an image in the query to retrieve its description.

See the example below: I first search using natural language and, right after, I send an image (from the results of the first search) as a query to find its description 👇

Are you ready to innovate your content marketing strategy with AI? Let’s talk!

What is Jina AI?

Han Xiao, Jina AI’s CEO, calls Jina the “TensorFlow” for search 🤩. Besides the fact that I love this definition, Jina is completely open source and designed to help you build neural (or semantic) search on the cloud. Believe me, it is truly impressive. To learn more about Jina, watch Han’s latest video on YouTube, “Jina 101: Basic concepts in Jina”.

How can we optimize content for Semantic Search?

Here is what I learned from this experiment:

  1. When creating content, we should focus on concepts (also referred to as entities) and search intents rather than keywords. An entity is a broader concept that groups different queries. The search intent (or user intent) is the user’s goal when making the query to the search engine. This intent can be expressed using different queries. The search engines interpret and disambiguate the meaning behind these queries by using the metadata that we provide.
  2. Information Architecture should be designed once we understand the search intent. We are used to thinking in terms of 1 page = 1 keyword, but in reality, as we transition from keywords to entities (or concepts), we can cover the same topic across multiple documents. After crawling the pages, the search engine will work with a holistic representation of our content even when it has been written across various pages (or even different media types).
  3. Adding structured data for text, images, and videos adds precious data points that will be taken into account by the search engine. The more we provide high-quality metadata, the more we help the semantic search engine improve the matching between content and user intent.
  4. Becoming an entity in Google’s Knowledge Graph also greatly helps Google understand who we are and what we write about. It can have an immediate impact across multiple queries that refer to the entity. Read this post to learn more about how to create an entity in Google’s graph.
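As an illustration of the third point, here is a minimal sketch that builds schema.org markup for an image in Python and prints it as JSON-LD. `ImageObject`, `contentUrl`, `name` and `caption` are part of the schema.org vocabulary; the URL and the texts are placeholders invented for this example.

```python
import json


def image_markup(content_url: str, name: str, caption: str) -> dict:
    """Describe an image with a few schema.org ImageObject properties."""
    return {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "contentUrl": content_url,
        "name": name,
        "caption": caption,
    }


# Placeholder values, not real content.
markup = image_markup(
    "https://example.org/images/lake.jpg",
    "Lake at sunrise",
    "A mountain lake in SalzburgerLand at sunrise",
)

# The output is what would go inside a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2, ensure_ascii=False))
```

The caption is the kind of data point that lets a search engine connect the image to a textual query, which is why a descriptive sentence tends to be more useful there than a bare file name.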

Working with Semantic Search Engines like Google and Bing requires an update of your content strategy and a deep understanding of the principles of Semantic SEO and machine learning.