Linked Data is a simple way to link datasets online and is a key element in the Web of Data.
The Linked Open Data Cloud is a diagram that depicts publicly available linked datasets. The diagram is updated regularly and maintained by the Insight Centre for Data Analytics (a joint initiative between researchers at Dublin City University, NUI Galway, University College Cork, University College Dublin and other partner institutions). In June 2018 the first datasets created with WordLift entered this famous diagram.
The diagram of the LOD Cloud is published under the Creative Commons Attribution License and it is, therefore, free to use. Since the first edition back in 2007, the diagram has been widely used by researchers and academics around the world to talk about linked data in research papers, posters, and presentations.
Almost every day we hear of new knowledge graphs being implemented, from Bloomberg to Thomson Reuters, from Amazon to Airbnb. And it is not just multinational organizations and top-notch publishers: research institutions, governments, libraries and universities from all over the world are constantly contributing, in one way or another, to the expansion of linked data.
The exponential growth of interlinked datasets that conform to the linked data principles introduced by the inventor of the Web, Sir Tim Berners-Lee, is immediately visible in the evolution of the Linked Open Data Cloud diagram.
The diagram started in May 2007 with only 12 interlinked datasets. Year after year, this bubble chart, where each bubble is essentially a new knowledge graph, has witnessed the expansion of connected knowledge. As of July 2018, the LOD Cloud diagram contains 1,224 datasets, roughly a 100-fold increase since the first edition.
Since January 2017, after a few years during which the chart was not updated, the LOD Cloud has been divided into nine subclouds, each representing a separate knowledge domain: geography, government, linguistics, life science, media, publications, social networking, user-generated and cross-domain (anything else, like DBpedia, Wikidata or datasets created with WordLift that span multiple topics).
How can I contribute to the LOD Cloud?
First, you have to publish data that follows the Linked Data Principles and this means the following:
- Use HTTP URIs to name *things* in your dataset so that others can look them up using the HTTP protocol.
- Make sure that, when someone looks up these URIs, with or without content negotiation, they resolve to RDF data (a standard format for representing interconnected data) in any of the supported formats (RDFa, RDF/XML, Turtle, N-Triples).
- The dataset must contain at least 1,000 facts (also called triples or subject-predicate-object statements).
- Much like in the hypertextual web, links in the web of data are essential to help us discover new things. Make sure that your dataset links to URIs of existing datasets that are already in the LOD Cloud. A minimum of 50 links to already existing linked datasets is required.
- The entire dataset must be accessible via RDF crawling, an RDF dump or, better still, a SPARQL endpoint.
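As a rough illustration of the two numeric requirements above, here is a minimal Python sketch that counts the facts in an N-Triples dump and estimates how many of them point outside your own domain. The function name `check_dataset`, the `own_host` parameter and the simplified line pattern are assumptions made for this example; they are not part of the LOD Cloud submission process, and a real check would use a proper RDF parser.

```python
import re
from urllib.parse import urlparse

# Thresholds quoted in the requirements above.
MIN_TRIPLES = 1000
MIN_EXTERNAL_LINKS = 50

# Simplified pattern (an assumption of this sketch): a line whose subject,
# predicate and object are all IRIs. Lines with literal objects do not match.
IRI_TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$')


def check_dataset(ntriples_text, own_host):
    """Count triples and IRI objects hosted outside `own_host`.

    The external link count is only an upper bound: vocabulary terms
    (for example class IRIs) hosted elsewhere are counted as well.
    """
    triples = 0
    external_links = 0
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        triples += 1
        match = IRI_TRIPLE.match(line)
        if match and urlparse(match.group(3)).netloc != own_host:
            external_links += 1
    return {
        "triples": triples,
        "external_links": external_links,
        "enough_triples": triples >= MIN_TRIPLES,
        "enough_links": external_links >= MIN_EXTERNAL_LINKS,
    }
```

Calling `check_dataset(dump_text, "data.example.org")` (with a hypothetical host) returns a small dictionary you can inspect before filling in the submission form.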
The team behind the LOD Cloud also asks you to fill in an online form and is extremely kind in letting you submit your dataset. All the diagrams, from the first edition to the latest, are available on the LOD Cloud website, along with the code used to generate them.
WordLift Knowledge Graphs in the LOD Cloud
We are very happy that, starting with the June 2018 edition, datasets created with WordLift are part of the LOD Cloud.
We regularly submit new datasets from our users, and we are proud to share that the first linked datasets to make it into the diagram are from SalzburgerLand Tourismus (the organization behind tourism in the region of Salzburg), this very blog (the WordLift blog in English and in Italian), and Rainer Edlinger's blog “Whiskey circle”: most probably the very first whisky-centric dataset in the LOD Cloud 🥃.
For each dataset, the number of links to other datasets is also used to draw the LOD chart. Like most of our datasets, the graph created from this blog links to DBpedia, Freebase, Geonames, Yago and Wikidata.
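To get a feel for how such link counts can be derived, the following Python sketch tallies, per target dataset, the triples in an N-Triples dump whose object is an IRI in a known namespace. The `KNOWN_DATASETS` prefixes and the helper `links_per_dataset` are assumptions chosen for illustration; this is not the code used to generate the LOD Cloud diagram.

```python
import re
from collections import Counter

# Namespace prefixes assumed for this example; extend as needed.
KNOWN_DATASETS = {
    "http://dbpedia.org/resource/": "DBpedia",
    "http://www.wikidata.org/entity/": "Wikidata",
    "http://sws.geonames.org/": "Geonames",
    "http://rdf.freebase.com/ns/": "Freebase",
    "http://yago-knowledge.org/resource/": "Yago",
}

# Captures the last IRI on the line, just before the closing dot.
TRAILING_IRI = re.compile(r'<([^>]+)>\s*\.\s*$')


def links_per_dataset(ntriples_text):
    """Return a Counter mapping dataset name to number of outgoing links."""
    counts = Counter()
    for line in ntriples_text.splitlines():
        match = TRAILING_IRI.search(line.strip())
        if not match:
            continue  # object is a plain literal, or the line is blank
        target = match.group(1)
        for prefix, name in KNOWN_DATASETS.items():
            if target.startswith(prefix):
                counts[name] += 1
                break
    return counts
```

Running `links_per_dataset(dump_text).most_common()` lists the linked datasets from most to least referenced.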
What are the benefits of using Linked (Open) Data?
The main goal of Semantic Web technologies and Linked Data is to interconnect existing and new data available on the web. This essentially means breaking the information silos in which information is usually locked and providing new ways of accessing, validating and using data that comes from different sources. Connected knowledge is also key to inferring new knowledge. This is why, in 2010, it was once again Sir Tim Berners-Lee who introduced the 5-star rating scheme for linked open data to help organizations (both private and public) understand the importance of publishing linked open data online. In Tim Berners-Lee’s rating, five stars are assigned to datasets that are published as Linked Open Data in RDF and are interlinked with at least one other dataset in the linked data cloud.
At WordLift we have created a workflow for online publishers, bloggers, businesses and editorial teams to democratise semantic technologies and to build knowledge graphs optimised for content publishing, search engine optimisation and semantic search.
Contact us to learn more about LOD Cloud and to start creating your own knowledge graph ⚡️