2023 is the year of the Generative Web (or Text-to-X if you like). It marks the most significant shift in how we create, find, and consume content online. AI-generated content will be indistinguishable from human-generated content, and in 2022 it demonstrated its potential to revolutionise journalism, SEO, and content marketing.
“I don’t know what’s going on. I don’t understand anything except the rules for symbol manipulation. Now in this case I want to say that the robot has no intentional states at all; it is simply moving about as a result of its electrical wiring and its program. And furthermore, by instantiating the program I have no intentional states of the relevant type. All I do is follow formal instructions about manipulating formal symbols.” John R. Searle – (1980) Minds, brains, and programs.
We founded WordLift with the belief that websites would eventually be transformed into datasets, and that AI would assist us in navigating the vast global information ecosystem. Our goal is to continue making data accessible and easy to use for everyone, while empowering successful AI projects.
SEO has changed forever, as queries that would traditionally go to Google are being intercepted by ChatGPT. This is happening at a rate of 1 million prompts per day.
“LLMO, or large language model optimization, is a term we coined to refer to ensuring your business information is mentioned within a large language model (LLM). One effective technique for this is in-context learning.” Han Xiao and Alex C-G – Jina AI
More than ever, optimization techniques to increase demand for products and services are critical for businesses. SEO continues to evolve to accommodate the changing nature of search and the increasing use of Conversational AI. In our industry, we have long optimized for AI-specific ranking factors, such as a website’s ability to provide a clear and coherent response to a given search query, or the use of structured data to make it easier for Google to understand and interpret a website’s content, or even how to get into Google’s Knowledge Graph. That was just the beginning. In 2023, things are changing at an unprecedented pace as generative technology delivers a new result for every new user, a new result for every problem, and at web scale. The game has begun and here are my predictions.
In a nutshell:
Automation and augmentation remain closely intertwined, and the ability to experiment with new optimization techniques such as in-context learning, instructions-based prompts, and other emerging behaviours of foundational models remains anchored to an extensible data fabric and brand authority.
We build Knowledge Graphs to structure data, making it more flexible and easier for brands to interact with Generative Tech.
WORKING RESPONSIBLY WITH AI SYSTEMS IS NOT AN OPTION: IT IS AN IMPERATIVE.
As in previous years, I’d like to thank the amazing colleagues I work with, as well as our great clients and investors. These annual SEO trends are not only my predictions, but also the trends WordLift will focus on to ensure the success of our clients and the continued development of our platform.
What Are The Top Trends For SEO In 2023?
Here are the top 5 trends you need to watch in 2023:
- Generative Content
- E-E-A-T and Structured Data
- Multimodal eCommerce
- Intent is king
- Recession mode: game on
1. Generative Content
Generative tech has made significant strides in 2022, with advances in deep learning and natural language processing leading to the development of sophisticated AI systems that we have been able to use on a wide range of SEO tasks.
The current state-of-the-art in many deep learning domains typically relies on three key components:
- Large, scalable architectures (such as those based on the Transformer)
- Head-based transfer learning where a generic head layer takes pre-trained representations to predict an output class
- Prompting where a task-specific pattern string is designed to coax the model into producing a textual output corresponding to a given class.
These models can be applied to a variety of data types, including images, videos, and audio. In the area of AI-powered SEO, some of the most successful models we have used include BERT, RoBERTa, BART, DistilBERT, T5, and GPT-3, all pre-trained on billions of tokens of text with self-supervised objectives such as masked or autoregressive language modeling.
I expect a strong acceleration in how external knowledge is incorporated into learning by enhancing prompting. We have tested multiple ways to expand words in prompts from a knowledge base (i.e., a taxonomy). Significant progress has been made on the research front by Liu et al., who use an LM to generate relevant knowledge statements in a few-shot setting. Also notable is the work on incorporating additional knowledge by enhancing the training data with entity descriptions and entity types (see Arora et al.), and on improving in-context learning performance by providing better explanations in prompts, as proposed by Xi Ye et al. In layman’s terms, working with curated external data will help LLMs get smarter without sacrificing performance.
We can also foresee that data is becoming the real bottleneck when training LMs. Jordan Hoffmann and the team at DeepMind introduced a 70B LM called “Chinchilla” that outperformed bigger LMs (GPT-3, Gopher) by scaling training data and not exclusively parameters. If adding more data helps improve performance, we can understand why OpenAI is developing speech recognition systems like “Whisper” that can feed the otherwise severely starved and under-trained LMs with trillions of text tokens from YouTube or any available podcast. General-domain data alone is not enough, and the same applies to fine-tuning our custom models: I expect we will source more information from existing multimedia content.
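To make the data-versus-parameters point concrete, here is a back-of-the-envelope sketch. The parameter and token counts are the approximate, publicly reported training budgets for each model (they are not figures from this article), and the target of roughly 20 training tokens per parameter is the rule of thumb commonly derived from the Chinchilla results. Treat both as assumptions.

```python
# Approximate, publicly reported training budgets (assumptions, not from this article).
MODELS = {
    "GPT-3": {"params": 175e9, "tokens": 300e9},
    "Gopher": {"params": 280e9, "tokens": 300e9},
    "Chinchilla": {"params": 70e9, "tokens": 1.4e12},
}

# Rule of thumb often derived from the Chinchilla results (assumption).
TOKENS_PER_PARAM_TARGET = 20.0


def tokens_per_param(params: float, tokens: float) -> float:
    """Number of training tokens seen per model parameter."""
    return tokens / params


def is_undertrained(params: float, tokens: float,
                    target: float = TOKENS_PER_PARAM_TARGET) -> bool:
    """True when the model saw fewer tokens per parameter than the target."""
    return tokens_per_param(params, tokens) < target


if __name__ == "__main__":
    for name, budget in MODELS.items():
        ratio = tokens_per_param(budget["params"], budget["tokens"])
        status = "under-trained" if is_undertrained(**budget) else "at or above target"
        print(f"{name}: {ratio:.1f} tokens per parameter ({status})")
```

Under these figures, the two larger models sit far below the target ratio while Chinchilla meets it, which is the sense in which bigger LMs can be described as starved of data.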
Under pressure from more users moving to ChatGPT for different queries, Google will soon introduce its own conversational AI machinery. I expect Google to try to take advantage of both its web index and E-E-A-T signals.
By experimenting in 2022 with LLMs in SEO, we have learned a few lessons that are shaping our way forward:
- We need to focus on fact-oriented information to bring value to the end user. Only by improving factual accuracy (Truthful AI) can we future-proof AI-generated content as Google improves its spam detection algorithms. The highest cost lies in building validation pipelines that can provide any brand with accurate information and the proper writing style and tone of voice.
- Fine-tuning and in-context learning are powerful approaches that can be made available to content creators and SEOs with effective feedback loops. When we adequately engaged the content team, we saved time and significantly improved quality.
- Prompt improvements can be obtained by using CLIP to compare the visual similarity of synthetic images produced with text-to-image models. This is also an exciting line of research to follow in the coming months (see the visual entailment work by Song et al.).
- We are not Jasper (or any other off-the-shelf content generation tool that supports content writing); we are an SEO automation platform leveraging semantic data: to make an impact, we must wear our SEO hat first.
The Generative Web is changing the way content is delivered to the end user; content exists in its unique form once it reaches the target audience. However, the need to organize content using entities remains. For example, look at the chain presented in the following thread and the limitations of AI systems that do not use knowledge graphs.
We can see the same issue using the newly introduced chatbot (YouChat) on You.com. It is terrific and plausible, but unable to understand what it is talking about.
In-Context Learning Explained
Here is an example of how injecting factual information into the prompt helps improve the result.
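As a minimal sketch of the idea (not the prompt used in our own tests), the snippet below turns a handful of knowledge graph triples into plain-text facts and places them ahead of the question. The product, the triples, and the function names are all hypothetical, and the resulting string would still need to be sent to whichever LLM you use.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triples_to_facts(triples: List[Triple]) -> str:
    """Render each triple as a plain-text fact, one per line."""
    return "\n".join(f"- {s} {p} {o}." for s, p, o in triples)


def build_prompt(question: str, triples: List[Triple]) -> str:
    """Prepend curated facts to the question so the model can ground its answer."""
    return (
        "Answer the question using only the facts below. "
        "If the facts are not sufficient, say so.\n\n"
        f"Facts:\n{triples_to_facts(triples)}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical triples, as they might be exported from a knowledge graph.
EXAMPLE_TRIPLES = [
    ("Acme Trail Shoe", "is manufactured by", "Acme Outdoor"),
    ("Acme Trail Shoe", "has sole material", "recycled rubber"),
    ("Acme Trail Shoe", "weighs", "280 grams"),
]

if __name__ == "__main__":
    print(build_prompt("What is the Acme Trail Shoe sole made of?", EXAMPLE_TRIPLES))
```

Keeping the facts in a separate, clearly labelled block also makes the prompt easier to validate: an editor can check the triples without reading the generated copy.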
- Ask yourself what data can make your prompts unique. In-context learning requires a clear understanding of entities; we need facts (triples) to add more context while prompting the model. The Generative AI starts here.
- Focus on strengthening the content validation pipeline; ensure that the tone of voice is always suitable for the brand.
- Be responsible when using AI and find the best feedback loop by joining forces with domain experts and content creators.
2. E-E-A-T and Structured Data
Google’s content quality evaluation will get stricter as more synthetic content is created. LLMs will also be affected; their quality is at risk as they will be fed their own content from now on. Brand equity will grow in importance in 2023. Performance marketing, social media networks, and online advertising will lose traction as we move forward (TikTok might even get banned soon in the US).
The proliferation of AI content adds noise to the searcher experience, and only strong brands will succeed and take full advantage of the opportunities offered by generative tech. Looking at the structured data types in your industry and building a solid plan will strengthen your brand. We have seen the importance (for eCommerce websites) of sophisticated Product and Review markup. It is best to have an error-free structured data implementation that guides the crawler and establishes the authority for the brand and its ambassadors (editors, content authors, and domain experts).
The power of a brand will also depend on its ability to publish connected data at scale. Having structured and linked data, using identifiers, and having other data sources and websites citing products, places, events, and people is no longer a niche experiment; it is becoming a profitable strategy for many brands.
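To illustrate what Product and Review markup with linked identifiers can look like, here is a minimal sketch that builds a schema.org Product as JSON-LD. The product, URLs, and reviewers are hypothetical, the function name is our own, and the property selection is only an example; check the output against Google’s structured data documentation and a validator before publishing.

```python
import json


def build_product_jsonld(name, url, sku, brand, same_as, reviews):
    """Return a schema.org Product as a JSON-LD-ready dict.

    `reviews` is a non-empty list of (author_name, rating, text) tuples,
    with ratings on a 1-5 scale.
    """
    if not reviews:
        raise ValueError("at least one review is required for this sketch")
    ratings = [rating for _, rating, _ in reviews]
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "@id": f"{url}#product",  # stable identifier other datasets can point to
        "name": name,
        "url": url,
        "sku": sku,
        "brand": {"@type": "Brand", "name": brand},
        "sameAs": same_as,  # links to the same entity in other data sources
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": round(sum(ratings) / len(ratings), 1),
            "reviewCount": len(reviews),
        },
        "review": [
            {
                "@type": "Review",
                "author": {"@type": "Person", "name": author},
                "reviewRating": {"@type": "Rating", "ratingValue": rating, "bestRating": 5},
                "reviewBody": text,
            }
            for author, rating, text in reviews
        ],
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    data = build_product_jsonld(
        name="Acme Trail Shoe",
        url="https://shop.example.com/acme-trail-shoe",
        sku="ATS-001",
        brand="Acme Outdoor",
        same_as=["https://data.example.org/entity/acme-trail-shoe"],
        reviews=[("Jane Doe", 5, "Great grip."), ("John Roe", 4, "Comfortable fit.")],
    )
    print(json.dumps(data, indent=2))
```

The `@id` and `sameAs` values are what make the data linkable: other pages and datasets can refer to the same product entity instead of describing it again.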
- Focus on your authority by connecting your data with other data.
- Design a flawless structured data strategy and follow Google’s quality guidelines for content. They will be a lifesaver as you embrace AI content.
- Set up your off-site Structured Linked Data plan; it’s link-building on steroids.
3. Multimodal eCommerce
SEO for eCommerce has evolved significantly in 2022, and Google has put greater emphasis on user experience and quality content to gain traction over Amazon. Additionally, the use of artificial intelligence in its algorithms (like the product review algorithm) and the introduction of the product knowledge graph panels require merchants to constantly adapt their strategies to evolving trends. Even before the launch of ChatGPT and the “code red” that Google is issuing to respond to OpenAI and Microsoft, eCommerce searches had become more conversational and multimodal.
This has led to concrete opportunities for brands investing in creating a suitable knowledge base of question-and-answer pairs associated with products and product categories. Our success in automating the generation of FAQ content for eCommerce websites has been significant, with thousands of new snippets that brought double-digit sales growth. Our product here will continue to improve and get smarter at sifting through queries, fact-checking answers, and interacting with editors and SEOs.
With Google Lens receiving more than 8 billion queries per month, developing robust product visuals for the different SKUs is also becoming an imperative. MUM is naturally anticipating the next step on the searcher journey and images are, for most products, a key element.
We also must consider the interplay between Google Lens and local search, as users can now search images and text combined to find products from local retailers. It’s an excellent opportunity to show the importance of a solid structured data strategy emphasising data linking.
As we convert products into entities for the product knowledge graph, I can also see the value of improving the semantic annotations for images and bringing full support for 3D models recently introduced in Google search for sneakers and other product types.
The drivers behind Multimodal eCommerce are also represented by the evolution of smart glasses and AR/VR experiences that are slowly entering the market.
Invest in AI workflows to develop comprehensive product images, improve image resolution when needed, experiment with text-to-image and image-to-image foundational models to build the proper context for your products, and work on image annotations. A robust strategy for image annotation will allow your organization to gain additional visibility on search and help you fine-tune contrastive models like CLIP for visual inference.
4. Intent Is King
Foundational model-based recommendation systems accelerate content discovery by providing users with personalized content based on their search history and interests. I expect an increase in queryless traffic, as users will less often need to enter specific keywords or phrases to find relevant content. Instead, the recommendations are generated by pre-emptively predicting the user intent.
As seen in previous years, traffic from Google Discover will represent a valuable stream of clicks, as it originates from users actively interested in the recommended content. Intent detection using LLMs will enable a new wave of recommendation tools besides Google Discover and Microsoft Start. The ability to interact more directly with search engines like You.com will let publishers and brands craft the user experience when their core entities are being searched.
- Develop new traffic streams, optimize for Google Discover, and look for emerging platforms to expose content and products in new and innovative ways.
- Building apps inside Social Networks, Search Engines, or Chatbot platforms will be an important trend to follow.
5. Recession Mode: Game On
After a sustained period of slow economic growth, the end of the pandemic, and the war in Ukraine, we are facing a global recession with high unemployment, low consumer confidence, and declining business investment.
The economic hardship and the decline in investment in risky assets have also led to the cryptocurrency crisis. These are difficult times, but I am confident that the digital economy in general will show some resilience. As they cut advertising spending, more companies will also turn to organic growth. In the SEO space (or the new LLMO arena where we optimize data for GPT-4 and the like), we need to focus on two things: 1) doing more with less through automation and 2) investing in solid reporting.
Digital marketing needs to prove its value to the bottom line, and we can leverage AI for better SEO predictions, causal impact analysis, and A/B testing. If we cannot deliver at least a 3x increase in value (for every dollar spent, our approach or technology brings in at least $3 in revenue), we are not the right partner. To be successful, you must constantly introduce new ideas and strategies. If you do not innovate, you are much more likely to become irrelevant. 2023 is the year of the productivity boost. We can quickly solve problems that slow us down with just a few lines of code, a small prompt, or no code at all. It’s a dream come true, and it’s enabling a new wave of small tools that will change the way we work forever.
- Improve your reporting skills and tooling, and look closely at the business impact that SEO and digital marketing create. As less investment can be made, remain lean and embrace open innovation. Companies will need a stream of innovation.
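The 3x rule mentioned above is simple enough to encode in a reporting script. The sketch below is only an illustration: the function names and the sample figures are hypothetical, and how revenue gets attributed to a given spend is left entirely to your own reporting.

```python
def return_multiple(revenue_attributed: float, spend: float) -> float:
    """Revenue generated per dollar spent."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue_attributed / spend


def meets_threshold(revenue_attributed: float, spend: float,
                    minimum_multiple: float = 3.0) -> bool:
    """True when every dollar spent brings in at least `minimum_multiple` dollars."""
    return return_multiple(revenue_attributed, spend) >= minimum_multiple


if __name__ == "__main__":
    # Hypothetical figures for two initiatives.
    for label, revenue, spend in [("FAQ automation", 30_000, 10_000),
                                  ("Image annotation", 25_000, 10_000)]:
        multiple = return_multiple(revenue, spend)
        verdict = "meets" if meets_threshold(revenue, spend) else "misses"
        print(f"{label}: {multiple:.1f}x ({verdict} the 3x bar)")
```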
- Be ready to incorporate ideas, technologies, and knowledge from outside your organization.
Are you ready to innovate on SEO in 2023? Still have a question? Book a call with us and join our list of happy customers!
- ACL 2022 Highlights by Sebastian Ruder – Jun 2022
- chinchilla’s wild implications by nostalgebraist – LessWrong – Jul 2022
- Google’s New Visual Search Tool Plays to Fashion Crowd by Luke Leitch – Vogue – Aug 2022
- Generative AI Research – Base 10 – Nov 2022
- SEO is Dead, Long Live LLMO by Han Xiao and Alex C-G – Dec 2022
- 2022: The year that changed the way we work by Cassie Kozyrkov – Dec 2022