SEO Automation in 2024


SEO automation is the process of using software to optimize a website’s performance programmatically. This article focuses on what you can do with the help of artificial intelligence to improve the SEO of your website. 


Let’s first address the elephant in the room: SEO is not a solved problem (yet). As toolmakers, we work to ease the job of web editors on one side while helping search engines on the other, and SEO automation remains a continually evolving field. Yes, a substantial number of tasks can be fully automated; no, the entire SEO workflow is still far too complicated to automate end to end. There is more to it: Google is a giant AI, and adding AI to our own workflow helps us interact with Google’s giant brain at a deeper level. We see this a lot with structured data: the more structured information we publish about our content, the better Google can improve its results and connect with our audience.

In the realm of SEO automation, a pivotal role is played by AI SEO Agents. These intelligent systems are designed to streamline and enhance various SEO tasks through automation. An AI SEO Agent can significantly reduce the manual effort involved in SEO by taking over repetitive and time-consuming tasks. For instance, it can conduct comprehensive keyword research, identifying not just high-volume keywords but also long-tail phrases that might be overlooked. This ensures that your content strategy is both broad and nuanced, capturing a wider audience.

Moreover, an AI SEO Agent excels in content optimization. It can analyze existing content to suggest improvements, such as where to naturally incorporate keywords, how to structure articles for better readability, and even recommend topics that are currently trending or missing from your content strategy. This ensures that your website remains relevant and highly engaging for your target audience.

Another critical area where AI SEO Agents make a significant impact is link building. By analyzing vast amounts of data, these agents can identify potential high-quality backlink opportunities. This not only enhances your site’s authority but also drives targeted traffic, contributing to both your SEO and overall marketing goals.

By integrating an AI SEO Agent into your SEO workflow, you’re not just automating tasks; you’re also aligning your strategies more closely with the sophisticated algorithms of search engines like Google. This synergy between AI-driven SEO efforts and Google’s AI can lead to more effective and efficient optimization, ultimately improving your website’s visibility and engagement.

This blog post is also available as a Web Story: “SEO Automation Web Story”.

An introduction to automatic SEO 

When it comes to search engine optimization, we are typically overwhelmed by the amount of manual work that we need to do to ensure that our website ranks well in search engines. So, let’s have a closer look at the workflow to see where SEO automation can be a good fit.

  1. Technical SEO: Analysis of the website’s technical factors that impact its rankings, focusing on website speed, UX (Web Vitals), mobile response, and structured data.
    • Automation: Here, automation is already well established, with SEO suites like Moz, Semrush, and WooRank; website crawling software like Screaming Frog and Sitebulb; and a growing community of SEO professionals (myself included) using Python and JavaScript and continually sharing their insights and code. If you are on the geeky side and use Python, my favorite library is advertools by @eliasdabbas.
  2. On-Page SEO: Title tags, meta descriptions, and headings.
  3. Off-page SEO: Here, the typical task would be creating and improving backlinks.
    • Automation: Ahrefs’ backlink checker is probably among the top solutions available for this task. Alternatively, you can write your own Python or JavaScript script to reclaim old links using the Wayback Machine (here is the Python package that you want to use).
  4. On-site search SEO: the Knowledge Graph is the key to your on-site search optimization.
    • Automation: Here we can create and train a custom Knowledge Graph that makes your on-site search smarter: when a user enters a query, the results will be more consistent and respond to the user’s search needs. Through the Knowledge Graph, you will also be able to build landing-page-like results pages that include FAQs and related content, so users find relevant content and have a more satisfying search experience. By answering users’ top search queries and including information relevant to your audience, these pages can be indexed on Google, also increasing organic traffic to your website.
  5. SEO strategy: Traffic pattern analysis, A/B testing, and future predictions.
    • Automation: Here, too, we can use machine learning for time-series forecasting; a good starting point is this blog post by @JR Oaks. We can use machine learning models to predict future trends and highlight the topics for which a website is most likely to succeed. Facebook’s Prophet library or Google’s CausalImpact analysis would typically be a good fit here.
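As a toy illustration of the forecasting step, here is a minimal linear-trend projection in plain Python. In practice you would reach for Prophet or CausalImpact as mentioned above; the function name and data are illustrative assumptions.

```python
def forecast_clicks(history, horizon):
    """Fit a least-squares linear trend to weekly click counts and
    project it `horizon` steps ahead. A toy stand-in for the
    Prophet/CausalImpact models mentioned above."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    # ordinary least-squares slope and intercept
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + i) for i in range(horizon)]

# weekly organic clicks trending upward
print(forecast_clicks([100, 110, 120, 130], horizon=2))  # → [140.0, 150.0]
```

A real forecast would also model seasonality and holidays, which is exactly what Prophet adds on top of a trend like this.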

Will Artificial Intelligence Solve SEO?

AI can effectively help us across the entire SEO optimization workflow. Based on my personal experience, though, some areas are more rewarding than others. Still, there is no one-size-fits-all: depending on the characteristics of your website, the recipe for success might be different. Here is what I see as most rewarding across various verticals.

Automating Structured Data Markup

Structured data is one of these areas in SEO where automation realistically delivers a scalable and measurable impact on your website’s traffic. Google is also focusing more and more on structured data to drive new features on its result pages. Thanks to this, it is getting simpler to drive additional organic traffic and calculate the investment return.

ROI of structured data
Here is how we can calculate the ROI of structured data
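To make the calculation concrete, here is one simple way to sketch it; the formula and the figures are illustrative assumptions, not WordLift’s actual model.

```python
def structured_data_roi(extra_clicks, conversion_rate, value_per_conversion, cost):
    """ROI = (incremental revenue - cost) / cost, where incremental
    revenue is attributed to the extra clicks won via rich results."""
    revenue = extra_clicks * conversion_rate * value_per_conversion
    return (revenue - cost) / cost

# e.g. 10,000 extra monthly clicks from rich results, a 2% conversion
# rate, €25 per conversion, and €2,000 of implementation cost
roi = structured_data_roi(10_000, 0.02, 25.0, 2_000.0)
print(f"ROI: {roi:.0%}")  # → ROI: 150%
```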

Here is a concrete example of a website where, by improving the quality of structured data markup at scale (updating thousands of blog posts), we could trigger Google’s Top Stories and create a new flow of traffic for a news publisher.

Finding new untapped content ideas with the help of AI 

There are 3.5 billion searches done every day on Google, and finding the right opportunity is a daunting task that can be alleviated with natural language processing and automation. You can read Hamlet Batista’s blog post on how to classify search intents using Deep Learning or try out Streamsuggest by @DataChaz to get an idea. 

Here at WordLift, we have developed our tool for intent discovery that helps our clients gather ideas using Google’s suggestions. The tool ranks queries by combining search volume, keyword competitiveness, and if you are already using WordLift, your knowledge graph. This comes in handy as it helps you understand if you are already covering that specific topic with your existing content or not. Having existing content on a given topic might help you create a more engaging experience for your readers.  
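A toy sketch of that ranking logic might look like this; the weights and the coverage bonus are illustrative assumptions, not the tool’s actual scoring.

```python
def score_query(volume, competitiveness, covered,
                w_volume=1.0, w_comp=2.0, coverage_bonus=0.2):
    """Rank a query idea: high search volume is good, high keyword
    competitiveness is bad, and existing knowledge-graph coverage
    of the topic gives a small boost."""
    base = w_volume * volume / (1 + w_comp * competitiveness)
    return base * (1 + coverage_bonus) if covered else base

# (query, monthly volume, competitiveness 0-1, covered in the graph?)
queries = [
    ("seo automation", 1900, 0.7, True),
    ("automatic seo tools", 400, 0.3, False),
]
ranked = sorted(queries, key=lambda q: score_query(*q[1:]), reverse=True)
print([q[0] for q in ranked])  # → ['seo automation', 'automatic seo tools']
```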

Here is a preview of our new ideas generator – write me to learn more

We give early access to our upcoming tools and features to a selected number of clients. Do you want to join our VIP Program?

Automating Content Creation 

Here is where I expect to see the broadest adoption of AI by marketers and content writers worldwide. With a rapidly growing community of enthusiasts, it is evident that AI will be a vital part of content generation. New tools are coming up to make life easier for content writers, and here are a few examples to help you understand how AI can improve your publishing workflow. 

Creating SEO-Driven Article Outlines

We can use autoregressive language models such as GPT-3, which rely on deep learning to produce human-like text. Creating a full article is possible, but the results might not be what you would expect. Here is an excellent overview by Ben Dickson that demystifies AI in the context of content writing and helps us understand its limitations.

There is still so much that we can do to help writing be more playful and cost-effective. One of the areas where we’re currently experimenting is content outlining. Writing useful outlines helps us structure our thoughts, dictates our articles’ flow, and is crucial in SEO (a good structure will help readers and search engines understand what you are trying to say). Here is an example of what you can actually do in this area. 

I provide a topic such as “SEO automation” and I get the following outline proposals:

  • What is automation in SEO?
  • How is it used?
  • How is it different from other commonly used SEO techniques?

You still have to write the best content piece on the Internet to rank, but using a similar approach can help you structure ideas faster.  
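To give an idea of how such a generator can be driven, here is a sketch that builds an outline prompt; the prompt wording, model name, and API usage shown in the comments are assumptions, not the actual WordLift implementation.

```python
def build_outline_prompt(topic, n_questions=3):
    """Compose an instruction asking a language model for an
    SEO-friendly article outline on `topic`, phrased as questions."""
    return (
        f"Propose {n_questions} section headings, phrased as questions, "
        f"for an SEO article about \"{topic}\". "
        "Cover definition, usage, and comparison with alternatives."
    )

prompt = build_outline_prompt("SEO automation")
print(prompt)

# With an API key set, the call itself would look roughly like:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# resp = client.chat.completions.create(
#     model="gpt-4o-mini",  # model choice is an assumption
#     messages=[{"role": "user", "content": prompt}],
#     temperature=0.7,
# )
# print(resp.choices[0].message.content)
```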

Crafting good page titles for SEO

Creating a great title for SEO boils down to: 

  1. helping the page rank for a query (or search intent);
  2. enticing the user to click through to your page from the search results.

It’s a magical skill that marketers acquire with time and experience. And yes, this is the right task for SEO automation, as we can feed the machine with learning samples by looking at the best titles on our website. Here is one more example of what you can do with this app. Let’s try it out. Here I am adding two topics: SEO automation and AI (quite obviously).

The result is valuable, and most importantly, the model is stochastic: if we try the same combination of topics multiple times, the model generates a new title each time.

Improving an existing title by providing a target keyword

You can optimize the titles of your existing content by providing a target keyword so that they rank on Google and other search engines for that keyword. For example, suppose I take “SEO automation” as my target keyword for this article and want to optimize my current title. Here is the result.

Generating meta descriptions that work

We can also unleash deep learning to craft the right snippet for our pages, or at least provide the editor with a first draft of the meta description. Here is an example of an abstractive summary of this blog post.

Creating FAQ content at scale 

The creation of FAQ content can be partially automated by analyzing popular questions from Google and Bing and providing a first draft response using deep learning techniques. Here is the answer that I can generate for “Is SEO important in 2021?”

Data To Text in German

By entering a list of attributes, you can generate content in German. For example, in this case, I’m describing the Technical University of Berlin, and I’ve included a number of attributes that relate to it; this is the result.

DISCLAIMER: Access to the model has been recently opened to anyone via an easy-to-use API and now any SEO can find new creative ways to apply AI to a large number of useful content marketing tasks. Remember to grab your key from OpenAI.

AI Text Generation for SEO: learn how to train your model on a generic data-to-text generation task. 

Automating SEO Image Resolution

Images that accompany the content, whether news articles, recipes, or products, are a strategic element in SEO that is often overlooked.

Google needs large images in multiple formats (1:1, 4:3, and 16:9) to present content in carousels, tabs (rich results across multiple devices), and Google Discover. This is done using structured data and by following some essential recommendations:

  • Make sure you have at least one image for each piece of content.
  • Make sure the images can be crawled and indexed by Google (sounds obvious but it’s not).
  • Ensure the images represent the tagged content (you don’t want to submit a picture of roast pork for a vegetarian recipe).
  • Use a supported file format (here’s a list of Google Images supported file formats).
  • Provide multiple high-resolution images that have a minimum amount of pixels in total (when multiplying one size with the other) of:
    • 50,000 pixels for Products and Recipes
    • 80,000 pixels for News Articles
  • Add the same image in the structured data in the following proportions: 16×9, 4×3, and 1×1.
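These requirements are easy to encode as a pre-flight check before generating the markup. A minimal sketch (the helper names are ours; the thresholds come from the list above):

```python
# minimum total pixel area (width * height) per structured-data type
MIN_PIXELS = {"Product": 50_000, "Recipe": 50_000, "NewsArticle": 80_000}

def image_ok(width, height, schema_type):
    """True if the image meets Google's minimum pixel area
    for the given structured-data type."""
    return width * height >= MIN_PIXELS[schema_type]

def crop_box(width, height, ratio_w, ratio_h):
    """Largest centered crop of the given aspect ratio (e.g. 16:9),
    returned as a (left, top, right, bottom) box."""
    target = ratio_w / ratio_h
    if width / height > target:          # too wide: trim the sides
        new_w = int(height * target)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    new_h = int(width / target)          # too tall: trim top/bottom
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)

print(image_ok(1200, 675, "NewsArticle"))  # 810,000 px → True
print(crop_box(1200, 675, 1, 1))           # → (262, 0, 937, 675)
```

The same `crop_box` call with ratios 16:9 and 4:3 yields the other two image variants required by the markup.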

AI-powered Image Upscaler

With Super-Resolution for Images, you can enlarge and enhance images from your website using a state-of-the-art deep learning model.

WordLift automatically creates the required version of each image in the structured data markup in the proportions 16×9, 4×3, and 1×1. The only requirement is that the image’s smaller side measures at least 1,200 pixels.

Since this isn’t always possible, I came up with a workflow and show you how it works here.

Use this Colab

1. Setting up the environment

The code is fairly straightforward, so I will explain how to use it and a few important details. You can simply run steps 1 and 2 and start processing your images.

Prior to doing that, you might want to choose whether to compress the produced images and what level of compression to apply. Since we’re working with PNG and JPG formats, we use PIL’s optimize=True argument to decrease the weight of images after upscaling. This option is configurable because your website may already have an extension, a CDN, or a plugin that automatically compresses any uploaded image.

You can disable the compression (the default is set to True) or change the compression level using the form inside the first code block of the Colab (1. Preparing the Environment).
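In practice, the toggle boils down to which keyword arguments are passed to PIL’s Image.save(). A sketch, assuming the Colab’s form exposes a boolean flag and a JPEG quality level (the helper name is ours):

```python
def save_kwargs(fmt, compress=True, jpeg_quality=85):
    """Keyword arguments for PIL's Image.save(): when compression is
    on, pass optimize=True (plus a quality level for JPEGs)."""
    if not compress:
        return {}
    kwargs = {"optimize": True}
    if fmt.upper() in ("JPG", "JPEG"):
        kwargs["quality"] = jpeg_quality
    return kwargs

print(save_kwargs("png"))  # → {'optimize': True}
print(save_kwargs("jpg"))  # → {'optimize': True, 'quality': 85}

# Applied with Pillow (requires `pip install Pillow`):
# from PIL import Image
# img = Image.open("output/hero.jpg")
# img.save("output/hero.jpg", **save_kwargs("jpg"))
```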

2. Loading the files

You can upload the files that you would like to optimize by providing either:

  1. A folder on your local computer, or
  2. A list of comma-separated image URLs.

In both cases you can load multiple files, and the procedure keeps the original file names so that you can simply push them back to the web server via SFTP. When you provide a list of URLs, the script first downloads all the images into a previously created folder called input.

Once all images have been downloaded, you can run the run_super_res() function in the following block. The function first downloads the model from TF Hub and then starts increasing the resolution of all the images by 4×. The resulting images are stored (and compressed, if the compression option was left set to True) in a folder called output.
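The folder-processing part of such a workflow can be sketched independently of the model; here the TF Hub upscaler is abstracted as an upscale_fn that takes and returns raw image bytes (a simplification of the actual Colab, whose model and compression step are wired in directly):

```python
from pathlib import Path

def run_super_res(upscale_fn, input_dir="input", output_dir="output"):
    """Apply `upscale_fn` (e.g. a TF Hub super-resolution model wrapped
    to accept and return raw bytes) to every image in `input_dir`,
    keeping the original file names so the results can be pushed back
    to the web server via SFTP."""
    out = Path(output_dir)
    out.mkdir(exist_ok=True)
    processed = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
            continue  # skip non-image files
        (out / path.name).write_bytes(upscale_fn(path.read_bytes()))
        processed.append(path.name)
    return processed
```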

Once completed, you can zip all the produced files contained in the output folder by executing the following code block. You can also change the name of the zipped file and optionally remove the output folder (in case you want to run it again).

To see the results of our first test of this AI-powered Image Upscaler, read our article on using the Super-Resolution technique to enlarge and enhance images from your website and improve structured data markup with AI and machine learning.

Automating product description creation

AI writing technology has made great strides, especially in recent years, dramatically reducing the time it takes to create good content. But human supervision, refinements, and validations remain crucial to delivering relevant content. The human in the loop is necessary and corresponds to Google’s position on machine learning-generated content, as mentioned by Gary Illyes and reported in our web story on machine-generated content in SEO.

Right now our stance on machine-generated content is that if it’s without human supervision, then we don’t want it in search. If someone reviews it before putting it up for the public then it’s fine.

GPT-3 for e-commerce

GPT-3 stands for Generative Pre-trained Transformer 3. It is an autoregressive language model that uses deep learning to produce human-like text. OpenAI, an AI research and deployment company, unveiled the model with 175 billion parameters. It is the third-generation language prediction model in the GPT-n series and the successor to GPT-2, both created by the Microsoft-funded OpenAI.

You can use GPT-3 for many uses, including creating product descriptions for your e-commerce store.

How to create product descriptions with GPT-3

GPT-3 can predict which words are most likely to be used next, given an initial suggestion. This allows GPT-3 to produce good sentences and write human-like paragraphs. However, this is not an out-of-the-box solution to perfectly drafting product descriptions for an online store.

When it comes to customizing GPT-3’s output, fine-tuning is the way to go; there is no need to train a model from scratch. Fine-tuning allows you to customize GPT-3 to fit specific needs. You can read more about customizing GPT-3 (fine-tuning) and learn how customization improves accuracy over prompt design (few-shot learning).

Fine-tuning GPT-3 means providing relevant examples to the pre-trained model. These examples are the ideal descriptions that simultaneously describe the product, characterize the brand, and set the desired tone of voice. Only then can companies begin to see real value from AI-powered applications that generate product descriptions.
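For the original GPT-3 fine-tuning API, those examples were supplied as JSONL prompt/completion pairs. Here is a sketch of preparing one training record from product attributes; the field names, separator, and example copy are illustrative assumptions:

```python
import json

def to_finetune_record(product, ideal_description):
    """One JSONL training example: product attributes in, on-brand
    copy out (the prompt/completion format used by the original
    GPT-3 fine-tuning API)."""
    prompt = "; ".join(f"{k}: {v}" for k, v in product.items()) + "\n\n###\n\n"
    # the leading space in the completion follows the API's convention
    return json.dumps({"prompt": prompt, "completion": " " + ideal_description})

record = to_finetune_record(
    {"brand": "Ray-Ban", "model": "Wayfarer", "lens": "polarized"},
    "Iconic Wayfarer frames with polarized lenses for all-day clarity.",
)
print(record)
```

One such line per curated example, written to a .jsonl file, is what the fine-tuning job consumes.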

Examples of the power of GPT-3 for e-commerce product descriptions are in this article, where we show you two different cases: 

  • Using the pre-trained model of GPT-3 without fine-tuning it
  • Fine-tuning the pre-trained model with relevant data

GPT-3 for product description: learn how to train your model to produce human-like text with machine learning. 

How Does SEO Automation Work? 

Here is how you can proceed when approaching SEO automation. It is always about finding the right data, identifying the strategy, and running A/B tests to prove your hypothesis before going live on thousands of web pages. 

It is also essential to distinguish between:

  • Deterministic output – where we know what to expect, and
  • Stochastic output – where the machine might generate a different variation every time, so we need to keep a final human validation step.
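For the A/B testing step, a standard two-proportion z-test on click-through rates is enough to decide whether a variant’s lift is significant before rolling it out to thousands of pages (a generic statistical test, not a WordLift-specific tool):

```python
from math import sqrt, erf

def ab_significance(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test: returns (z, two-sided p-value) for the
    difference in CTR between control (a) and variant (b)."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# variant lifts CTR from 5.0% to 5.9% over 10,000 impressions each
z, p = ab_significance(clicks_a=500, views_a=10_000,
                       clicks_b=590, views_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With |z| above 1.96 (p < 0.05), the lift is unlikely to be noise and the change can go live at scale.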

I believe that the future of SEO automation and the contribution of machine/deep learning to digital marketing is now. SEOs have been automating their tasks for a while now, but SEO automation tools using AI are just starting to take off and significantly improve traffic and revenues.

Are you Interested in trying out WordLift Content Intelligence solutions to scale up your content production? Book a meeting with one of our experts or drop us a line.  


The image we used in this blog post is a series of fantastical objects generated by OpenAI’s new DALL·E model by combining two unrelated ideas (a clock and a mango).

Leverage Advanced Tools for Enhanced Content Creation

While exploring the vast landscape of SEO automation, it’s crucial to recognize the pivotal role of content in driving SEO success. High-quality, engaging content remains at the heart of SEO, attracting organic traffic and improving search rankings. To this end, embracing advanced content creation tools can significantly amplify your SEO strategy.

Our content creation tool offers a sophisticated solution, harnessing the power of AI to generate compelling content that resonates with your audience. By integrating this workflow into your SEO automation process, you can streamline content production, ensuring consistency and relevance across your digital presence.

Discover how our content generation can revolutionize your content strategy and help you achieve your business goals. Explore the possibilities and start creating captivating content that captivates your audience and drives results. Contact us!

The Future of SEO Automation

Based on the latest trends in SEO automation for 2024, as highlighted in our SEO Trends for 2024 article, it’s clear that the integration of artificial intelligence (AI) and the use of structured data are becoming increasingly important. The ability to leverage AI for in-context learning, instructions-based prompts, and other emerging behaviors of foundational models is crucial. This is where the concept of an AI SEO Agent becomes particularly relevant.

An AI SEO Agent represents a significant evolution in the field of SEO automation. It acts as a specialized AI assistant that can automate and optimize a wide range of SEO tasks with greater efficiency and accuracy. From conducting sophisticated keyword research and content optimization to executing strategic link-building efforts, the AI SEO Agent leverages advanced algorithms and machine learning techniques to enhance SEO strategies. This automation not only saves time but also ensures that SEO practices are aligned with the latest search engine algorithms and user search behaviors.

This is anchored to an extensible data fabric and brand authority, which allows for more sophisticated and effective SEO strategies. Additionally, the introduction of product knowledge graph panels by search engines necessitates that merchants continuously adapt their strategies to keep pace with these evolving trends.

A prime example of the importance of staying ahead in SEO through innovative strategies is seen in our case study with EssilorLuxottica. This collaboration showcases how the use of generative AI, structured data markups, and the strategic implementation of an AI SEO Agent can revolutionize SEO strategies, driving unprecedented innovation and significantly enhancing online visibility and revenue at scale.

More Questions On SEO Automation

What is SEO automation?

You can automate SEO tasks by using AI-powered tools and software that help in various aspects of SEO, including content creation, keyword research, link building, and performance analysis. These tools can significantly reduce the manual workload by providing insights, suggestions, and automating repetitive tasks.

Are there risks associated with SEO automation?

While SEO automation can streamline many tasks, it’s important to be aware of the risks. Over-reliance on automation without human oversight can lead to issues such as non-compliance with SEO best practices, creation of irrelevant content, or even penalties from search engines if the automation is perceived as manipulative. It’s crucial to balance automation with manual review and intervention.

Which SEO tasks can be automated, and which ones require manual intervention?

Tasks like data analysis, keyword research, and certain aspects of content optimization can be effectively automated. However, tasks requiring creative input, strategic decision-making, and nuanced understanding of content quality and relevance still require manual intervention. It’s essential to identify which tasks can be automated to save time without compromising the quality of your SEO strategy.

What are the best SEO automation tools?

The best SEO automation tools are those that offer a comprehensive suite of features to address various aspects of SEO, from keyword research and content optimization to link building and performance tracking. Tools that leverage AI and machine learning to provide actionable insights and automate repetitive tasks are particularly valuable.

Case Study: Innovating Through AI and SEO at EssilorLuxottica


Strategic integration of AI and e-commerce SEO is the key to success for companies aiming to drive change and innovation. Under the leadership of its SEO and editorial teams and working closely with WordLift’s team of experts, EssilorLuxottica has achieved remarkable progress in various aspects of its operations.

By examining its journey, your organization can discover valuable insights and opportunities to move forward. This strategic implementation goes beyond a single e-commerce platform, encompassing multiple channels. With a professional and results-oriented approach, your company can mirror EssilorLuxottica’s success, leveraging the power of AI and SEO to establish itself as a pioneer in your industry.

Let’s read this new amazing SEO case study 👓

EssilorLuxottica commands a global presence as a preeminent force in creating, manufacturing and disseminating ophthalmic lenses, frames, and sunglasses. With operations across 150 countries, the company is an open network enterprise, extending access to high-quality vision care products for industry stakeholders. Its esteemed portfolio features eyecare and eyewear brands, including Ray-Ban, Oakley, and Persol. Complementing these offerings are advanced lens technologies such as Transitions, Varilux, and Crizal, alongside essential prescription laboratory equipment and pioneering solutions, all designed to address the diverse needs of customers and consumers worldwide.

The Main Goal

In our collaboration, we have embarked with the EssilorLuxottica team on a journey of commitment to industry innovation and progress.

The goals set included:

  • Infusing innovation into workflows.
  • Aligning with the latest industry trends.
  • Providing ongoing professional support to the Group’s in-house SEO team. 

Special attention was paid to the safe integration of generative AI into critical SEO activities, such as structured data implementation, knowledge graph creation, and refinement of product descriptions. All of these initiatives were pursued to foster continuous innovation, drive scalable revenue growth, and optimize workflows for greater efficiency. This collaborative effort underscores the profound impact that the strategic integration of AI and SEO can have on business operations in the industry.

Challenges

In the collaborative journey of WordLift and EssilorLuxottica, the pursuit of enhancing SEO and refining content strategies has been a dynamic and rewarding process. Despite the successes, the path to optimization had its share of challenges. These challenges, though formidable, served as crucibles for innovation and adaptation, ultimately strengthening the collaborative bond between the two teams.

One initial challenge encountered was managing and processing a large volume of data, an essential aspect of improving SEO. The teams met this challenge by implementing efficient data management strategies, ensuring all the information was used to its full potential. As campaigns developed and stock availability fluctuated, the teams made agile adjustments to accommodate both.

The continuous evolution of search engine algorithms posed another challenge. Staying ahead of these changes required constant vigilance and adapting quickly. Teams demonstrated responsiveness to algorithmic updates, ensuring their SEO strategies aligned with the latest search engine trends. At the same time, changes in AI models and technologies emerged as crucial considerations, requiring a proactive approach to adaptation.

In the context of a large SEO team, the complexities of development and implementation further increased the challenges. Coordinating efforts and maintaining cohesion in such an environment require strategic planning and effective communication. The teams faced this challenge head-on, ensuring the collective experience was leveraged to its full potential.

In the face of these challenges, the collaborative effort between WordLift and EssilorLuxottica demonstrated resilience and highlighted the teams’ ability to turn obstacles into opportunities. This harmonious collaboration enabled the teams to overcome challenges and paved the way for creating a robust and adaptable SEO strategy. The resulting online presence and visibility improvement set the stage for a deeper exploration of the solutions implemented during this transformation journey.

Solutions & Results

Responding to the client’s request and considering the challenges explained above, we built a strategy of various solutions that addressed specific needs. The individual solutions adopted are part of an SEO strategy aimed at the innovation and growth of the brand and business. 

Create an Eyewear Ontology

Ontology in SEO refers to the structured representation of knowledge that aids search engines in understanding the context and meaning of search queries. This enables them to deliver more accurate and relevant search results. By incorporating an ontology into your SEO strategy, you enhance the visibility of your content. This means search engines can better comprehend and connect it with user queries, resulting in higher rankings and increased traffic to your website.

In the journey with WordLift, EssilorLuxottica achieved a significant milestone by introducing the inaugural release of an Eyewear Ontology. This meticulously designed framework serves as a comprehensive map, capturing eyewear products’ intricate nuances. Through the implementation of this ontology, EssilorLuxottica now possesses an exceptional capability to seamlessly ingest, describe, and query product attributes directly from their Product Information Management (PIM) system. Furthermore, the ontology empowers them to automate the generation of product descriptions with precision and efficiency.

By constructing prompts tailored to the specific demands of the task at hand, EssilorLuxottica has harnessed the potential to produce content that not only meets but exceeds expectations. This, coupled with the ability to validate generated content, ensures accuracy and relevance that distinguishes EssilorLuxottica in the digital landscape. The Eyewear Ontology is a testament to their commitment to innovation and an unwavering pursuit of excellence in every facet of their partnership with WordLift.

Build the Product Knowledge Graph

The Product Knowledge Graph (PKG) is an e-commerce-specific form of knowledge graph designed to improve product discoverability and end-user experience by enriching a brand’s content with data. 

Leveraging WordLift’s expertise in semantic search technologies and content optimization, EssilorLuxottica experienced a revolutionary transformation in how their product information was structured and presented online. A pivotal breakthrough came with implementing an advanced workflow integrating data from Google Merchant Center and the SFTP Feed. This innovative approach brought numerous advantages, most notably the rapid acceleration of data import into the knowledge graph.

EssilorLuxottica can import and validate about 1000 products per minute, catapulting efficiency to unprecedented levels. Furthermore, this workflow expedites the resolution of data inconsistencies between Structured Data and Merchant feeds, all while introducing automatic schema validation, ensuring a seamless and error-free user experience. This groundbreaking enhancement streamlines processes and significantly amplifies the brand’s visibility in the digital realm.

Through the implementation of a Product Knowledge Graph (PKG) workflow, we witnessed a significant boost in organic traffic across their various brands. Over the last three months, there was an impressive 16% increase in clicks per website compared to the previous year’s period. This demonstrates the tangible impact of our innovative approach to enhancing online visibility and engagement for EssilorLuxottica.

Experiencing The Power Of Dynamic Internal Links

Dynamic internal links are crucial in enhancing the SEO performance of product listing web pages (PLP). By automating the process of recommending links to users, we facilitate their navigation through similar category pages, ultimately reducing click-depth and improving user experience. Additionally, these internal links provide invaluable signals to search engines, aiding them in comprehending the organizational structure of categories. This, in turn, contributes to higher rankings in search results. 

Furthermore, the strategic use of anchor text optimization can further bolster the visibility of specific queries. From a business perspective, recommending categories empowers us to prioritize sales campaigns, promote key product categories, and appropriately manage those with lower business impact, such as out-of-stock items. 

Implementing dynamic internal links has proven to be a strategic move for EssilorLuxottica in enhancing user engagement and optimizing content focus. The initial experiment on a retailer website in the US yielded promising results, with a noticeable increase in clicks. Furthermore, the data analysis revealed a reduction in the number of queries per page on the variant, indicating a higher level of content precision and relevance. This not only enhances the user experience but also strengthens the overall SEO strategy. 

  • 30% increase in clicks (April–September, YoY) 
  • 20% increase in average position 

Encouraged by these positive outcomes, we are currently rolling out the same dynamic internal link experiment on other websites to replicate and build upon these advantageous results. This initiative aligns with EssilorLuxottica’s commitment to employing innovative solutions to continually refine and elevate the digital experience for their customers.

Generate FAQs by Using AI

We developed an innovative FAQ workflow designed with precision and efficiency in mind. It encompasses improved sourcing methods to prevent query cannibalization, refined content summarization through the utilization of fine-tuned models, and the unique ability to instruct the model using existing page content. Additionally, our process allows for seamless extraction of pertinent questions directly from the content, ensuring a comprehensive and accurate FAQ section. This tailored approach sets the stage for an efficient and user-friendly experience on EssilorLuxottica’s platform.
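The production workflow relies on fine-tuned models; the question-extraction step can be sketched with the LLM call abstracted behind a `complete` callable, so the prompt logic is visible. The prompt wording and function names below are ours, not WordLift’s actual templates.

```python
import json
from typing import Callable

def extract_faq(page_content: str, complete: Callable[[str], str], max_questions: int = 5) -> list[dict]:
    """Ask an LLM to extract questions the page itself already answers.

    `complete` is any prompt->text function (e.g. a wrapper around a chat-model
    API call); it is expected to return a JSON array of question/answer objects.
    """
    prompt = (
        f"Extract up to {max_questions} frequently asked questions that this page "
        "already answers. Use only information present in the text.\n"
        "Respond with a JSON array of objects with \"question\" and \"answer\" keys.\n\n"
        f"PAGE CONTENT:\n{page_content}"
    )
    pairs = json.loads(complete(prompt))
    # Keep only well-formed pairs, capped at max_questions.
    return [p for p in pairs if p.get("question") and p.get("answer")][:max_questions]
```

Because the model is instructed to use only information present in the page, the generated FAQ stays grounded in existing content, which is also what makes the resulting FAQPage markup safe to publish.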

Through the implementation of AI-powered technology, EssilorLuxottica has achieved remarkable advancements in generating FAQs for 5 e-commerce websites.

  • 22% increase in clicks on lifted PLPs with generated Questions & Answers + FAQPage markup (January–December 2023, year-over-year comparison)

By implementing a specialized reporting dashboard through our Looker Studio Connector, we’ve equipped EssilorLuxottica with a powerful tool to monitor and evaluate performance metrics across its wide range of offerings. This empowers them to make informed decisions and refine their strategies based on data-driven insights. The integration of AI-driven solutions has significantly elevated EssilorLuxottica’s efficiency, precision, and strategic acumen in their digital marketing endeavors.

Generate Content at Scale using LLM and KG

Employing a meticulous process of data preparation and model building, EssilorLuxottica has achieved a significant milestone in content generation.

Our approach to AI-generated content revolutionizes the content creation landscape, bridging the gap between human creativity and AI capabilities. We address the challenge of subpar automated content, ensuring a higher quality standard. 

With a Knowledge Graph-centric methodology, we eliminate the need for extensive external data, emphasizing sustainability and ethical AI use. Our validation rules further enhance precision—this fusion of AI and human oversight results in top-tier content, robust control, and eco-conscious practices.

By developing a specialized dashboard, we’ve created a powerful content generation tool that not only aids in the content generation process but also serves as a valuable resource for the EssilorLuxottica team, ensuring seamless integration and accessibility for future endeavors. This strategic advancement underscores our commitment to leveraging cutting-edge technology to enhance the brand’s digital presence and solidify EssilorLuxottica’s position as a leader in the eyewear industry. Using it, we are now able to produce over 1,000 completions per minute. Our tailored approach has not only ensured a consistent and high-quality output but has also provided a substantial boost to the overall content production capacity.

Analyzing the data from July 1st to September 1st in a Year-over-Year comparison, we observe a commendable stability with a noteworthy increase of +5.4% in clicks.

Opportunities

In its pursuit of digital excellence, EssilorLuxottica has harnessed the transformative power of data control. This key aspect has driven it to create tailored content and customized models for its audiences. This newfound mastery of data has laid the foundation for a dynamic and personalized digital experience.

A significant leap forward was made by implementing an automation system powered by artificial intelligence that synchronized multiple data points. This ensured the delivery of up-to-date information and helped create a streamlined and efficient process (PKG). The accuracy of data synchronization became a milestone, improving the overall accuracy of content delivery and reinforcing EssilorLuxottica’s commitment to excellence.

Throughout the journey, the significant increase in the learning curve became a testament to the collaborative efforts between WordLift and the EssilorLuxottica team. The challenges were not simply obstacles but opportunities for growth. This enhanced collaboration improved the teams’ collective knowledge and resulted in more effective strategies and streamlined operations.

Adopting AI technologies has emerged as a proactive measure, especially for navigating the dynamic digital landscape during critical times. The integration of AI facilitated adaptability and allowed EssilorLuxottica to remain at the forefront of an ever-changing technological environment. The proactive approach ensured that strategies remained current and forward-looking, aligning perfectly with the company’s vision for digital innovation.

In conclusion, EssilorLuxottica’s journey reflects a transformative adoption of data control, AI-based automation, and a commitment to continuous learning. These strategic initiatives have positioned the company as a digital leader and laid the foundation for sustained success in an ever-evolving digital ecosystem. As EssilorLuxottica continues to pioneer innovation, the synergy with WordLift remains a driving force, propelling it to new heights of digital excellence.

Note on the methodology used

The methodology that we used to calculate the achieved outcomes is causal inference.

It is a branch of statistics that helps us isolate the impact of the change by comparing the results achieved with the predicted results we would have had without the implementation by the team. 

The prediction uses a Bayesian structural time-series model that learns from the clicks we had before the intervention (occurring during the dedicated period) on the pages considered.

To perform the evaluation, we used CausalImpact, an open-sourced Google tool that created baseline values for the period after the event.
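As a rough illustration of what CausalImpact does under the hood, here is a toy counterfactual baseline in plain NumPy: ordinary least squares on synthetic click data, rather than the Bayesian structural time-series model the tool actually fits. The data and numbers are synthetic.

```python
import numpy as np

def estimated_lift(clicks: np.ndarray, intervention: int) -> float:
    """Fit a linear trend on pre-intervention clicks, project it forward,
    and return the relative lift of actual vs. predicted post-period clicks.

    This is a toy stand-in for CausalImpact's Bayesian structural time-series.
    """
    t_pre = np.arange(intervention)
    # Ordinary least squares on the pre-period: clicks ~ a + b * t
    b, a = np.polyfit(t_pre, clicks[:intervention], 1)
    t_post = np.arange(intervention, len(clicks))
    baseline = a + b * t_post          # predicted clicks without the intervention
    actual = clicks[intervention:]
    return float((actual.sum() - baseline.sum()) / baseline.sum())
```

CausalImpact additionally models seasonality and control series and reports credible intervals, which is why we rely on it rather than a simple trend projection for the published figures.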

“Our long lasting collaboration with Wordlift helped the e-commerce division of EssilorLuxottica moving ahead of the trends, supporting the SEO activities of the group to achieve more revenues at scale thanks to the innovative usage of markups and AI”.

Federico Rebeschini – Global Head of SEO and Performance at EssilorLuxottica

AI LAWS and SEO: Stay Ahead of the Curve to Ramp Up Your Search Game 

Are you a digital copywriter or SEO leader? Let’s dive into the EU AI Act and Biden’s AI order – the pioneering rulebooks in artificial intelligence’s wild world and the first serious attempts to regulate AI worldwide. These regulations aren’t just legal GPS but treasure maps riddled with compliance challenges and SEO opportunities. Picture them as the high seas of the internet, where businesses need to navigate the waters of AI-driven SEO with the finesse of a captain steering clear of copyright and privacy icebergs.

Decoding the Tech Talk Tango

Unraveling the EU AI Act is like deciphering a secret code for SEO pros riding the wave of Large Language Models (LLMs). It’s a game-changer, ushering us into a new era that demands a touch of SEO sophistication when dancing with LLMs. The implications are like upgrading from a tricycle to a turbocharged motorcycle – a whole new level of strategy needed for businesses setting up camp in the EU.

Staying Ahead with AI-Driven SEO

As the AI revolution unfolds, businesses are witnessing transformative impacts on SEO. Case studies reveal that AI-driven SEO can significantly enhance online visibility, providing valuable insights through advanced rank tracking and analysis. Moreover, the impact of generative AI on SEO cannot be ignored, as it aids in generating unique, high-quality content and positively influences search engine rankings – when done right.

In the competitive field of digital marketing, staying ahead of AI laws is not just about compliance; it’s a strategic move to ramp up your search game. Embrace the evolving landscape, integrate AI responsibly into your SEO strategies, and watch your online presence soar. 

Is it that simple?

Let’s dissect ongoing efforts to regulate AI for the first time in history and try to understand how this will impact your SEO strategy. The time is now.

The EU AI Act, spearheaded by the European Commission, aims to safeguard users and facilitate interaction within secure, unbiased, transparent, and environmentally conscious digital spaces. Defining what constitutes an AI system proves to be a challenging endeavor. It’s not straightforward to categorize whether a given setup qualifies as an AI system, and this complexity persists even if it lacks the intricacy or corporate infrastructure typically associated with major tech companies like Google.

Contrary to the misconception that AI necessitates a high level of complexity or a substantial corporate framework, the reality is more inclusive. Anyone, not just a Google engineer, can develop an AI system. Even a setup as basic as an Excel formula, enhanced with some AI modifications, can be considered an AI system. The boundaries in this regard are elusive and difficult to pinpoint.

The EU AI Act Compliance Checker tool available on the Artificial Intelligence Act website is a helpful resource in navigating the nuances of the EU AI Act. With just a few clicks, users can identify their provider type and assess the legal implications of relevant articles within the law. While the tool is not impeccably precise, our experience suggests that it serves as a valuable starting point in the AI SEO journey.

While we acknowledge the Act’s repeated emphasis on the potential benefits of user protection, we also see the concurrent risk of impeding innovation within the EU landscape. It prompts us to question whether the pursued measures strike the right balance between safeguarding interests and encouraging technological progress.

Considering both perspectives, let’s contemplate a scenario where AI remains unregulated. In such a case, the adoption and acceleration of AI could occur at an unprecedented pace. We might witness significant advancements in Artificial General Intelligence (AGI) capabilities within a few years. AGI systems could potentially contribute to finding cures for previously insurmountable diseases such as cancer and HIV. The race to be the first to achieve this breakthrough would confer a monumental advantage, potentially light-years ahead of others in cosmic terms. However, this prompts us to reflect on the associated costs. Should we permit unbridled production without considering the ethical implications? The examples of incidents like Cambridge Analytica and the security flaws in creating CustomGPTs underscore the need for a careful balance.

In an AI-first world, the voices of creators must not be overshadowed. Ensuring their protection becomes paramount. The question lingers: how do we guarantee that individuals navigating the frontiers of AI are shielded from exploitation and that ethical considerations are not sacrificed in the pursuit of progress?

One could see how everything can go south without protecting creators and encouraging people to join the AI era. We could witness money being concentrated in small groups of people instead of more equal distribution when anyone creative and helpful to society benefits from offering AI-enriched products to the end user.

Today, however, the debate is no longer whether AI should or shouldn’t be regulated. I still felt the urge to share my initial insights because navigating this legal space will be more complex for us than ever.

The need for legal layers: EU AI Act, Biden’s order, and internal legal processes

Whether you’re a small to medium-sized enterprise (SME) or a large corporation, you’ll probably have to collaborate closely with your legal team to establish the legal parameters for your company about serving end users. 

Stanford Research recently published an excellent paper titled “Do Foundation Model Providers Comply with the Draft EU AI Act?”. In this paper, they pinpoint various indicators for EU AI compliance derived from the draft version of the EU AI Act. Let’s take a closer look at their findings:

According to this graphic, data sources, data governance, copyrighted data, compute power, energy power, capabilities & limitations, evaluations, testing, machine-generated content, member states, and downstream documentation are the leading indicators for LLM compliance. In layman’s terms, you shouldn’t use a generative AI solution that is not EU AI Act-friendly or that does not respect these criteria. While it’s true that you should be playing with different models for different purposes, it’s always safe to rely on compliant LLM models like Bloom to ensure your brand reputation and positive legal implications for your company.

Moreover, it’s not just a matter of your internal models. In navigating the landscape of AI regulations, consider this journey an exploration where having a dependable digital partner is akin to a guiding light in the dark. This partner can assist you in avoiding the pitfalls made by other companies similar to yours, enabling you to leverage AI to your advantage.

Our advice to clients emphasizes the importance of transparency in AI and data processes within each company, especially for the data and AI teams. Different divisions and teams should be well-informed about ongoing AI and data initiatives, ensuring no redundancies and contributing to cost efficiency. In this regard, we advocate for senior management to take a top-down approach, leading the way for everyone on this journey. Collaborating with external digital agencies with the expertise to support senior leadership is a key component of this approach. 

Why? 

Based on our extensive experience collaborating with SEO industry leaders, they are certainly inclined towards experimentation. However, it’s also evident that they are heavily engrossed in their stakeholder and people management processes, leaving them with insufficient time to delve into the study and rapid implementation of AI. In a landscape where trends evolve rapidly, the crucial skills for prospective innovative leaders will revolve around a visionary mindset. It involves identifying and capitalizing on opportunities precisely when they arise, in the proper context, and for the right reasons. It’s important to take into account the complexity of this task.

The same holds for American SEO leaders and entrepreneurs. The White House’s recently unveiled executive order on October 30, 2023, introduces a comprehensive and far-reaching set of guidelines for artificial intelligence. This move by the U.S. government signals a concerted effort to tackle the inherent risks associated with AI.

From my perspective as a researcher specializing in information systems and responsible AI, the executive order marks a significant stride toward fostering responsible and reliable AI practices.

However, it’s crucial to recognize that this executive order is just the beginning. It highlights the need for further action on the unresolved matter of comprehensive data privacy legislation. The absence of such laws exposes individuals to heightened risks, as AI systems may inadvertently disclose sensitive or confidential information.

Last week, the U.S. produced the first official document to regulate AI that matches the EU AI Act. Let’s unpack this one, too, shall we? Here’s what Biden’s regulation from October 30th means for SEO practitioners and digital content creators:

  1. AI Safety and Security Boost: President Biden’s recent executive order on AI, issued on October 30, 2023, establishes groundbreaking standards for AI safety and security, aiming to shield Americans from potential risks.
  2. Privacy Protection in the AI Arena: The order emphasizes the need to protect Americans’ privacy and civil liberties in the face of advancing AI technologies, setting a clear stance against unlawful discrimination and abuse.
  3. Implementation Guidance for Responsible AI Innovation: The Office of Management and Budget (OMB) has released implementation guidance post-order, focusing on AI governance structures, transparency, and responsible innovation. This move is a strategic play to ensure AI’s benefits are harnessed responsibly and ethically.
  4. Deciphering the Legalese: The executive order has been dissected by experts, with insights suggesting a substantial impact on mitigating AI risks. It prompts a closer look at the promises and potential delivery of a safer AI landscape.
  5. AI Risks Mitigated for All: President Biden’s directive is a bold step to reduce AI risks for consumers, workers, and minority groups. It aims to ensure the benefits of AI are widespread and that no one is left behind in the digital revolution.

It’s abundantly evident that navigating the realm of AI innovation poses a genuine challenge, particularly for SEO practitioners like yourself. When we factor in the additional detail that most SEO professionals lack a formal background in both law and computer science (as per LinkedIn data and keyword filtering), it becomes apparent that these new regulations have injected a heightened level of complexity into SEO processes.

What penalties or consequences exist for non-compliance with AI-related SEO regulations?

If you ask me how to sell this to your upper management, we can see it from two key perspectives:

  1. The impact on finances,
  2. The impact on your company’s brand image.

Despite appearing unfair, financial challenges and negative cash flow serve as the primary driving forces for senior management and SEO leaders. Effectively communicating and quantifying adverse impacts, as demonstrated by the EU AI Act checker, provides a compelling reason for management to take notice. Unfortunately, people tend to act based on fear when anticipating negative consequences. While this reality may be disheartening, if you aspire to propel your AI project forward and ensure legal compliance, it’s crucial to grab the attention of your managers and present a stark financial projection.

The second challenge, brand image, is even more intricate than the first. Unlike financial issues, reputational problems are not easily rectifiable, and they carry financial implications of their own, similar to the finance impact case study you must prepare. Why is this important? Consider the scenario where people associate you and your company with AI-law violations. It can lead to a decline in motivated staff, a gradual loss of your customer base, and, ultimately, the declaration of bankruptcy for your business. Even if you establish a new company, your reputation as a senior leader in this tarnished brand journey will hinder your ability to conduct serious business and establish a socially responsible venture. The risks are simply too high.

WordLift is a trustworthy and equitable ally ready to lend support in this AI-centric era. Our tech stack is meticulously designed and structured in alignment with AI regulations. We continuously refine and adapt it based on insights from collaborating with diverse, innovative clients. We prioritize ethical AI principles in our work and embrace a creator-first mindset, a unique approach not commonly adopted by many agencies. Despite the additional complexity and overhead it introduces, it’s the right long-term strategy. As creators ourselves, we collaborate with other creators and firmly advocate for a creator-centric approach as our guiding manifesto.

In simpler terms, our commitment extends to developing responsible AI systems that prioritize fairness, user experience, and impartiality. We emphasize obtaining proper user consent and encourage clients to invest in maintaining high data quality standards. Our G-RAG (Graph Retrieval Augmented Generation) systems embody these values, seamlessly integrating principles into our workflows. We also assist clients in understanding and implementing practical knowledge transfer sessions to enhance their generative AI search capabilities.

I’d like to express our gratitude to those who have chosen us as their trusted digital partner in navigating the complexities of AI regulation and LLM. We’re excited about propelling your success through our internally developed tech stack and workflows.

More Frequently Asked Questions

How does AI regulation impact search engine optimization strategies?

Data Privacy Compliance:

AI regulations enforce strict guidelines on data privacy and protection. SEO strategies handling user data must meet these regulations, necessitating enhanced security measures, explicit user consent, and transparent data usage practices.

Algorithm Transparency:

Some AI regulations stress algorithm transparency. Search engines utilize complex AI algorithms for rankings. SEO professionals should align their strategies with regulations promoting transparency, mainly when dealing with user data.

Bias and Fairness:

AI regulations address bias and fairness concerns in algorithms. SEO strategies should minimize bias in search results, ensuring fair representation. This involves regular review and adjustment of keyword targeting, content creation, and other SEO elements to prevent unintentional biases.

User Rights and Consent:

Regulations grant users rights over their data and stress obtaining informed consent. SEO strategies must respect these rights, aligning website practices with regulations to give users control over their data and understand its use.

Ethical AI Practices:

AI regulations advocate ethical AI practices. SEO strategies involving AI, like chatbots or automated content generation, must adhere to ethical guidelines, avoid deceptive practices, provide accurate information, and ensure a positive user experience.

Legal Compliance and Penalties:

Non-compliance with AI regulations may lead to legal consequences and penalties. SEO professionals must stay informed about relevant regulations and adjust strategies to avoid legal issues.

Monitoring and Adaptation:

As AI regulations evolve, SEO strategies must be flexible and adaptive. Regular monitoring of regulatory changes is vital for ongoing compliance. This may involve adjusting keyword strategies, content creation, and data handling practices to align with the latest regulatory requirements.

Are there specific compliance requirements for AI-powered SEO tools?

The EU AI Act and Biden’s Executive Order from October 30, 2023, aim to enforce stringent AI safety, security, and privacy standards. While specific details on AI-powered SEO tools are not explicitly outlined, compliance is likely required in areas such as data privacy, transparency in algorithms, and avoidance of bias to align with the broader AI regulations. SEO professionals should consider implementing enhanced security measures, obtaining explicit user consent, ensuring transparency in algorithmic processes, and minimizing bias in search results to meet potential compliance requirements.

What are the ethical considerations in using AI for SEO?

Transparency and Accountability: Ethical AI use in SEO requires transparency, disclosure, and accountability.

Bias and Discrimination: AI in SEO must address bias, discrimination, and privacy issues.

Authenticity of Content: The authenticity of AI-generated content is a primary concern, as it can lack a human touch and pose challenges to genuine expression.

Minimizing Biases: Algorithms should be trained on diverse and unbiased data to mitigate biases in AI-generated content.

Fairness: Ethical AI use demands that systems do not discriminate against specific groups based on traits such as race, gender, age, or financial status.

Scalenut

Scalenut is an AI-powered co-pilot designed to manage the entire SEO lifecycle. Aimed at making the SEO process more streamlined and effective, Scalenut offers a range of functionalities that assist in various aspects of SEO strategy and implementation.

With Scalenut, users can effortlessly conduct keyword research and competitor analysis, giving them an edge in understanding market dynamics. This empowers you to make data-driven decisions, effectively targeting keywords that are not just high in search volume but also relevant to your specific audience. Scalenut also provides valuable insights into content gaps, suggesting areas where you can create impactful content to capture more organic traffic.

One of the standout features of Scalenut is its content creation and optimization engine. By leveraging advanced AI algorithms, it aids in the development of content that not only ranks well but also resonates with your target audience.

It takes into account critical factors such as content length, keyword density, and readability, ensuring that you produce content that is both SEO-friendly and user-centric.

For those looking for an all-in-one solution to tackle the complexities of SEO, from planning to execution, Scalenut offers a robust and versatile platform that adapts to your specific needs.

Fine-Tuning GPT-3.5: Unlocking SEO Potential with Structured Data

Introduction

The GPT-3.5 Turbo model by OpenAI has been a game-changer as it democratized access to large language models on an unprecedented scale. At WordLift, we’ve been intensively investing in fine-tuning these language models to meet the unique needs of our enterprise clients. Our focus is on enhancing the quality of the generated content and ensuring that it aligns seamlessly with an organization’s core values, tone of voice, and content guidelines.

In this blog post, we’ll delve into the fine-tuning process of GPT-3.5 Turbo and explore how surprisingly simple it can be, particularly when integrated with WordLift’s knowledge graph or any existing structured data.

Jump directly to the Colab 🪄

https://wor.ai/fine-tuning-demo

The Imperative of Fine-Tuning for SEO

While the base GPT-3.5 Turbo model is incredibly versatile, it often serves as a generalist rather than a specialist, particularly in the nuanced field of SEO. Fine-tuning solves this challenge by allowing us to train the model on specific data sets. This enhances its ability to generate optimized content and ensures that it resonates with your organization’s unique style, tone, and guidelines, much like an in-house writer would.

The Fine-Tuning Process

Fine-tuning GPT-3.5 Turbo is straightforward, and I added a few parameters to make the process customizable. You can jump directly to the Colab Notebook; here are the steps:

  1. Data Preparation with WordLift:
    • GraphQL Query: We start by using a GraphQL query to extract content from our blog. Since this content has already been marked up with Schema, the query will return a curated selection of articles.
    • Segmentation and Chunking: Next, we segment these articles by looking at the headings and apply a chunking method, as described in the code, to prepare the data for training. You have the option in the code of adding multiple sentences (4 is the default value) within each chunk, right after each heading.
  2. Data Validation and Token Estimation:
    • Before proceeding, we use a function provided by OpenAI to validate the prepared data and estimate the total number of tokens in our training dataset.
  3. API Calls for Fine-Tuning:
    • With the validated and token-estimated dataset, we then use OpenAI’s API to fine-tune the model.
  4. Quality Testing:
    • For the initial evaluation, we take a comparative approach. We generate content using a given prompt for both the fine-tuned model and the standard GPT-3.5 Turbo model. By analyzing the output from both, we can assess how well the fine-tuned model aligns with our SEO and content quality standards, as well as how it maintains the unique style and guidelines of your organization. We will also use the fine-tuned model in the context of a Retrieval Augmented Generation (RAG) that uses WordLift LangChain / LlamaIndex connector.
A simple Python script provided by OpenAI lets you find potential errors, review token counts, and estimate the cost of a fine-tuning job.
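The Colab contains the full implementation; the data-preparation step can be sketched as follows. Each (heading, chunk) pair from the GraphQL export becomes one chat-format training example in the JSONL file the fine-tuning endpoint expects. The system prompt here is an illustrative assumption, and the token count is a rough character-based estimate rather than the tiktoken-based count used for validation.

```python
import json

SYSTEM_PROMPT = "You are an SEO copywriter for our blog. Match our tone of voice."  # example prompt

def to_training_example(heading: str, chunk: str) -> dict:
    """One chat-format fine-tuning sample: heading in, section text out."""
    return {"messages": [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Write the section titled: {heading}"},
        {"role": "assistant", "content": chunk},
    ]}

def write_jsonl(pairs: list[tuple[str, str]], path: str) -> int:
    """Write the dataset and return a rough token estimate (~4 chars per token)."""
    total_chars = 0
    with open(path, "w", encoding="utf-8") as f:
        for heading, chunk in pairs:
            example = to_training_example(heading, chunk)
            f.write(json.dumps(example, ensure_ascii=False) + "\n")
            total_chars += sum(len(m["content"]) for m in example["messages"])
    return total_chars // 4
```

The resulting file is then uploaded to OpenAI with the `fine-tune` purpose and referenced when creating the fine-tuning job, as described in step 3 above.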

SEO-Centric Use Cases and WordLift’s Content Generation Tool

  • Content Generation: Produce SEO-optimized product descriptions, introductory text for category pages, and many programmatic SEO tasks where you can use structured data in your prompts.
  • Keyword Analysis: Generate keyword-rich content that aligns with search engine algorithms and your written content.
  • Dynamic Link Building: Automatically create and manage internal links for SEO optimization, as our blog post explains.

Beyond these use cases, the fine-tuned model can be seamlessly integrated into a Retrieval-Augmented Generation (RAG) system that I have built using the WordLift connector for LlamaIndex. This allows for even more advanced SEO-centric applications, such as contextual query answering, content generation and semantic search optimization.

Testing the new fine-tuned model with RAG and LlamaIndex

Here are a few examples demonstrating how the newly-created model operates within an Agent. This Agent is built using LlamaIndex and Chainlit, a Python framework designed for constructing conversational user interfaces, and it’s integrated with the knowledge graph of this blog.

This is a comparison of the generation between the fine-tuned model and the standard model (GPT-3.5 Turbo). 

This is done inside a RAG that uses the content of the WordLift blog.

We can add an option in Chainlit to choose different models for our RAG Agent; this is a quick way to validate the results of a fine-tuned model with different queries.
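Wiring the model selector into the UI is Chainlit-specific, but the underlying comparison loop is framework-independent and can be sketched on its own. The model callables below are stand-ins for real RAG query engines built on the fine-tuned and standard models.

```python
from typing import Callable

def compare_models(queries: list[str], models: dict[str, Callable[[str], str]]) -> dict[str, dict[str, str]]:
    """Run the same queries through each model and collect answers side by side.

    `models` maps a label (e.g. "gpt-3.5-turbo" or a fine-tuned model id)
    to any query->answer callable, such as a RAG query engine.
    """
    return {q: {name: generate(q) for name, generate in models.items()} for q in queries}
```

Reviewing the paired answers side by side is a quick manual check that the fine-tuned model keeps the blog’s tone while staying factually grounded in the retrieved context.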

WordLift’s new Content Generation tool

One of the most exciting applications of our fine-tuned models is their integration with WordLift’s new content generation tool. This tool leverages the capabilities of the fine-tuned model to produce high-quality, SEO-optimized content that aligns perfectly with your organization’s unique voice and guidelines. For more information on how to make the most of this innovative tool, check out the WordLift Content Generation Documentation. We’ll dive deeper in an upcoming blog post!


Code update 🔥: taking the fine-tuning one step further

I have added a new section in the Colab where you can explore a different approach to fine-tuning. In this case, too, we start from content marked up as a schema Article, but this time we use LlamaIndex, a robust framework for interacting with large language models, and the WordLift Reader (a connector for LlamaIndex) to:

  1. Extract documents from the Knowledge Graph (KG).
  2. Create a dataset of questions using GPT-3.5 based on the articles from the blog.
  3. Use GPT-4 to answer the generated questions using an index of all the pieces.

Here, we can see the process of generating synthetic questions that will be answered to create a new fine-tuning file.
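With the LLM calls abstracted away, the two-model loop described above – a cheaper model drafting questions per article, a stronger one answering them over the full index – can be sketched like this. The function names and callables are ours, for illustration only.

```python
from typing import Callable

def build_qa_dataset(
    documents: list[str],
    draft_questions: Callable[[str], list[str]],  # e.g. GPT-3.5 prompted over one article
    answer: Callable[[str], str],                 # e.g. GPT-4 querying the full index
) -> list[dict]:
    """Create (question, answer) fine-tuning pairs from knowledge-graph documents."""
    dataset = []
    for doc in documents:
        for question in draft_questions(doc):
            dataset.append({"question": question, "answer": answer(question)})
    return dataset
```

Each resulting pair is then converted into the chat-format JSONL used for fine-tuning, so the cheaper model's questions effectively distill the stronger model's grounded answers.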

Incorporating LlamaIndex into the fine-tuning process adds another layer of sophistication, enabling us to create a more accurate content generation system.

Once again, multiple strategies can be combined to find the best mix of samples for a given website.

Conclusion

Fine-tuning GPT-3.5 Turbo, particularly when integrated with WordLift’s knowledge graph or any other structured data, opens a new era in SEO optimization and content creation. Structured data markup not only serves as an invaluable resource for preparing the training dataset but also aids in quality validation, minimizing the model’s potential for generating inaccurate or “hallucinated” content.

A few key findings emerged from this initial experiment:

  1. Minimal Training Examples Needed: Unlike with previous GPT models, we found that only a few training examples are needed to get started. The model’s robustness can be scaled up as we progress; I observed noticeable differences even with just 50-100 samples.
  2. Strategic System Prompts: Given that we’re dealing with a chat model, the role of the system prompt becomes strategic, especially during the fine-tuning process. I’ve added a feature in the Colab notebook that allows you to configure the system prompt according to your specific use-case.
  3. Enhanced Nuances in RAG Systems: When the fine-tuned model is used within a Retrieval-Augmented Generation (RAG) system, it receives additional context. This makes the model more adept at detecting subtle nuances in the content.
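To illustrate the second finding, this is roughly what one training example looks like in OpenAI’s chat fine-tuning JSONL format, with the system prompt kept configurable as in the notebook; the example strings are placeholders:

```python
import json

def training_example(system_prompt: str, user: str, assistant: str) -> str:
    """Serialize one chat-format training example as a JSONL line."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    })

# The system prompt is where you encode the use-case for the fine-tune.
system_prompt = "You write in the WordLift blog's tone of voice."
line = training_example(system_prompt, "What is SEO automation?", "SEO automation is ...")
record = json.loads(line)
```

One such line per sample, written to a `.jsonl` file, is all the fine-tuning endpoint expects.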

Looking ahead, OpenAI has announced that fine-tuning capabilities for GPT-4 will be available later this year, offering even more opportunities for SEO-focused customization. The future is indeed promising!

If you’re interested in learning more about how to leverage structured data for content generation, we invite you to contact us.

Stay tuned for more exciting updates on Generative AI for SEO!

It’s worth mentioning that the allure of creating your customized ChatGPT model does come with a financial consideration. While fine-tuning itself may not break the bank (as costs are relatively limited), it’s important to be aware that the token costs for inferencing on a fine-tuned model are eight times higher than those for the standard model!
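As a back-of-the-envelope check, assuming the prices in effect when fine-tuning launched (base gpt-3.5-turbo at $0.0015/$0.002 per 1K input/output tokens versus $0.012/$0.016 for a fine-tuned model; always verify against OpenAI’s current pricing page), the 8x ratio falls out directly:

```python
def inference_cost(requests: int, in_tokens: int, out_tokens: int,
                   price_in: float, price_out: float) -> float:
    """Total cost in USD; prices are per 1K tokens."""
    return requests * (in_tokens / 1000 * price_in + out_tokens / 1000 * price_out)

# 10,000 requests of ~800 input and ~300 output tokens each (illustrative volumes)
base  = inference_cost(10_000, 800, 300, 0.0015, 0.002)   # base gpt-3.5-turbo
tuned = inference_cost(10_000, 800, 300, 0.012, 0.016)    # fine-tuned model
ratio = tuned / base
```

Since both the input and output prices scale by the same factor, the ratio holds regardless of the token mix.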


Elevating Content Relevance: A Free Search Intent Optimization Tool

In the Search Engine Optimization (SEO) world, achieving relevance is a crucial goal driving strategic initiatives and tactical implementation.

A few weeks ago, Paul Thomas and a group of researchers from Microsoft caught Dawn Anderson’s attention, and subsequently mine, by publishing a revolutionary paper titled “Large language models can accurately predict searcher preferences” on how to use large language models (LLMs) to generate high-quality relevance labels to improve the alignment of search queries and content.

Both Google and Bing have heavily invested in relevance labeling to shape the perceived quality of search results. In doing so, over the years, they faced a dilemma – ensuring scalability in acquiring labels while guaranteeing these labels’ accuracy. Relevance labeling is a complex challenge for anyone developing a modern search engine, and the idea that part of this work can be fully automated using synthetic data (information artificially created) is simply transformative.

Before diving into the specifics of the research, let me introduce a new free tool to evaluate the match between a query and the content of a web page that takes advantage of Bing’s team insights.

I reverse-engineered the setup presented in the paper, as indicated by Victor Pan in this Twitter Thread.

How To Use The Search Intent Optimization Tool

  1. Add the URL of the webpage you wish to analyze.
  2. Provide the query the page aims to rank for.
  3. Enter the search intent; this is the narrative behind the information the user needs.

We provide a simple traffic light system to show how well your content matches the search intent. 

(M) Measures how well the content matches the intent of the query.

(T) Indicates how trustworthy the web page is.

(O) Considering the aspects above and the relative importance of each, gives the overall score as follows:

2 = highly relevant, very helpful for this query

1 = relevant, may be partly helpful but might contain other irrelevant content

0 = not relevant, should never be shown for this query
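For readers curious how such a labeler can be wired up, here is a sketch in the spirit of the paper’s approach: the model is asked for the three M/T/O scores in a fixed format, which is then parsed. The prompt wording and output format are assumptions, not the exact configuration behind the tool:

```python
import re

LABEL_PROMPT = """Given a query, its intent, and a web page, score the page.
Query: {query}
Intent: {intent}
Page content: {page}

Answer with three scores from 0 to 2:
M: how well the content matches the intent of the query
T: how trustworthy the page is
O: the overall relevance (2 = highly relevant, 1 = relevant, 0 = not relevant)
Format your answer exactly as: M=<score> T=<score> O=<score>"""

def parse_scores(completion: str) -> dict[str, int]:
    """Extract the M/T/O scores from the model's completion."""
    return {k: int(v) for k, v in re.findall(r"\b([MTO])=([0-2])\b", completion)}

# In production the completion would come from GPT-4; this is a canned example.
scores = parse_scores("M=2 T=2 O=2")
```

The parsed scores can then be mapped onto the traffic-light display shown by the tool.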

Let’s Run A Quick Validation Test

While we are still working on conducting a more extensive validation test, here is how the experiment is set up:

  • We’re looking at the top-ranking and lowest-ranking queries (along with their search intent) behind blog posts on our website;
  • We’re evaluating how the tool scores these two classes of queries;
  • We manually labeled the match between content and query (ground truth), and we are analyzing the gap between the human labels and the synthetic data.
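Once both label sets are collected, the gap can be quantified with a chance-corrected agreement measure such as Cohen’s kappa; this small sketch uses toy labels on the 0-2 relevance scale (illustrative values, not our real data):

```python
from collections import Counter

def cohen_kappa(human: list[int], synthetic: list[int]) -> float:
    """Agreement between two labelers, corrected for chance agreement."""
    n = len(human)
    observed = sum(h == s for h, s in zip(human, synthetic)) / n
    h_counts, s_counts = Counter(human), Counter(synthetic)
    expected = sum(h_counts[k] * s_counts[k] for k in set(human) | set(synthetic)) / n**2
    return (observed - expected) / (1 - expected)

# Toy human vs. LLM labels on the 0-2 relevance scale (illustrative only)
human = [2, 2, 1, 0, 0, 2, 1, 0]
llm   = [2, 2, 1, 0, 1, 2, 1, 0]
kappa = cohen_kappa(human, llm)
```

A kappa close to 1 would indicate that the synthetic labels track the human ground truth closely.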

The test page (a blog post on how to get a knowledge panel), while trustworthy, is obviously a good match for the query “how to get a knowledge panel”, and it doesn’t match the query “making carbonara” at all (ok, this one was easy).

Here is one more example. In the blog post on AI plagiarism, the tool finds the content relevant for the query “ai plagiarism checker” but only partially relevant for the query “turing test”.

Current Limitations

While this tool is free, its continued availability is not guaranteed. It operates using the WordLift Inspector API, which currently does not support JavaScript rendering. Therefore, the tool will not function if you’re analyzing a webpage rendered client-side using JavaScript. I meticulously replicated the same configuration described in the paper (GPT-4 on Azure OpenAI) but the system is currently running on a single instance and you have to be patient while waiting for the final result.

What We Learned From Microsoft’s Research

Relevance labels, crucial for assessing search systems, are traditionally sourced from third-party labelers. However, this can result in subpar quality when labelers fail to grasp user needs. The paper suggests that employing large language models (LLMs) enriched with direct user feedback can generate superior relevance labels. Trials on TREC-Robust data revealed that LLM-derived labels rival or surpass human accuracy.

When implemented at Bing, LLM labels outperformed trained human labelers, offering cost savings and expedited iterations. Moreover, integrating LLM labels into Bing’s ranking system boosted its relevance significantly. While LLM labeling presents challenges like bias, overfitting, and environmental concerns, it underscores the potential of LLMs in delivering high-quality relevance labeling.

This is incredibly valuable for SEOs when evaluating how the content on a web page matches a target search intent.

Google’s Quality Raters

Google utilizes a global team of approximately 16,000 Quality Raters to assess and enhance the quality of its search results, ensuring they align with user queries and provide value. This Quality Raters program, operational since at least 2005, employs individuals via short-term contracts to evaluate Google’s Search Engine Results Pages (SERPs) based on specific guidelines, focusing mainly on the quality and relevance of displayed results.

Google Quality Raters follow a meticulous process defined by Google’s guidelines to evaluate webpage quality and the alignment of page content with user queries. They evaluate the page’s ability to achieve its purpose using E-E-A-T parameters (Experience, Expertise, Authoritativeness, and Trustworthiness). They also ensure that the content effectively satisfies user needs and search intent.

Although Quality Raters do not directly influence Google’s rankings, their evaluations indirectly impact Google’s search algorithms. Their assessments, particularly regarding whether webpages meet specified quality and relevance criteria, guide algorithm adjustments to enhance user experience and satisfaction. This human analysis is crucial for identifying and mitigating issues, such as disinformation, that might slip through algorithmic filters, ensuring that SERPs uphold high standards of quality and relevance.

Moreover, the Quality Raters’ feedback, especially on the usefulness or non-usefulness of search results, also aids in training Google’s machine learning algorithms, enhancing the search engine’s ability to deliver increasingly relevant and high-quality results over time. This is pivotal for YMYL (Your Money or Your Life) topics, which require elevated scrutiny due to their potential impact on users’ health, finances, or safety. The feedback and evaluations from the Quality Raters, therefore, serve as a valuable resource for Google in its continual quest to refine and optimize its search algorithms and maintain the efficacy of its search results.

To learn more about Google’s quality raters, read Cyrus Shepard, who has recently written about his experience as a quality rater for Google. Cyrus’s article is super interesting and informative, as always!

Conclusions And Future Work

We aim to continue enhancing our content creation tool by merging knowledge graphs with large language models. Research like the paper presented in this article can significantly improve the process of output validation. In the coming weeks, we plan to extend the validation tests and compare rankings from Google Search Console with results from the Search Intent Optimization Tool to assess its value in the realm of SEO across multiple verticals.

If you’re interested in producing engaging and informative content on a large scale or reviewing your SEO strategy, drop us an email!