By Andrea Volpini

4 months ago

ChatGPT 3.5 has revolutionized access to advanced language models like never before. Discover the ease of fine-tuning it for SEO optimization, especially when paired with WordLift’s knowledge graph or your own structured data.


The GPT-3.5 Turbo model by OpenAI has been a game-changer as it democratized access to large language models on an unprecedented scale. At WordLift, we’ve been intensively investing in fine-tuning these language models to meet the unique needs of our enterprise clients. Our focus is on enhancing the quality of the generated content and ensuring that it aligns seamlessly with an organization’s core values, tone of voice, and content guidelines.

In this blog post, we’ll delve into the fine-tuning process of GPT-3.5 Turbo and explore how surprisingly simple it can be, particularly when integrated with WordLift’s knowledge graph or any existing structured data.

Jump directly to the Colab 🪄

The Imperative of Fine-Tuning for SEO

While the base GPT-3.5 Turbo model is incredibly versatile, it often serves as a generalist rather than a specialist, particularly in the nuanced field of SEO. Fine-tuning The Fine-Tuning Process. Fine-tuning solves this challenge by allowing us to train the model on specific data sets. This enhances its ability to generate optimized content and ensures that it resonates with your organisation’s unique style, tone, and guidelines, much like an in-house writer would.

The Fine-Tuning Process

Fine-tuning GPT-3.5 Turbo is straightforward and I added a few parameters to make the process customizable. You can directly jump to the Colab Notebook; here are the steps:

  1. Data Preparation with WordLift:
    • GraphQL Query: We start by using a GraphQL query to extract content from our blog. Since this content has already been marked up with Schema, the query will return a curated selection of articles.
    • Segmentation and Chunking: Next, we segment these articles by looking at the headings and apply a chunking method, as described in the code, to prepare the data for training. You have the option in the code of adding multiple sentences (4 is the default value) within each chunk, right after each heading.
  2. Data Validation and Token Estimation:
    • Before proceeding, we use a function provided by OpenAI to validate the prepared data and estimate the total number of tokens in our training dataset.
  3. API Calls for Fine-Tuning:
    • With the validated and token-estimated dataset, we then use OpenAI’s API to fine-tune the model.
  4. Quality Testing:
    • For the initial evaluation, we take a comparative approach. We generate content using a given prompt for both the fine-tuned model and the standard GPT-3.5 Turbo model. By analyzing the output from both, we can assess how well the fine-tuned model aligns with our SEO and content quality standards, as well as how it maintains the unique style and guidelines of your organization. We will also use the fine-tuned model in the context of a Retrieval Augmented Generation (RAG) that uses WordLift LangChain / LlamaIndex connector.
A simple Python script provided by OpenAI which you can use to find potential errors, review token counts, and estimate the cost of a fine-tuning job.

A Python script provided by OpenAI to find potential errors, review token counts, and estimate the cost of a fine-tuning job.

SEO-Centric Use Cases and WordLift’s Content Generation Tool

  • Content Generation: Produce SEO-optimized product descriptions, introductory text for category pages and many SEO programmatic tasks where you can use structured data in your prompts.
  • Keyword Analysis: Generate keyword-rich content that aligns with search engine algorithms and your written content.
  • Dynamic Link Building: Automatically create and manage internal links for SEO optimization, as our blog post explains.

Beyond these use cases, the fine-tuned model can be seamlessly integrated into a Retrieval-Augmented Generation (RAG) system that I have built using the WordLift connector for LlamaIndex. This allows for even more advanced SEO-centric applications, such as contextual query answering, content generation and semantic search optimization.

Testing the new fine-tuned model with RAG and LlamaIndex

Here are a few examples demonstrating how the newly-created model operates within an Agent. This Agent is built using LlamaIndex and Chainlit, a Python framework designed for constructing conversational user interfaces, and it’s integrated with the knowledge graph of this blog.

This is a comparison of the generation between the the fine-tuned model and the standard model (ChatGPT 3-5 Turbo). 

This is done inside a RAG that uses the content of the WordLift blog.

We can add in Chainlit the option to choose different models for our RAG Agent, this is a quick way to validate the results of a fine-tuned model with different queries.

WordLift new Content Generation tool

One of the most exciting applications of our fine-tuned models is their integration with WordLift’s new content generation. This tool leverages the capabilities of the fine-tuned model to produce high-quality, SEO-optimized content that aligns perfectly with your organization’s unique voice and guidelines. For more information on how to make the most of this innovative tool, check out the WordLift Content Generation Documentation. We’ll dive deeper into a next blog post!

Code update 🔥: taking the fine-tuning one step further

I have added a new section in the Colab where you can explore a different approach to fine-tuning. We start, also in this case, from content marked up as a schema Article, but, this time, we use Llama Index, a robust framework for interacting with large language models, and the WordLift Reader (a connector for Llama Index) to:

  1. Extracting documents from the Knowledge Graph (KG).
  2.  Creating a dataset of questions using ChatGPT 3.5 based on the articles from the blog.
  3.  Utilizing GPT-4 to answer the generated questions using an index of all the pieces.

Here, we can see the process of generating synthetic questions that will be answered to create a new fine-tuning file.

Incorporating Llama-Index into the fine-tuning process adds another layer of sophistication, enabling us to create a more accurate content generation system.

Once again, multiple strategies can be combined to find the best mix of samples for a given website.


Fine-tuning GPT-3.5 Turbo, particularly when integrated with WordLift‘s knowledge graph or any other structured data, opens a new era in SEO optimization and content creation. Structured data markup not only serves as an invaluable resource for preparing the training dataset but also aids in quality validation, minimizing the model’s potential for generating inaccurate or “hallucinated” content.

A few key findings emerged from this initial experiment:

  1. Minimal Training Examples Needed: Unlike with previous GPT models, we found that only a few training examples are needed to get started. The model’s robustness can be scaled up as we progress; I observed noticeable differences even with just 50-100 samples.
  2. Strategic System Prompts: Given that we’re dealing with a chat model, the role of the system prompt becomes strategic, especially during the fine-tuning process. I’ve added a feature in the Colab notebook that allows you to configure the system prompt according to your specific use-case.
  3. Enhanced Nuances in RAG Systems: When the fine-tuned model is used within a Retrieval-Augmented Generation (RAG) system, it receives additional context. This makes the model more adept at detecting subtle nuances in the content.

Looking ahead, OpenAI has announced that fine-tuning capabilities for GPT-4 will be available later this year, offering even more opportunities for SEO-focused customization. The future is indeed promising!

If you’re interested in learning more about how to leverage structured data for content generation, we invite you to contact us.

Stay tuned for more exciting updates on Generative AI for SEO!

It’s worth mentioning that the allure of creating your customized ChatGPT model does come with a financial consideration. While fine-tuning itself may not break the bank (as costs are relatively limited), it’s important to be aware that the token costs for inferencing on a fine-tuned model are eight times higher than those for the standard model!


Here is a list of a few relevant articles:

Must Read Content

The Power of Product Knowledge Graph for E-commerce
Dive deep into the power of data for e-commerce

Why Do We Need Knowledge Graphs?
Learn what a knowledge graph brings to SEO with Teodora Petkova

Generative AI for SEO: An Overview
Use videos to increase traffic to your websites

SEO Automation in 2024
Improve the SEO of your website through Artificial Intelligence

Touch your SEO: Introducing Physical SEO
Connect a physical product to the ecosystem of data on the web

Are you ready for the next SEO?
Try WordLift today!