Back to Blog

Structured Data Is Not Enough: Why AI Search Needs a Memory Layer

Our latest research reveals Schema.org alone isn't enough for generative engine optimization. Learn how structured entity hubs improve AI accuracy by up to 29.8%.

By Andrea Volpini & Chiara Carrozza

March 11, 2026

—

12 min read

entities in this article

Generative AI is completely rewriting the rules of online visibility.

We’re moving past the days of simply ranking links; today’s AI engines don’t just find information: they read it, reason over it, and build direct answers for the user.

For SEOs, growth marketers, and innovation teams, this shift raises a massive, and sometimes uncomfortable, question: is simply adding Schema.org markup still enough to win in AI Search?

Our latest research suggests the answer is no, or at least not on its own.

Structured data remains valuable when retrieval is handled by Google or Bing. In those environments, Schema.org can influence how content is interpreted and retrieved. But in vanilla RAG setups, where retrieval often depends more directly on chunking, embeddings, and passage relevance, the impact of structured data is much weaker. We have made this point before on the blog, but now we need to make it unmistakably clear.

We ran an experiment across four different industries, and the results were eye-opening. While embedding standard Schema.org JSON-LD gave AI retrieval systems a slight nudge, the gains were only marginal. However, when we took it a step further and redesigned those pages as structured “entity hubs”—bringing the knowledge graph directly to the surface of the page—we saw AI answer accuracy jump by up to 29.8%.

The takeaway? The secret isn’t just having the right data behind the scenes. It’s about how that data is woven into the fabric of the page.

Why AI Systems Don’t See Structured Data Like Google Does

For years, we’ve optimized for a world where search engines like Google and Bing have “separate lanes” for structured data. These engines use dedicated pipelines to pluck JSON-LD directly from your code and store it separately from your page content. It’s a clean, efficient way to build rich snippets and understand what a page is about.

Modern AI systems, however, are built on a different logic. The retrieval engines powering enterprise assistants and autonomous agents typically ingest your web pages as a single, continuous stream of text. Instead of separating the code from the copy, they convert the entire page into mathematical vectors called embeddings.

In this “all-in-one” process, your JSON-LD is often forced to compete for space with the rest of your content. Because these systems have a limited context window, your structured data can easily be truncated, diluted, or plain ignored during retrieval.

Our research confirmed this reality. When we added JSON-LD to otherwise unstructured pages, we saw only minimal improvements in how accurately the AI could answer questions.

For SEOs and growth teams, the message is clear: Schema markup is an essential foundation, but it isn’t a silver bullet. If the AI can’t perceive the structure within the content itself, it won’t be able to reason over it effectively.

The Key Insight: Treat Your Knowledge Graph as a Memory Layer

The biggest breakthrough in our research wasn’t about the data we had: it was about how we presented it. To bridge the gap between “having data” and “AI understanding,” we had to stop treating our knowledge graph as a hidden layer of code and start treating it as a visible memory layer for the AI.

Instead of hiding structured information deep inside JSON-LD tags, we designed pages that made the knowledge graph visible, navigable, and connected. We call this the Enhanced Entity Page model.

How The Entity Page Model Works

An entity page is essentially the human-readable map of your knowledge graph.

In a graph, entities are defined by their relationships and attributes—what technical teams call RDF predicates. In our model, we take those relationships and turn them into a visible interface that both humans and machines can navigate.

These pages don’t just “contain” information; they architect it. We found the most success when these pages:

Surface properties in natural language: We translate “invisible” attributes into clear, descriptive text.
Expose connections: Internal links between related entities are made explicit, creating a navigable web of knowledge.
Provide clear signposts: By using “navigational affordances” and dereferenceable URIs, we help AI agents understand exactly where they are and how to move through the data.
Add contextual instructions: We provide subtle cues that help AI systems interpret and weight the information they find.

The Results: Accuracy Jumps Across The Board

When we put these Enhanced Entity Pages to the test in our retrieval experiments, the difference was staggering. The same knowledge performed very differently depending on how we presented it to the AI.

Standard retrieval systems improved answer accuracy by 29.6%
Multi step autonomous systems improved by almost 30%
The most optimized configuration achieved a 29.8% improvement

The conclusion is simple: presence is not enough. Just because your data exists in your code doesn’t mean an AI can reason over it. In the world of agentic search, structure and navigability are the keys that unlock your data’s true value.

This Isn’t An SEO Trick. It Is The Linked Data Architecture Of The Web

To be clear, Enhanced Entity Pages are not some new “hack” designed to game AI search engines. They are a return to the foundational principles of the Linked Data architecture standardized by the World Wide Web Consortium (W3C).

In this model, entities are published as dereferenceable URIs: each entity has a unique web address that resolves to a page representing that entity and exposing its relationships and attributes. This approach has been used for decades by some of the world’s largest knowledge datasets, including DBpedia and Wikidata.

But a dereferenceable URI is more than just a page. A key mechanism of the Linked Data architecture allows the same URI to deliver different representations depending on who is requesting the information.

This mechanism is called content negotiation, a feature defined in the HTTP standard RFC 7231.

One URI, Multiple Representations

In simple terms, content negotiation allows a single web address to speak different “languages.” When a human user visits the page through a browser, the browser requests HTML and the server returns a visual webpage. When a machine client—such as a semantic crawler or an AI agent—requests the same URI, it can ask for a machine-readable format such as RDF or JSON-LD.

The result is one URI, multiple representations, all describing the same entity.

However, true content negotiation for AI is not merely about changing the file format. While some modern tactics simply use this mechanism to serve a stripped-down Markdown version of a web page to an AI, this approach misses the broader potential. The goal is not just to provide a cleaner text format, but to provide semantic context. By serving graph data, you allow the AI to see the underlying entity and its web of connected entities, proving that effective AI communication is a matter of semantics, not just syntax.

This architecture is already operating at a global scale beyond the world of SEO. For example, the standards organization GS1 applies the same concept in its Digital Link standard, where a product identifier (such as a barcode) resolves to a web resource that provides both human-readable and machine-readable information.

The Gateway To Structured Knowledge

In practice, this means a web page is no longer just a document meant to be ranked. It becomes an entry point into structured knowledge about an entity.

For the emerging agentic AI ecosystem, this is crucial. By adopting the Linked Data model, a website can simultaneously provide a rich experience for human visitors and a structured data interface for the AI systems that increasingly mediate discovery.

Before and after: plain HTML (left) vs. enhanced entity page (right) for a sample entity.

What This Means for SEO and GEO Professionals

Our findings point to a massive shift in how we need to think about optimization. We call this new era SEO 3.0: The Reasoning Web.

To understand where we are going, it helps to look at where we have been.

SEO 1.0 was about keywords and documents. You optimized a page so a search engine could read the text and match it to a query.

SEO 2.0 brought us structured markup and rich results. You added JSON-LD so search engines could categorize your content and display it nicely in the SERPs.

SEO 3.0 is about whether an AI system can actually reason over your knowledge.

For modern SEO and Generative Engine Optimization (GEO) practitioners, this evolution translates into three distinct layers of AI visibility that align closely with the stages of agent readiness described here.

The SEO 3.0 Readiness Ladder: From Citations to Autonomous Action

Layer 1: Citations (The “Inventory” Layer)
Will your content be retrieved as a source?
This foundational layer establishes your “inventory presence.” Much like a basic data feed, it ensures an AI system simply knows your offerings exist. Most current GEO (Generative Engine Optimization) strategies are hyper-focused here, treating AI as a traditional search engine that merely needs to rank isolated passages or “chunks.” At this stage, you are visible, but your data remains passive.

Layer 2: Reasoning (The “Operability” Layer)
Can the AI extract and relate your facts accurately?
This is where the Enhanced Entity Page model becomes critical. This stage grants the agent “catalog operability,” allowing it to filter, evaluate, and cross-reference your structured data without the friction of a traditional website crawl. If your knowledge graph is buried in bloated code or becomes “diluted” during retrieval, the AI cannot reason with it. To pass this layer, your content must shift from unstructured text to a machine-readable map of relationships.

Layer 3: Actions (The “Agentic” Layer)
Can an autonomous agent execute a task using your data? We have reached the frontier of agentic search. At the peak of the readiness ladder, the goal is to enable an agent to follow links, traverse your entity graph, and gather the specific context required to complete a mission. By building robust action endpoints, you move the AI from being merely assistive to fully agentic. The agent no longer just recommends a transaction, it has the technical permission and clarity to close it.

While most of the industry is still battling over basic citations, the real competitive advantage lies in building an architecture that supports@ advanced reasoning and autonomous execution.

Why This Matters for Innovation and Marketing Leaders

This shift isn’t just a technical challenge for the SEO team: it is a strategic opportunity for innovation and marketing leaders.

As AI continues to change how customers find and interact with your brand, your knowledge graph is evolving into a core piece of business infrastructure.

In the AI-first world, structured linked data becomes the backbone that connects your products, services, and brand values to the systems that serve them. Your content is no longer just a static page to be read; it is becoming a programmable interface. By exposing entity identifiers and clear relationships, you are building a map that allows machines to explore and understand your brand with absolute precision.

Accuracy As A New Brand Factor

Accuracy in AI-generated answers is also becoming a critical factor for brand trust. Every time an AI system misrepresents your services or misunderstands your product attributes, it directly erodes customer confidence and derails their decision-making process.

Our experiments show that moving to the Enhanced Entity Page model does more than just improve “rankings.” It significantly boosts the completeness and reliability of the answers users receive across multiple industries.

In an age where an AI’s answer is often the first—and only—interaction a customer has with your brand, ensuring that answer is accurate is a non-negotiable marketing priority.

The Return of the Semantic Web

For years, the “Semantic Web” was a beautiful but largely theoretical vision: a world where structured, linked data allowed intelligent systems to understand everything. While the data existed, we lacked the “brain” to process it at scale.

Generative AI has changed that. We finally have the reasoning engine required to make the semantic web practical.

Today’s AI systems are actively rewarding the foundational principles of linked data: clear entity boundaries, explicit relationships, and navigable knowledge. The “Reasoning Web” is no longer an abstract idea; it is the new reality of how visibility works in an AI-driven world.

What Comes Next

The full research paper, “Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval,” (arXiv:2603.10700), submitted on 11 March 2026 is now available . It presents the full experimental design, statistical analysis, dataset, and reproducibility framework behind the study.

For teams working on GEO, AI search visibility, or enterprise content transformation, the implications are clear: this is the time to rethink your architecture.

In the age of agentic search, structured data alone is not enough. To be understood, reused, and retrieved by AI systems, knowledge must also be navigable.

Ready to build your reasoning web?

If you’re looking to transform your website into a structured entity hub that drives accuracy in AI search, we can help. Book a call with our team today to see how we can turn your knowledge graph into a powerful memory layer for the next generation of search.

Credits

We thank Google Cloud for the credits that made this research possible, the WordLift engineering team for maintaining the knowledge graph infrastructure used in the study, and our clients for continuously inspiring and grounding our work in real-world challenges.