LLM Structure

What Can AI Access?

Before a language model can understand or recommend a site, it has to read it. AI does not read the way a browser does. Most AI crawlers do not render your layout, execute your JavaScript, or interpret your design. They read raw markup: headings, lists, schema, semantic HTML.

Most websites are built for humans looking at screens. Language models are not looking at screens. They are parsing text. If your site communicates meaning through visual design instead of semantic structure, a model cannot extract it. The information is there for a person, but it is not there for the machine.
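To make the difference concrete, here is a minimal sketch using only Python's standard library. It assumes a simple text parser that keeps content found inside heading and list-item tags and ignores everything else. The two HTML fragments and the `extract_outline` helper are hypothetical, written for illustration; real AI crawlers vary and are likely more sophisticated.

```python
from html.parser import HTMLParser

# Tags this sketch treats as carrying structural meaning (an assumption).
SEMANTIC_TAGS = {"h1", "h2", "h3", "li"}


class OutlineParser(HTMLParser):
    """Collects (tag, text) pairs for tags listed in SEMANTIC_TAGS."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None  # semantic tag currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Text outside a semantic tag is dropped on purpose.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.outline.append((tag, "".join(self._buffer).strip()))
            self._current = None


def extract_outline(html):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical fragments: same visible content, different markup.
visual_only = '<div class="big-bold">Pricing</div><div class="row">Starter plan</div>'
semantic = "<h1>Pricing</h1><ul><li>Starter plan</li></ul>"

print(extract_outline(visual_only))  # []
print(extract_outline(semantic))     # [('h1', 'Pricing'), ('li', 'Starter plan')]
```

Both fragments look like a heading and a list item on screen. Only the second one says so in the markup.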

Structure is not an optimization. It is a prerequisite. If a site is not machine-readable, it does not exist to AI.


What Structure Looks Like

Four patterns determine whether a language model can parse a site accurately:

Semantic Markup
HTML that communicates meaning, not just layout. Headings, lists, and sections that map to the content's logical structure.
Entity Definitions
JSON-LD and schema that tell a model exactly what an organization is, does, and where it operates — without relying on inference.
Content Segmentation
Clear topic boundaries so a model can associate claims with the right context instead of flattening everything into noise.
Token Efficiency
Pages that deliver a complete answer without forcing the model to process unnecessary content. Fewer tokens, same signal.
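As one illustration of an entity definition, the sketch below builds a schema.org `Organization` object and wraps it in a JSON-LD script tag. The property names (`name`, `url`, `description`, `areaServed`) come from the schema.org vocabulary. The `organization_jsonld` helper and the organization details are hypothetical, and which properties a given model actually uses is not something this example can confirm.

```python
import json


def organization_jsonld(name, url, description, area_served):
    """Return a JSON-LD <script> tag describing an organization."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "areaServed": area_served,
    }
    payload = json.dumps(data, indent=2)
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organization details, for illustration only.
snippet = organization_jsonld(
    name="Example Co",
    url="https://example.com",
    description="Commercial roof repair and inspection.",
    area_served="Portland, Oregon",
)
print(snippet)
```

The resulting tag goes in the page's HTML, where a parser can read what the organization is and where it operates without inferring it from body copy.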

Active Research

We are running controlled experiments on how structural patterns affect the way language models parse and represent websites. Current areas of investigation:

Schema as Signal
Whether JSON-LD schema functions as a direct communication channel to AI systems — and what models actually extract from it.
Token Cost
Whether leaner pages with higher signal density have a measurable structural advantage in AI retrieval and citation.
Heading Hierarchies
How heading structure and content segmentation affect a model's ability to attribute claims to the correct context.
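For readers who want to inspect their own pages along these lines, here is a rough sketch of heading-based segmentation with a per-section word count. It is not the method used in these experiments. The `split_by_heading` helper and the sample page are hypothetical, and word count is only a crude stand-in for token count, since real tokenizers split text differently.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionSplitter(HTMLParser):
    """Groups body text under the most recent heading."""

    def __init__(self):
        super().__init__()
        self.sections = []  # each item: {"level", "heading", "text"}
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.sections.append({"level": int(tag[1]), "heading": "", "text": ""})

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading has no section to attach to
        key = "heading" if self._in_heading else "text"
        self.sections[-1][key] += data


def split_by_heading(html):
    splitter = SectionSplitter()
    splitter.feed(html)
    splitter.close()
    # Collapse runs of whitespace left over from the markup.
    return [
        {
            "level": s["level"],
            "heading": " ".join(s["heading"].split()),
            "text": " ".join(s["text"].split()),
        }
        for s in splitter.sections
    ]


# Hypothetical page, for illustration only.
page = """
<h1>Acme Plumbing</h1>
<p>Licensed plumbers serving Springfield.</p>
<h2>Emergency Repairs</h2>
<p>Available 24 hours a day.</p>
"""

for section in split_by_heading(page):
    words = len(section["text"].split())  # crude proxy for token cost
    print(section["level"], section["heading"], "->", words, "words")
```

Each claim ends up attached to the heading above it. A page with no headings gives this splitter nothing to attach claims to.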

Findings will be published here as the data matures. Current publications are on the Research page.