November 4, 2025 at 12:42 PM
Structure: Semantic HTML & Schema Data
Semantic HTML and JSON-LD structured data let large language model (LLM) pipelines parse a page's meaning hierarchically. This memo covers how to structure content with semantic tags and Schema.org vocabulary so that LLMs can more reliably understand and cite it.
Key Concepts
Semantic HTML Tags
Semantic HTML provides meaning to your content structure. Instead of generic <div> tags, semantic tags tell both browsers and AI systems what role each piece of content plays. Proper heading hierarchy (H1 → H2 → H3) helps LLMs understand relationships between topics and subtopics.
- <header> — Page or section headers
- <main> — Primary content
- <article> — Self-contained, distributable content
- <section> — Thematic groupings
- <nav> — Navigation links
- <footer> — Footer information
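As a rough illustration of the tags and heading hierarchy described above, the skeleton below places each element on a single page. The page title, link targets, and paragraph text are placeholders invented for this sketch, not taken from a real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example memo (placeholder title)</title>
</head>
<body>
  <!-- Page-level header holding the site navigation -->
  <header>
    <nav>
      <a href="/">Home</a>
      <a href="/memos">Memos</a>
    </nav>
  </header>

  <!-- Primary content of the page -->
  <main>
    <!-- Self-contained piece that could be distributed on its own -->
    <article>
      <h1>Structure: Semantic HTML &amp; Schema Data</h1>

      <!-- Thematic grouping; the H2 sits directly under the H1 -->
      <section>
        <h2>Key Concepts</h2>

        <!-- Subtopic; the H3 sits directly under the H2 -->
        <section>
          <h3>Semantic HTML Tags</h3>
          <p>Placeholder paragraph.</p>
        </section>
      </section>
    </article>
  </main>

  <footer>
    <p>Placeholder footer text.</p>
  </footer>
</body>
</html>
```

The heading levels step down one at a time, so an outline built from the headings alone matches the nesting of the sections.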
JSON-LD Structured Data
JSON-LD (JavaScript Object Notation for Linked Data) is the recommended format for adding structured data to web pages. It uses Schema.org vocabulary to provide explicit context about your content.
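JSON-LD is typically embedded in a page inside a script element whose type is application/ld+json. The sketch below shows that placement for an Organization; the company name and URLs are made-up placeholders.

```html
<head>
  <title>Example Co (placeholder)</title>
  <!-- Structured data: not rendered, but readable by crawlers and parsers -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png"
  }
  </script>
</head>
```

Because the data lives in its own script element, it can be added or updated without touching the visible markup.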
Schema.org Types
Choose the Schema.org type that matches the kind of content on the page:
- Product — Products or services
- Organization — Company information
- FAQPage — Frequently asked questions
- Article — Articles and memos
- HowTo — Step-by-step tutorials
- Review — Testimonials and reviews
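Each type carries its own properties. As one sketch, a HowTo can describe its steps as a list of HowToStep items; the tutorial name and step text below are invented placeholders.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to add FAQ markup to a page",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Write the questions",
      "text": "List each question and its answer as visible page content."
    },
    {
      "@type": "HowToStep",
      "name": "Add the JSON-LD",
      "text": "Describe the same questions in a FAQPage block."
    }
  ]
}
</script>
```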
Example: FAQ Schema
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is Model Context Marketing?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Model Context Marketing is the practice of publishing structured, factual content that large language models can ingest to improve brand and product visibility inside AI-generated answers."
    }
  }]
}
Why This Matters
Semantic HTML and structured data provide two layers of meaning for LLMs: structural meaning (how content is organized) and semantic meaning (what entities exist and their relationships).
Together, these layers make your content easier for AI systems to interpret and cite. When an LLM pipeline encounters properly structured content, it can extract facts and the relationships between them with less ambiguity, which makes your content a stronger candidate for citation as a source.