
November 4, 2025 at 12:42 PM

Structure: Semantic HTML & Schema Data

Semantic HTML and JSON-LD structured data give large language model (LLM) pipelines an explicit hierarchy to parse, rather than leaving them to infer structure from visual layout. This memo covers how to structure content with semantic tags and Schema.org vocabulary so that LLMs can more reliably understand and cite it.

Key Concepts

Semantic HTML Tags

Semantic HTML gives your content structure explicit meaning. Instead of generic <div> tags, semantic tags tell both browsers and AI systems what role each piece of content plays. A consistent heading hierarchy (H1 → H2 → H3, without skipped levels) helps LLMs understand how topics and subtopics relate.

  • <header> — Page or section headers
  • <main> — Primary content
  • <article> — Self-contained, distributable content
  • <section> — Thematic groupings
  • <nav> — Navigation links
  • <footer> — Footer information
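To make the role of these tags concrete, here is a minimal Python sketch of how a parsing pipeline might recover a heading outline from a semantically marked-up page. It uses only the standard library's html.parser. The sample page, the OutlineParser class, and the extract_outline function are hypothetical names made up for illustration; real LLM ingestion pipelines are not documented here and may work quite differently.

```python
from html.parser import HTMLParser

# Hypothetical sample page, written for this illustration only.
SAMPLE_PAGE = """
<header><nav><a href="/memos">Memos</a></nav></header>
<main>
  <article>
    <h1>Structure: Semantic HTML &amp; Schema Data</h1>
    <section>
      <h2>Key Concepts</h2>
      <h3>Semantic HTML Tags</h3>
      <h3>JSON-LD Structured Data</h3>
    </section>
    <section>
      <h2>Why This Matters</h2>
    </section>
  </article>
</main>
<footer>Contact details</footer>
"""

LANDMARKS = {"header", "main", "article", "section", "nav", "footer"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Collects (level, text, enclosing semantic tag) for every heading."""

    def __init__(self):
        super().__init__()
        self.landmarks = []  # stack of currently open semantic tags
        self.current = None  # heading tag being read, if any
        self.buffer = []     # text fragments of the current heading
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag in LANDMARKS:
            self.landmarks.append(tag)
        elif tag in HEADINGS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            text = "".join(self.buffer).strip()
            parent = self.landmarks[-1] if self.landmarks else None
            self.outline.append((int(tag[1]), text, parent))
            self.current = None
        elif self.landmarks and self.landmarks[-1] == tag:
            self.landmarks.pop()


def extract_outline(html):
    """Return the heading outline of an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


if __name__ == "__main__":
    # Print each heading indented by its level, with its enclosing tag.
    for level, text, parent in extract_outline(SAMPLE_PAGE):
        print("  " * (level - 1) + f"{text} (in <{parent}>)")
```

With <div>-only markup the same parser would find the headings but could not say whether they belong to the article, the navigation, or the footer, which is the distinction the semantic tags add.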

JSON-LD Structured Data

JSON-LD (JavaScript Object Notation for Linked Data) is the format Google recommends for adding structured data to web pages. It is embedded in a <script type="application/ld+json"> element and uses Schema.org vocabulary to state explicitly what your content describes.
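As a rough sketch of how JSON-LD ends up on a page, the Python snippet below serializes a dictionary and wraps it in a <script type="application/ld+json"> element. The helper name to_jsonld_script is hypothetical, and the Article values are taken from this memo's own title and date purely as sample data; a real site would normally generate this in its templating layer.

```python
import json


def to_jsonld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a valid JSON escape, so parsers still read the original text.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Sample data for illustration; property names come from Schema.org's Article type.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Structure: Semantic HTML & Schema Data",
    "datePublished": "2025-11-04",
}

if __name__ == "__main__":
    print(to_jsonld_script(article))
```

Building the markup from a dictionary rather than hand-editing JSON keeps quoting and escaping consistent, which matters because a single malformed string makes the whole block unreadable to parsers.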

Schema.org Types

Choose the Schema.org type that matches each kind of content:

  • Product — Products or services
  • Organization — Company information
  • FAQPage — Frequently asked questions
  • Article — Articles and memos
  • HowTo — Step-by-step tutorials
  • Review — Testimonials and reviews

Example: FAQ Schema

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is Model Context Marketing?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Model Context Marketing is the practice of publishing structured, factual content that large language models can ingest to improve brand and product visibility inside AI-generated answers."
    }
  }]
}
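To show what a consumer can do with this markup, here is a minimal Python sketch that pulls JSON-LD blocks out of a page and reads the question and answer pairs from any FAQPage it finds. JsonLdExtractor, faq_pairs, and the sample PAGE are hypothetical names for illustration. The sketch assumes each block is a single JSON object and that mainEntity is a list, as in the example above; production parsers need to handle more variants.

```python
import json
from html.parser import HTMLParser

# Hypothetical page embedding the FAQ example from this memo.
PAGE = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is Model Context Marketing?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Model Context Marketing is the practice of publishing structured, factual content that large language models can ingest to improve brand and product visibility inside AI-generated answers."
    }
  }]
}
</script>
</head><body></body></html>
"""


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD script element."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.chunks = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.blocks.append(json.loads("".join(self.chunks)))
            self.in_jsonld = False


def faq_pairs(html):
    """Return (question, answer) tuples from FAQPage blocks in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    pairs = []
    for block in parser.blocks:
        # Assumption: only top-level objects typed FAQPage are considered.
        if not isinstance(block, dict) or block.get("@type") != "FAQPage":
            continue
        for question in block.get("mainEntity", []):
            pairs.append((question["name"], question["acceptedAnswer"]["text"]))
    return pairs


if __name__ == "__main__":
    for question, answer in faq_pairs(PAGE):
        print(f"Q: {question}\nA: {answer}")
```

Note that json.loads rejects string values containing raw line breaks, so the answer text has to sit on one line or use \n escapes for the block to be readable at all.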

Why This Matters

Semantic HTML and structured data provide two layers of meaning for LLMs: structural meaning (how content is organized) and semantic meaning (what entities exist and their relationships).

Together, these layers make your content easier for AI systems to interpret and cite. When an LLM pipeline encounters properly structured content, it can extract facts and relationships with less guesswork, which makes accurate attribution to your page more likely.