November 4, 2025 at 12:00 PM
Foundation: SEO & Crawler Infrastructure
Content needs to be visible to all crawlers. Most LLM crawlers do not execute JavaScript, so server-side rendering is essential. This memo covers the foundational technical setup to ensure the site is properly structured for ChatGPT, Claude, Perplexity, DeepSeek, Google-Extended, and other LLM crawlers.
Implementation
robots.txt
Explicitly allows all crawlers with special attention to LLM-specific bots including ChatGPT-User, GPTBot, Claude-Web, CCBot, anthropic-ai, PerplexityBot, and Google-Extended.
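As a sketch of how these rules could be expressed, assuming a Next.js App Router project (the framework and the domain below are assumptions, not confirmed by this memo), an app/robots.ts file is served by Next.js as /robots.txt:

// app/robots.ts -- assumes a Next.js App Router project
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // Default rule: allow every crawler.
      { userAgent: '*', allow: '/' },
      // Explicit allow for LLM-specific bots so intent is unambiguous.
      {
        userAgent: [
          'GPTBot',
          'ChatGPT-User',
          'Claude-Web',
          'anthropic-ai',
          'CCBot',
          'PerplexityBot',
          'Google-Extended',
        ],
        allow: '/',
      },
    ],
    sitemap: 'https://example.com/sitemap.xml', // placeholder domain
  }
}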
Dynamic Sitemap
XML sitemap generation with last-modified dates, change frequency, and priority levels. Helps crawlers understand content importance and freshness.
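A minimal sketch under the same Next.js assumption; the URLs, dates, and priorities are placeholders, and in practice the entries would be built from the content source:

// app/sitemap.ts -- served by Next.js as /sitemap.xml
import type { MetadataRoute } from 'next'

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: 'https://example.com', // placeholder domain
      lastModified: new Date(),
      changeFrequency: 'weekly',
      priority: 1.0, // homepage gets top priority
    },
    {
      url: 'https://example.com/blog/llm-seo-basics', // hypothetical page
      lastModified: new Date('2025-11-04'),
      changeFrequency: 'monthly',
      priority: 0.8,
    },
  ]
}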
Canonical URLs
Prevents duplicate content issues and ensures LLMs recognize the authoritative source for each page.
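Under the same assumed setup, the canonical URL can be declared per page through the Next.js Metadata API (the page URL below is hypothetical):

// Per-page metadata excerpt -- assumes the Next.js Metadata API
import type { Metadata } from 'next'

export const metadata: Metadata = {
  alternates: {
    // Marks this URL as the authoritative version so duplicates
    // (query strings, trailing slashes) consolidate here.
    canonical: 'https://example.com/blog/llm-seo-basics',
  },
}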
Enhanced Metadata
Comprehensive metadata including Open Graph tags, Twitter Card metadata, keywords targeting AI/LLM marketing, author and publisher attribution, and robots directives for optimal crawling.
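A condensed sketch of such a metadata object, again assuming the Next.js Metadata API; every name and value below is a placeholder:

// app/layout.tsx excerpt -- site-wide metadata
import type { Metadata } from 'next'

export const metadata: Metadata = {
  title: 'Example Site',
  description: 'Marketing for the LLM era.',
  keywords: ['AI marketing', 'LLM SEO', 'generative engine optimization'],
  authors: [{ name: 'Example Author' }],
  publisher: 'Example Inc.',
  openGraph: {
    title: 'Example Site',
    description: 'Marketing for the LLM era.',
    url: 'https://example.com',
    siteName: 'Example Site',
    type: 'website',
  },
  twitter: {
    card: 'summary_large_image',
    title: 'Example Site',
    description: 'Marketing for the LLM era.',
  },
  robots: {
    index: true,
    follow: true,
    googleBot: { index: true, follow: true, 'max-snippet': -1 },
  },
}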
JSON-LD Structured Data
Schema.org WebSite and Article markup provides machine-readable context about the site's purpose and content.
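One common pattern, sketched here with placeholder values, is a small server-rendered component that serializes the Schema.org object into a script tag so it ships in the initial HTML:

// components/ArticleJsonLd.tsx -- hypothetical component name
export function ArticleJsonLd() {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: 'Foundation: SEO & Crawler Infrastructure',
    datePublished: '2025-11-04',
    author: { '@type': 'Person', name: 'Example Author' },
    publisher: { '@type': 'Organization', name: 'Example Inc.' },
  }

  return (
    <script
      type="application/ld+json"
      // Rendered on the server, so the markup is in the HTML crawlers fetch.
      dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
    />
  )
}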
Why This Matters
LLMs must be able to crawl, parse, and understand your content. Without proper technical infrastructure, even the best content remains invisible. Server-side rendering ensures all text is present in the initial HTML rather than hidden behind client-side JavaScript. Structured data helps LLMs understand what your content is about and how to categorize it.