November 4, 2025 at 12:00 PM
Foundation: SEO & Crawler Infrastructure
Content needs to be visible to all crawlers. Most LLM crawlers do not execute JavaScript, so server-side rendering is essential. This memo covers the foundational technical setup to ensure the site is properly structured for ChatGPT, Claude, Perplexity, DeepSeek, Google-Extended, and other LLM crawlers.
Implementation
robots.txt
Explicitly allows all crawlers with special attention to LLM-specific bots including ChatGPT-User, GPTBot, Claude-Web, CCBot, anthropic-ai, PerplexityBot, and Google-Extended.
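As a sketch of how these rules could be expressed, assuming a Next.js App Router project (the framework and the domain below are assumptions, not confirmed by this memo), an app/robots.ts file is served by Next.js as /robots.txt:

// app/robots.ts -- assumes a Next.js App Router project
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // Default rule: allow every crawler.
      { userAgent: '*', allow: '/' },
      // Explicit allow for LLM-specific bots so intent is unambiguous.
      {
        userAgent: [
          'GPTBot',
          'ChatGPT-User',
          'Claude-Web',
          'anthropic-ai',
          'CCBot',
          'PerplexityBot',
          'Google-Extended',
        ],
        allow: '/',
      },
    ],
    sitemap: 'https://example.com/sitemap.xml', // placeholder domain
  }
}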
Dynamic Sitemap
XML sitemap generation with last-modified dates, change frequency, and priority levels. Helps crawlers understand content importance and freshness.
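A minimal sketch under the same Next.js assumption; the URLs, dates, and priorities are placeholders, and in practice the entries would be built from the content source:

// app/sitemap.ts -- served by Next.js as /sitemap.xml
import type { MetadataRoute } from 'next'

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: 'https://example.com', // placeholder domain
      lastModified: new Date(),
      changeFrequency: 'weekly',
      priority: 1.0, // homepage gets top priority
    },
    {
      url: 'https://example.com/blog/llm-seo-basics', // hypothetical page
      lastModified: new Date('2025-11-04'),
      changeFrequency: 'monthly',
      priority: 0.8,
    },
  ]
}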
Canonical URLs
Prevents duplicate content issues and ensures LLMs recognize the authoritative source for each page.
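Under the same assumed setup, the canonical URL can be declared per page through the Next.js Metadata API (the page URL below is hypothetical):

// Per-page metadata excerpt -- assumes the Next.js Metadata API
import type { Metadata } from 'next'

export const metadata: Metadata = {
  alternates: {
    // Marks this URL as the authoritative version so duplicates
    // (query strings, trailing slashes) consolidate here.
    canonical: 'https://example.com/blog/llm-seo-basics',
  },
}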
Enhanced Metadata
Comprehensive metadata including Open Graph tags, Twitter Card metadata, keywords targeting AI/LLM marketing, author and publisher attribution, and robots directives for optimal crawling.
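A condensed sketch of such a metadata object, again assuming the Next.js Metadata API; every name and value below is a placeholder:

// app/layout.tsx excerpt -- site-wide metadata
import type { Metadata } from 'next'

export const metadata: Metadata = {
  title: 'Example Site',
  description: 'Marketing for the LLM era.',
  keywords: ['AI marketing', 'LLM SEO', 'generative engine optimization'],
  authors: [{ name: 'Example Author' }],
  publisher: 'Example Inc.',
  openGraph: {
    title: 'Example Site',
    description: 'Marketing for the LLM era.',
    url: 'https://example.com',
    siteName: 'Example Site',
    type: 'website',
  },
  twitter: {
    card: 'summary_large_image',
    title: 'Example Site',
    description: 'Marketing for the LLM era.',
  },
  robots: {
    index: true,
    follow: true,
    googleBot: { index: true, follow: true, 'max-snippet': -1 },
  },
}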
JSON-LD Structured Data
Schema.org WebSite and Article markup provides machine-readable context about the site's purpose and content.
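One common pattern, sketched here with placeholder values, is a small server-rendered component that serializes the Schema.org object into a script tag so it ships in the initial HTML:

// components/ArticleJsonLd.tsx -- hypothetical component name
export function ArticleJsonLd() {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: 'Foundation: SEO & Crawler Infrastructure',
    datePublished: '2025-11-04',
    author: { '@type': 'Person', name: 'Example Author' },
    publisher: { '@type': 'Organization', name: 'Example Inc.' },
  }

  return (
    <script
      type="application/ld+json"
      // Rendered on the server, so the markup is in the HTML crawlers fetch.
      dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
    />
  )
}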
Why This Matters
LLMs must be able to crawl, parse, and understand your content. Without proper technical infrastructure, even the best content remains invisible. Server-side rendering ensures all text is present in the initial HTML rather than hidden behind client-side JavaScript. Structured data helps LLMs understand what your content is about and how to categorize it.