ChatGPT is the slowest of the eight engines to surface you — and the most strategically important. It indexes through a different door than Perplexity. Here's how to walk in.
Most AEO advice treats ChatGPT as a single engine. It's actually two retrieval modes layered together, and they reward different work.
Door 1: training data. The base model knows what it knows because someone trained it on a snapshot of the public internet. If your brand isn't in the snapshot, ChatGPT can't name you in offline-mode answers — no amount of new content fixes a base model that's already trained. Inclusion happens at training time, and training cycles run roughly every 12-18 months.
Door 2: live retrieval. When ChatGPT browses (the default for paid users in 2026), it makes a real search call via OAI-SearchBot and uses fresh web results. This is the door you can work on today.
Most AEO Owl users see ChatGPT as their lowest-scoring engine for two reasons: they're stuck behind door 1, because the base model has no record of them, and they haven't optimized for door 2. The good news: door 2 is work you can start today, and the visibility it earns feeds into door 1 as later training snapshots pick it up.
ChatGPT's training pipeline weights certain sources extraordinarily heavily for entity grounding: Wikipedia, Crunchbase, IMDb, official corporate sites, and major news archives. If your company has a Wikipedia article, a Crunchbase profile with funding history, and a press archive of named coverage, ChatGPT knows who you are. If it has none of those, you don't exist to the base model.
When ChatGPT generates an answer offline, it draws on what it learned during training, and sources with more authority leave a stronger imprint. A blog post on your own domain might be in the training data, but it carries less weight than a Wikipedia paragraph or a Crunchbase fact-line. The structured, third-party sources are where ChatGPT's "facts" tend to come from at generation time.
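OpenAI publishes three separate user-agents, and most sites block one or more of them by accident. OAI-SearchBot and ChatGPT-User must be allowed for ChatGPT to find you through door 2 (live retrieval); GPTBot must be allowed for your pages to reach door 1 (training data).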
GPTBot: the crawler that fetches pages for training the next model version.
OAI-SearchBot: the crawler that ChatGPT calls in real time when browsing.
ChatGPT-User: the user-agent used when a user explicitly asks ChatGPT to fetch a URL.
Your robots.txt should explicitly allow all three:
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
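The rules above can be sanity-checked offline before you deploy them. The following is a minimal sketch using Python's standard-library robots.txt parser; the function name `blocked_agents` and the sample file are illustrative assumptions, and the standard-library parser may not interpret every edge case exactly as OpenAI's crawlers do.

```python
from urllib.robotparser import RobotFileParser

# The three OpenAI user-agent tokens named in this section.
OPENAI_AGENTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def blocked_agents(robots_txt, url="https://yoursite.com/", agents=OPENAI_AGENTS):
    """Return the agents that the given robots.txt text forbids from fetching url.

    Hypothetical helper for illustration: it parses the text locally and
    makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # Sample file that blocks the training crawler but allows everything else.
    sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
    print(blocked_agents(sample))
```

An empty list means none of the three agents is blocked by the file as written. It says nothing about firewall or CDN rules, which are applied separately from robots.txt.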
Verify with curl:
curl -A "GPTBot/1.0" https://yoursite.com/
curl -A "OAI-SearchBot/1.0" https://yoursite.com/
Both must return HTTP 200 with the actual content. If you're behind Cloudflare, check Security → Bots → AI Crawlers and ensure the "Block AI Bots" super-toggle is OFF.
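The curl check can also be scripted so it is easy to repeat after each config change. This is a rough sketch, assuming the same simplified user-agent strings as the curl commands above; the helper names `http_status` and `failing_agents` are hypothetical. Real crawlers send longer user-agent strings, and some firewalls also check source IPs, so a spoofed request only approximates what the crawler sees. A 200 status alone does not prove the body is your real content rather than a challenge page.

```python
import urllib.error
import urllib.request

# Simplified user-agent strings, copied from the curl commands above (assumption:
# your server or CDN matches on these substrings).
AGENTS = ("GPTBot/1.0", "OAI-SearchBot/1.0")


def http_status(url, user_agent, timeout=10):
    """Fetch url with the given User-Agent header and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; the code is what we want.
        # Connection-level failures (URLError) are deliberately left to propagate.
        return error.code


def failing_agents(url, agents=AGENTS, fetch=http_status):
    """Return {agent: status} for every agent that did not receive HTTP 200.

    fetch is injectable so the logic can be exercised without network access.
    """
    statuses = {agent: fetch(url, agent) for agent in agents}
    return {agent: code for agent, code in statuses.items() if code != 200}
```

Calling `failing_agents("https://yoursite.com/")` returns an empty dict when both agents get a 200. Any entry in the result points at the agent to investigate in your server, CDN, or bot-protection rules.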
ChatGPT's training corpus over-indexes on a handful of platforms because they have structured, high-signal user contributions. Reddit, Stack Overflow, Hacker News, Quora, and LinkedIn are the big five for B2B. The pattern that wins is being discussed on these platforms — not posting from your brand account, but earning organic mentions from third parties.
Tactical plays:
ChatGPT is the longest-horizon engine in the audit. You won't see your citation rate move in 30 days; expect 60-90 days.
ChatGPT rewards presence over time, not bursts. Pick the cadence you can hold for a year and run it.