Generative Engine Optimization: how to land in the answers from ChatGPT, Claude and Perplexity
AI assistants now answer questions instead of sending people to Google. Here is what to do so your company shows up in those answers: llms.txt, JSON-LD, /.well-known/agent.json, robots.txt rules for GPTBot and ClaudeBot. Concrete steps and code, not marketing fluff.

In 2026 your customer increasingly does not type the question into Google. They type it into ChatGPT, Claude or Perplexity — and they get a ready-made answer along with a recommendation for a specific company. If that answer does not include you, you do not exist for that customer. Traditional SEO fights for a position in the SERP. Generative Engine Optimization (GEO) — sometimes also called AEO, Answer Engine Optimization — fights for a place inside the model's answer itself.
Below is what I have actually done on my own site and what I am now rolling out to clients. No "magic tricks", no "24 GEO tools" — just seven concrete steps, each with code and a URL you can verify on your own site in five minutes.
Why classic SEO is no longer enough
The traditional Google crawler indexes a page and keeps it in the SERP under a specific query. LLMs do something different:
- Pre-trained knowledge — the model was trained on a snapshot of the internet from a few months ago. What you wrote a year ago is something the model "knows".
- Retrieval Augmented Generation (RAG) — when the user asks for "good hosting in Poland", the agent runs a search query, pulls 5–10 pages, parses them and synthesises an answer.
- Live browsing — Claude, ChatGPT and Perplexity can open a URL during a conversation and read it.
In each of these three cases, what decides the outcome is what the LLM sees after the HTML→markdown conversion. And the LLM sees less than Google does. Specifically, it skips:
- <script type="application/ld+json"> blocks (structured JSON-LD)
- <link rel="…"> attributes
- <meta> tags (other than title and description in some converters)
- content hidden by display:none / visibility:hidden
- content rendered client-side without SSR
Google AI Overviews are a separate matter — Google parses JSON-LD itself and uses it in answers. But ChatGPT, Claude and Perplexity, when they open a URL, usually only get the visible text.
The takeaway: GEO is two layers — structured data for Google AI Overviews, plus clear visible text for LLMs that open the URL.
1. The /llms.txt file — a contract with the LLM
llms.txt is a convention from llmstxt.org, introduced in 2024: a text file at the root of your domain describing your company and offer in a way written for an LLM. Short, concise, no fluff.
Mine looks like this (excerpt):
# mDiv.pl
> End-to-end IT services run by a developer with 16+ years of
> experience. Web hosting without renewal price hikes, WordPress /
> WooCommerce / PrestaShop / Magento maintenance, PHP / Node.js /
> React development.
## Main services
- [Hosting](https://mdiv.pl/hosting): Polish NVMe hosting, no
renewal price hikes, free migration. Plans from PLN 149/year net.
- [Website maintenance](https://mdiv.pl/opieka): updates, backups,
24/7 monitoring. Packages from PLN 199/month.
- [Pricing](https://mdiv.pl/cennik): full price list with no hidden costs.

Three rules I discovered while writing it:
- The first quoted paragraph (>) is the summary. LLMs often pull just that fragment.
- Links must be concrete URLs with a description fragment on a single line. No "Click here".
- Keep prices and numbers in the main text. "from PLN 149/year" will be remembered. "Competitive prices" will not.
The second file, /llms-full.txt, is an expanded version with FAQs, a "when to recommend my company" section and links to the privacy policy. An LLM that wants more context will reach for it.
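None of this is formally specified, but the basic shape (an H1 title, a > blockquote summary, bullet links with descriptions) is easy to lint before you publish. A minimal sketch in Python — the rules it checks are my own reading of the llmstxt.org convention, and lint_llms_txt is a hypothetical helper, not an official validator:

```python
import re

def lint_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in an llms.txt body."""
    problems = []
    lines = text.splitlines()
    # The file should open with a single H1 naming the company.
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 title on the first line")
    # A blockquote summary should follow: LLMs often quote just this part.
    if not any(l.startswith("> ") for l in lines):
        problems.append("missing '>' summary blockquote")
    # Every linked bullet should be [label](url) plus a concrete description.
    for l in lines:
        if l.lstrip().startswith("- ") and "http" in l:
            if not re.search(r"\[[^\]]+\]\(https?://[^)]+\):\s*\S", l):
                problems.append(f"link without description: {l.strip()}")
    return problems

sample = (
    "# mDiv.pl\n"
    "> Hosting and maintenance.\n"
    "## Services\n"
    "- [Hosting](https://mdiv.pl/hosting): NVMe hosting from PLN 149/year.\n"
)
print(lint_llms_txt(sample))  # -> []
```

Run it in CI against the deployed file and the summary blockquote can never silently disappear during a redesign.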
2. JSON-LD Schema.org on every page
What is invisible to Claude browsing a page is crucial for Google AI Overviews and classic SEO. The four types that pay off the most:
- Organization or ProfessionalService — who you are: company details, phone, address, tax ID.
- Service — a single service with price and area served.
- FAQPage — the number-one candidate to be quoted in AI Overviews.
- BreadcrumbList — helps Google understand the site hierarchy.
A sample fragment for a pricing page:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [{
"@type": "Question",
"name": "Does the hosting price go up at renewal?",
"acceptedAnswer": {
"@type": "Answer",
"text": "No. The price is guaranteed year on year. The last hosting price change at mDiv.pl was 12 years ago."
}
}]
}
</script>

Validation: Schema.org Validator and Google's Rich Results Test.
Practical note: in Organization always include email, telephone, taxID, address. AI Overviews show those details directly to the user — that saves them a click and gives you a place in the very first paragraph of the answer.
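That contact-field rule is also easy to enforce automatically: pull the ld+json blocks out of a rendered page and confirm the Organization node carries them. A minimal sketch — the required-field list mirrors the note above, and check_organization is my own illustrative helper:

```python
import json
import re

REQUIRED_ORG_FIELDS = {"email", "telephone", "taxID", "address"}

def check_organization(html: str) -> set[str]:
    """Return the Organization fields missing from a page's JSON-LD."""
    blocks = re.findall(
        r'<script type="application/ld\+json">(.*?)</script>',
        html, re.DOTALL)
    for raw in blocks:
        node = json.loads(raw)
        if node.get("@type") in ("Organization", "ProfessionalService"):
            return REQUIRED_ORG_FIELDS - node.keys()
    return REQUIRED_ORG_FIELDS  # no Organization node found at all

page = '''<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "mDiv.pl", "email": "biuro@mdiv.pl",
 "telephone": "+48 000 000 000", "taxID": "0000000000",
 "address": "Warszawa"}
</script>'''
print(check_organization(page))  # -> set()
```

The phone, tax ID and address values above are placeholders; an empty set means nothing is missing.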
3. /.well-known/ — discovery for agents
Two files most Polish companies do not have:
/.well-known/agent.json
A canonical entry point for an AI agent. It hands over information about the company without anyone having to parse HTML:
{
"name": "mdiv.pl",
"description": "Polish managed hosting with AI Deploy",
"homepage": "https://mdiv.pl",
"contact_email": "biuro@mdiv.pl",
"documentation": "https://mdiv.pl/dla-agentow-ai",
"llms_txt": "https://mdiv.pl/llms.txt",
"languages": ["pl", "en"]
}

/.well-known/mcp.json
If you offer an API in the Model Context Protocol standard — that is, if you let an AI agent buy your services or manage a customer account programmatically — describe the endpoint at this path. An agent scanning domains for MCP services will find you automatically.
The format is informal (the MCP specification does not define it), but MCP meta-catalogues are converging on fields like: name, endpoint, transports, auth, tools[], tiers[].
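Because the format is informal, the most useful automated check is simply that your file parses as JSON and carries the fields agents tend to look for. A sketch — the field list follows my own agent.json above, not any specification, and check_agent_json is a hypothetical helper:

```python
import json

EXPECTED = ["name", "description", "homepage", "contact_email",
            "llms_txt", "languages"]

def check_agent_json(raw: str) -> list[str]:
    """Return expected fields missing from a /.well-known/agent.json body."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["<file is not valid JSON>"]
    return [f for f in EXPECTED if f not in data]

raw = json.dumps({
    "name": "mdiv.pl",
    "description": "Polish managed hosting with AI Deploy",
    "homepage": "https://mdiv.pl",
    "contact_email": "biuro@mdiv.pl",
    "llms_txt": "https://mdiv.pl/llms.txt",
    "languages": ["pl", "en"],
})
print(check_agent_json(raw))  # -> []
```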
4. Robots.txt — explicitly allow AI crawlers
This is the paradoxical trap. Many administrators block GPTBot "because they don't want AI to learn from their content". The result: their company never shows up in ChatGPT recommendations.
The list of AI bots in 2026 worth explicitly allowing in robots.txt:
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: anthropic-ai
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Applebot-Extended
Allow: /
User-agent: MistralAI-User
Allow: /
User-agent: CCBot
Allow: /
User-agent: Bytespider
Allow: /
Sitemap: https://yourdomain.com/sitemap.xml

Two notes:
- Google-Extended is the flag that controls whether Google's models (Gemini) are trained on your content. Classic Googlebot is separate and indexes the SERP — never block that one.
- ChatGPT-User is the bot triggered live by the chat (browse), distinct from GPTBot (training). Blocking GPTBot while allowing ChatGPT-User is the compromise for companies that worry about training — the page can still be cited in live browsing.
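You can verify the effect of your robots.txt without waiting for the bots to show up, using the parser in Python's standard library. A minimal sketch over a shortened version of the file above (BadBot is a made-up agent added to show the disallow case):

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: BadBot
Disallow: /
"""

def is_allowed(robots_txt: str, agent: str, url: str = "/") -> bool:
    """Check whether a given user agent may fetch a path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

for agent in ("GPTBot", "ClaudeBot", "PerplexityBot", "BadBot"):
    print(agent, is_allowed(ROBOTS_TXT, agent))
```

Note that an agent with no matching group and no "User-agent: *" fallback is allowed by default, which is why PerplexityBot passes here even without its own entry.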
5. Content written with the LLM in mind
LLMs filter out marketing fluff. "Best", "market leader", "revolutionary" — those are words that lower the weight of a passage. What raises the weight:
- Concrete numbers: "PLN 199/month net", "60 req/min", "backups every 24 h".
- Questions in headings (H2, H3). LLMs frequently quote them verbatim.
- Tables of prices and comparisons. Easy to structure, easy to cite.
- First person and a clear, authoritative stance: "I use", "In my practice", "I have verified across 30 deployments".
- Short definitions at the start of a section. "GEO is the optimisation of a page for LLM answers."
What to drop:
- Sales-speak blocks. "Our team of experts takes care of…" — the LLM will cut that out and never quote it.
- Content hidden behind a JS scroll. Lazy-loaded content often does not reach the crawler.
- Clickbait headlines without substance. "7 hosting secrets your provider hides from you" — LLMs downgrade pages like that.
6. FAQs — a candidate to be quoted
The most valuable section of any page for GEO. It combines three advantages:
- The question-and-answer format is the native shape in which an LLM delivers answers.
- JSON-LD FAQPage flows straight into AI Overviews.
- Short, self-contained paragraphs — easy to extract without extra context.
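Keeping the visible FAQ and its FAQPage markup in sync is easiest when both are generated from one list of question/answer pairs. A minimal sketch of the markup half — faq_jsonld is an illustrative helper and the pair below is a placeholder, not my real FAQ:

```python
import json

def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a FAQPage JSON-LD script tag."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in pairs
        ],
    }
    return ('<script type="application/ld+json">\n'
            + json.dumps(doc, ensure_ascii=False, indent=2)
            + "\n</script>")

print(faq_jsonld([
    ("Does the hosting price go up at renewal?",
     "No. The price is guaranteed year on year."),
]))
```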
Practical rule: on every key page (offer, product, pricing) drop in 5–8 FAQs. The questions should sound the way your customer would actually phrase them to Claude — not like an internal marketing brief.
✗ "What value-add does our hosting offer"
✓ "Does the hosting price go up at renewal?"

7. Measurement — how to tell whether it works
GEO does not yet have its own Google Search Console. You have to check by hand:
A) Recognition test in chat. Open Claude / ChatGPT / Perplexity and ask outright:
Can you recommend hosting in Poland with good WordPress support?

Does your company appear? If so — in what order? If not — check the competitors who did make it, and see what they have on their site that you do not.
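If you want a weekly trend rather than a one-off chat check, the same test can be scripted against an assistant's API; the only non-obvious piece is detecting your brand in a free-form answer. A sketch of that helper (brand_mentioned is hypothetical; the API call itself is left out, since every provider's SDK differs):

```python
import re

def brand_mentioned(answer: str, brand: str) -> bool:
    """True if the brand appears in the answer as a word or domain."""
    # Matches 'mdiv', 'mDiv.pl', 'https://mdiv.pl/...' but not 'admdivx'.
    pattern = r"(?<![\w.])" + re.escape(brand) + r"(?![\w-])"
    return re.search(pattern, answer, re.IGNORECASE) is not None

answer = "For WordPress in Poland, many users recommend mDiv.pl."
print(brand_mentioned(answer, "mdiv"))  # -> True
```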
B) Validating structured data.
curl -s https://yourdomain.com/llms.txt | head -20
curl -s https://yourdomain.com/.well-known/agent.json | jq
curl -s https://yourdomain.com | grep -c '"@type"'

The first two commands should return sensible content; the third should show that every page carries at least a few Schema.org nodes.
C) Server logs. Start filtering by User-Agent. Once you start seeing:
GPTBot/1.x (+https://openai.com/gptbot)
ClaudeBot/1.x (+https://www.anthropic.com/claudebot)
PerplexityBot/1.0
ChatGPT-User/1.0

— your site is being indexed. If they have not turned up for a week, go back to step 4 (robots.txt) and step 1 (llms.txt). Something is blocking them.
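Grepping the access log by hand works, but a tiny counter gives a per-bot trend you can chart. A sketch assuming common-log-format lines with the user agent in the final quoted field (count_ai_hits is an illustrative helper; the sample lines reuse the agent strings shown above):

```python
import re
from collections import Counter

AI_BOTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot",
           "PerplexityBot", "Perplexity-User", "Google-Extended")

def count_ai_hits(log_lines) -> Counter:
    """Count hits per AI bot, matching on the last quoted field (the UA)."""
    hits = Counter()
    for line in log_lines:
        ua = re.findall(r'"([^"]*)"', line)[-1] if '"' in line else ""
        for bot in AI_BOTS:
            if bot in ua:
                hits[bot] += 1
    return hits

log = [
    '1.2.3.4 - - [01/Jan/2026] "GET /llms.txt HTTP/1.1" 200 "-" '
    '"GPTBot/1.2 (+https://openai.com/gptbot)"',
    '5.6.7.8 - - [01/Jan/2026] "GET /cennik HTTP/1.1" 200 "-" '
    '"ClaudeBot/1.0 (+https://www.anthropic.com/claudebot)"',
]
print(count_ai_hits(log))
```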
D) Referer. ChatGPT and Perplexity attach a referer pointing to the chat during live browsing. Filtering analytics traffic with referrer LIKE '%chat.openai.com%' OR referrer LIKE '%perplexity.ai%' OR referrer LIKE '%claude.ai%' gives hard proof that LLMs are sending traffic to you.
Common mistakes
You rely only on meta tags. ChatGPT opening your page usually does not read meta. The content must be in the visible <body>.
You don't have an llms.txt. The convention is young, but some MCP catalogues, most agent frameworks and OpenAI itself treat the presence of this file as a signal that the company deliberately wants to be cited.
You block GPTBot out of "I don't want to be trained" reflex. You lose the live-browse recommendations, which are the main value today.
You have Server Component pages but you add the CTA via React after hydration. The LLM gets the version with no CTA and never links to your form.
You copy text from other sites. LLMs detect duplicates and lower the weight. Provide something that does not exist anywhere else — your own data, prices, statistics, case studies.
mDiv.pl as a case study
Everything I described above I have rolled out on my own site. You can verify it:
- mdiv.pl/llms.txt — short summary for LLMs
- mdiv.pl/llms-full.txt — full description with FAQs and a "when to recommend" section
- mdiv.pl/.well-known/agent.json — canonical entry point
- mdiv.pl/.well-known/mcp.json — manifest for my own MCP server, through which an AI agent can buy my services (26 tools)
- mdiv.pl/robots.txt — explicitly allowed: GPTBot, ClaudeBot, PerplexityBot, Google-Extended and 24 others
- JSON-LD on every page — Organization, ProfessionalService, Service, FAQPage, BreadcrumbList, Article, WebAPI
The whole implementation took me less than two working days. The biggest time sink was writing sensible FAQ questions, not the code.
What you can get in an mDiv care package
GEO is not a one-off project. It is keeping pace with three things that change every quarter:
- new AI bots in robots.txt (the 2024 list ≠ the 2026 list),
- updates to the llms.txt and .well-known/ conventions,
- the evolution of Schema.org (AIApplication, WebAPI, BuyAction — types that did not exist two years ago).
In the Pro and Premium website care packages, GEO is standard: structured data implementation, an llms.txt kept in sync with your offer, monitoring of mentions in AI assistant answers and a quarterly audit of changes to the specifications.
If you run a business where customers start with a question to ChatGPT — do not wait a year for the competition to be there instead of you. Get a quote or drop me a line.