What to actually do to improve your AEO/GEO standing — a practical playbook

Published June 16, 2026 · Updated June 19, 2026 · Quratic Team · 13 min read

Build a 20–30 prompt library, log weekly across ChatGPT, Perplexity, and Gemini, and know what to fix. Free spreadsheet method plus benchmarks and industry templates.

Most teams know they should “do GEO.” Few have a repeatable system for measuring it. Answer Engine Optimization (AEO) and Generative Engine Optimization (GEO) are not mysterious — they are a discipline of running the right prompts, logging the right signals, and acting on gaps. You can start for free with a spreadsheet and three AI tabs open.

This playbook is the operational version of our GEO strategy guide: what prompts to write, what to log, what benchmarks to aim for, and what to fix when you are invisible.

The free starting point: 20–30 prompts, three platforms, one spreadsheet

Before you buy a tool, prove the workflow manually:

Build a prompt library of 20–30 queries across three intent categories (below)
Run each prompt weekly in ChatGPT, Perplexity, and Gemini
Log results in a spreadsheet — one row per prompt × platform × week
Review monthly for patterns: which queries you win, which competitors dominate, where citations are missing

Expect 2–3 hours per weekly cycle for 25 prompts × 3 platforms if you work efficiently. That is enough to establish a baseline and justify (or avoid) paid tooling.
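As a rough check on that time budget, the weekly workload can be enumerated as one run per prompt and platform pair. The sketch below is illustrative only: the placeholder prompt names and the `weekly_runs` helper are assumptions, not part of any tool.

```python
from itertools import product

PLATFORMS = ["ChatGPT", "Perplexity", "Gemini"]

def weekly_runs(prompts, platforms=PLATFORMS):
    """Return one (prompt, platform) pair per manual run in a weekly cycle."""
    return list(product(prompts, platforms))

# 25 placeholder prompts stand in for a real prompt library.
prompts = [f"prompt {i}" for i in range(1, 26)]
runs = weekly_runs(prompts)

# Minutes available per run if the whole cycle takes 2 or 3 hours.
fast = 2 * 60 / len(runs)
slow = 3 * 60 / len(runs)
print(len(runs), round(fast, 1), round(slow, 1))  # 75 1.6 2.4
```

At 75 runs, the 2–3 hour estimate leaves roughly 1.6 to 2.4 minutes per run, including logging.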

Assign each prompt to a target country and run it from that context — a Singapore buyer’s answer differs from a US default. Browser-based collection from local IPs matters; a VPN spot-check is better than nothing but not a substitute for scheduled local runs.

Three prompt categories (and why each matters)

Organise every prompt into one of three buckets. Each bucket tells you something different.

1. Branded / direct — your sentiment baseline

These queries assume the buyer already knows your name. They reveal how AI describes and frames you — accuracy, tone, and positioning.

Templates:

“What is [Brand]?”
“Is [Brand] good for [use case]?”
“What do people think of [Brand]?”
“Is [Brand] legit and reliable for [specific context]?”

What you learn: Factual errors, outdated product descriptions, negative framing, missing differentiators. Fix these before investing in category discovery — if AI misdescribes you on branded queries, category prompts will inherit the same problems.

Benchmark: Strong brands should appear on 90%+ of branded prompts across platforms. Below 70% signals entity confusion or weak third-party validation.

2. Category / non-branded — highest discovery value

These are queries from buyers who do not know you yet. This is where GEO wins pipeline.

Templates:

“What’s the best [category] for [specific use case / audience]?”
“I need a [product/service] that does [specific need] — what do you recommend?”
“What should I look for when choosing a [category]?”
“Best [category] in [Singapore / Japan / city] for [audience]?”

What you learn: Whether you appear in the consideration set at all. Most brands are invisible here initially — that is normal. The goal is measurable improvement month over month.

Benchmark: WebFX’s GEO benchmarks cite 15–25% visibility on tracked queries as “good,” with 30–50%+ as strong performance on informational/category queries. Discovered Labs notes market leaders often exceed 30% citation rate on core category queries.

Aim for 15–25% mention rate on your core category prompts as an initial target. With sustained effort, brands in competitive categories often reach 30%+.

3. Comparison — competitive positioning

These queries surface you alongside named alternatives. They drive shortlist decisions.

Templates:

“[Brand] vs [Competitor]”
“What are alternatives to [Competitor]?”
“Which is better for [use case], [Brand] or [Competitor]?”
“Compare [Brand] and [Competitor] for [specific need]”

What you learn: Whether AI recommends you, positions you as second choice, or omits you entirely when a competitor is named. Also surfaces which competitors AI treats as category defaults.

Benchmark: Track share of voice — your mentions ÷ total brand mentions in comparison prompts. Above 25% SOV on comparison queries is a healthy starting point in contested categories.
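The share-of-voice formula above can be sketched in a few lines. The brand names and answers here are invented for illustration, and counting one mention per brand per answer is an assumption about how you log.

```python
def share_of_voice(brand, mentions_per_answer):
    """Brand mentions divided by all brand mentions across comparison answers."""
    total = sum(len(names) for names in mentions_per_answer)
    if total == 0:
        return 0.0
    ours = sum(names.count(brand) for names in mentions_per_answer)
    return ours / total

# Each inner list holds the brands named in one logged comparison answer.
answers = [
    ["YourBrand", "CompetitorA"],
    ["CompetitorA", "CompetitorB"],
    ["YourBrand", "CompetitorA", "CompetitorB"],
    ["CompetitorA"],
]
sov = share_of_voice("YourBrand", answers)
print(f"{sov:.0%}")  # 25%
```

Here the brand is named 2 times out of 8 total brand mentions, which sits exactly at the 25% threshold.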

What to log on every run

Create these columns in your spreadsheet:

Column	What to record	Why it matters
Date	Week of run	Trend over time
Prompt	Exact query text	Reproducibility
Platform	ChatGPT / Perplexity / Gemini	Platforms diverge
Country	SG, JP, KR, etc.	Answers vary by market
Mentioned?	Y / N	Core visibility metric
Cited with link?	Y / N / N/A	Mention ≠ citation
Position	1st, 2nd, 3rd named, or “listed”	Prominence in answer
Sentiment	Positive / neutral / negative / mixed	How AI frames you
Competitors named	List	SOV denominator
Sources cited	URLs if shown	PR/content targets
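If you prefer to keep the log as a CSV file rather than typing into a sheet, the columns above map onto a simple record type. This is a minimal sketch: the field names, the semicolon-separated list format, and the sample values are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class LogRow:
    date: str               # week of run
    prompt: str             # exact query text
    platform: str           # ChatGPT / Perplexity / Gemini
    country: str            # SG, JP, KR, etc.
    mentioned: str          # "Y" or "N"
    cited_with_link: str    # "Y", "N" or "N/A"
    position: str           # 1st, 2nd, 3rd named, or "listed"
    sentiment: str          # positive / neutral / negative / mixed
    competitors_named: str  # semicolon-separated list (assumed format)
    sources_cited: str      # semicolon-separated URLs (assumed format)

def to_csv(rows):
    """Serialise log rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(LogRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()

# Made-up sample run, not real data.
sample = LogRow("2026-06-15", "What is Quratic?", "ChatGPT", "SG",
                "Y", "N", "1st", "neutral", "Profound;Peec", "")
csv_text = to_csv([sample])
print(csv_text)
```

The resulting text can be pasted or imported into Google Sheets as the weekly log.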

Mention vs citation: do not merge them

They are different signals:

Mention — AI names your brand in the answer text, often without a link
Citation — AI links to a specific URL as a source

Perplexity almost always cites with numbered links. ChatGPT often mentions brands without linking — especially on browse-enabled queries where it synthesises rather than footnotes.

Log both separately. A brand with high mentions but zero citations may have awareness without click path. A brand with citations but no mentions on category prompts may only appear as a footnote on third-party listicles.
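Keeping the two signals apart is easy to do per platform once the log exists. The following sketch assumes each logged run is reduced to a `(platform, mentioned, cited)` tuple using the Y / N / N/A values from the logging columns. The rows are invented.

```python
from collections import defaultdict

def rates_by_platform(rows):
    """Return {platform: (mention_rate, citation_rate)} from logged runs."""
    counts = defaultdict(lambda: [0, 0, 0])  # runs, mentions, citations
    for platform, mentioned, cited in rows:
        c = counts[platform]
        c[0] += 1
        c[1] += mentioned == "Y"
        c[2] += cited == "Y"
    return {p: (m / n, c / n) for p, (n, m, c) in counts.items()}

# Invented example rows: (platform, Mentioned?, Cited with link?)
rows = [
    ("ChatGPT", "Y", "N"),
    ("ChatGPT", "Y", "N"),
    ("ChatGPT", "N", "N/A"),
    ("ChatGPT", "Y", "Y"),
    ("Perplexity", "Y", "Y"),
    ("Perplexity", "N", "N/A"),
]
rates = rates_by_platform(rows)
print(rates)
```

In this made-up sample, ChatGPT shows a 75% mention rate against a 25% citation rate, which is the "awareness without click path" pattern described above.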

Spreadsheet template: starter rows

Copy this structure into Google Sheets or Notion:

Sheet 1 — Prompt library

ID	Category	Prompt template	Filled example	Country	Priority
B1	Branded	What is [Brand]?	What is Quratic?	SG	High
C1	Category	Best [category] for [audience] in [country]?	Best AI visibility tool for marketing teams in Singapore?	SG	High
X1	Comparison	[Brand] vs [Competitor]	Quratic vs Profound	SG	High
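Filling the bracketed templates by hand gets tedious across several markets. A small helper like the one below can do it. The `fill` function and the `values` dictionary are hypothetical names, and placeholders with no value are deliberately left visible so gaps are easy to spot.

```python
import re

def fill(template, values):
    """Replace each [placeholder] with its value; leave unknown ones as-is."""
    return re.sub(
        r"\[([^\]]+)\]",
        lambda m: values.get(m.group(1), m.group(0)),
        template,
    )

values = {
    "Brand": "Quratic",
    "Competitor": "Profound",
    "category": "AI visibility tool",
    "audience": "marketing teams",
    "country": "Singapore",
}

category_prompt = fill("Best [category] for [audience] in [country]?", values)
comparison_prompt = fill("[Brand] vs [Competitor]", values)
partial_prompt = fill("Is [Brand] good for [use case]?", values)

print(category_prompt)    # Best AI visibility tool for marketing teams in Singapore?
print(comparison_prompt)  # Quratic vs Profound
print(partial_prompt)     # Is Quratic good for [use case]?
```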

Sheet 2 — Weekly log

One row per prompt × platform × week with the logging columns above.

Sheet 3 — Summary dashboard

Pivot or manual counts:

Mention rate by category (runs with a mention ÷ total runs)
Citation rate by category
SOV on comparison prompts
Week-over-week delta
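If the pivot table feels fiddly, the same dashboard numbers can be computed from the weekly log directly. This is a sketch under assumptions: rows are reduced to `(week, category, mentioned)` tuples, week labels such as "W1" are arbitrary, and the data is invented.

```python
from collections import defaultdict

def mention_rate_by_week(rows):
    """Return {(week, category): mention_rate} from (week, category, mentioned) rows."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for week, category, mentioned in rows:
        totals[(week, category)] += 1
        hits[(week, category)] += mentioned == "Y"
    return {key: hits[key] / totals[key] for key in totals}

def week_over_week(rates, category, prev_week, this_week):
    """Change in mention rate between two weeks, in percentage points."""
    return (rates[(this_week, category)] - rates[(prev_week, category)]) * 100

# Invented log: four category runs in each of two weeks.
rows = [
    ("W1", "Category", "Y"), ("W1", "Category", "N"),
    ("W1", "Category", "N"), ("W1", "Category", "N"),
    ("W2", "Category", "Y"), ("W2", "Category", "Y"),
    ("W2", "Category", "N"), ("W2", "Category", "N"),
]
rates = mention_rate_by_week(rows)
delta = week_over_week(rates, "Category", "W1", "W2")
print(rates[("W1", "Category")], rates[("W2", "Category")], delta)  # 0.25 0.5 25.0
```

Citation rate works the same way if you pass the "Cited with link?" value in place of "Mentioned?".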

Industry prompt packs (fill in your brand and market)

Adapt these for your vertical. Replace [country] with your primary market — Singapore, Indonesia, Japan, etc.

E-commerce and retail

Type	Example prompt
Branded	“Is [Brand] legit and reliable for online shopping in [country]?”
Category	“What’s the best online shopping platform for [electronics/fashion] in [Singapore/Indonesia]?”
Category	“Where can I buy [product type] online with fast delivery in [city]?”
Comparison	“[Brand] vs [Competitor]: which is better for [fast delivery / returns / pricing]?”
Comparison	“What are good alternatives to [Competitor] in [country]?”

Banking and fintech

Type	Example prompt
Branded	“Is [Brand] safe and trustworthy for digital banking / payments?”
Category	“What’s the best digital bank or e-wallet in [country] for [freelancers / SMEs / students]?”
Category	“What’s the cheapest way to send money from [country A] to [country B]?”
Comparison	“[Brand] vs [Competitor] for lower fees / better savings rates?”

Travel and hospitality

Type	Example prompt
Branded	“What is [Brand] known for as a hotel / travel brand?”
Category	“Best budget-friendly hotels in [city] for [families / solo travellers]?”
Category	“Best flight booking app for travel within [Southeast Asia]?”
Comparison	“[Brand] vs [Competitor] for a trip to [city]?”

Healthcare and wellness

Type	Example prompt
Branded	“Is [Brand] a trusted clinic / telehealth provider in [country]?”
Category	“Best telehealth app in [country] for [mental health / general consultations]?”
Category	“Where can I find a good [dermatologist / dentist] in [city]?”
Comparison	“[Brand] vs [Competitor] for [specific service]?”

Real estate and property

Type	Example prompt
Branded	“What is [Brand] known for in the property market?”
Category	“Best platform to find a rental apartment in [city]?”
Category	“How do I find a reliable real estate agent in [city]?”
Comparison	“[Brand] vs [Competitor] for buying / renting property in [city]?”

B2B SaaS (add-on pack)

Type	Example prompt
Branded	“What is [Brand] and who is it for?”
Category	“Best [category] software for [SMEs / enterprises] in [country]?”
Category	“I need a tool that [specific job-to-be-done] — what do you recommend?”
Comparison	“[Brand] vs [Competitor] for [use case]?”

Run each filled prompt through ChatGPT, Perplexity, and Gemini. Log mention, citation, position, sentiment, and competitors every week.

What the numbers mean — and what to do next

Once you have four weeks of data, read the patterns:

Mention rate below 15% on category prompts

Diagnosis: Invisible to discovery queries.

Fixes (in order):

Map which domains AI cites instead of you — run 10 prompts and record every source URL
Publish or update comparison pages and “best X for Y” content with direct answers in the first 100 words
Pursue third-party listicles — listicles are cited 3–5× more often than owned service pages for recommendation prompts
Add FAQPage and Organization schema, and keep a clear H2/H3 heading structure. Well-structured pages correlate with higher citation rates (Averi benchmarks: +15–30% from clear H2/H3 structure)
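For the schema step, FAQPage markup is a small JSON-LD object in the schema.org vocabulary. The sketch below generates one from question and answer pairs. The `faq_jsonld` helper is a hypothetical name and the sample pair is taken from this article's FAQ, so treat it as a starting shape rather than a complete implementation.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

markup = faq_jsonld([
    ("Is mention rate the same as citation rate?", "No. Report both."),
])
print(json.dumps(markup, indent=2))
```

The printed JSON would normally be placed in a `<script type="application/ld+json">` element on the page that visibly contains the same questions and answers.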

High mentions, low citations

Diagnosis: AI knows your name but does not link to you.

Fixes:

Improve page extractability — TL;DR blocks, dated publish metadata, citation-friendly statistics
Publish original data or benchmarks — Passionfruit’s research finds original research cited at 38–65% vs 6–15% for standard blog posts
Check robots.txt and any CDN or firewall bot rules: retrieval bots may be blocked at the CDN level even when robots.txt allows them
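The robots.txt part of that check can be scripted with Python's standard library. The sketch below parses an inline sample file rather than fetching a live one. The user-agent names listed are examples only, so confirm the current crawler names in each vendor's documentation.

```python
from urllib.robotparser import RobotFileParser

# Sample rules for illustration; replace with the text of your own robots.txt.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def blocked_agents(robots_txt, url, agents):
    """Return the user agents that robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]

# Example agent names (assumed list, not exhaustive).
agents = ["GPTBot", "PerplexityBot", "Googlebot"]
blocked = blocked_agents(SAMPLE_ROBOTS, "https://example.com/pricing", agents)
print(blocked)  # ['GPTBot']
```

This only covers robots.txt. A block applied at the CDN or firewall will not appear here and has to be checked in that provider's bot settings or server logs.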

Negative or inaccurate sentiment on branded prompts

Diagnosis: Entity confusion or outdated third-party narrative.

Fixes:

Correct factual errors on your About, product, and docs pages
Strengthen Wikipedia/Wikidata/Crunchbase/G2 profiles if applicable
Publish recent case studies and press that AI can retrieve
Do not argue with AI — fix the source material it reads

Strong on ChatGPT, weak on Perplexity (or vice versa)

Diagnosis: Platform-specific retrieval gaps — expected, not a failure.

Fixes: Each platform uses different sources. Optimise for the platform where your buyers actually research — and track all three rather than averaging.

Competitor dominates comparison prompts

Diagnosis: AI treats them as category default.

Fixes:

Dedicated “[You] vs [Them]” pages with fair, factual comparison tables
Earn mentions on review sites where the competitor already appears
Track whether competitor advantage is Google rank, AI Overview, or AI answer-only — fix the right layer

Benchmarks to report to leadership

Use these ranges when setting expectations (WebFX, Discovered Labs, Topify):

Metric	Starting point	Good	Strong
Category mention rate	0–10%	15–25%	30–50%+
Branded mention rate	70%+ target	90%+	95%+
Comparison SOV	Track relative	25%+	40%+
Citation rate (linked)	Lower than mention	15–25%	30%+
Platforms covered	1	3	3+ with Google AI Mode

Frame improvement as month-over-month delta, not absolute perfection. Passionfruit’s 11.2M citation study found 68% of query citations disappear month-to-month — consistency of measurement matters more than any single week’s snapshot.
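For a leadership report, it can help to label each metric against the bands in the table. The sketch below uses the lower bound of each "Good" and "Strong" range as a threshold. The table leaves gaps between bands (for example 10–15%), so treating everything under the "Good" bound as "below good" is an assumption of this sketch.

```python
# Lower bounds in percent, taken from the "Good" and "Strong" columns above.
BANDS = {
    "category_mention_rate": {"good": 15, "strong": 30},
    "branded_mention_rate": {"good": 90, "strong": 95},
    "comparison_sov": {"good": 25, "strong": 40},
    "citation_rate": {"good": 15, "strong": 30},
}

def band(metric, percent):
    """Label a metric value as 'strong', 'good', or 'below good'."""
    limits = BANDS[metric]
    if percent >= limits["strong"]:
        return "strong"
    if percent >= limits["good"]:
        return "good"
    return "below good"

print(band("category_mention_rate", 18))  # good
print(band("comparison_sov", 42))         # strong
print(band("branded_mention_rate", 72))   # below good
```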

When to move beyond the spreadsheet

Manual tracking breaks down around 30+ prompts × 3 platforms × weekly cadence — roughly 90+ runs per week before accounting for copy-paste and sentiment scoring.

Free starting points worth trying:

Ahrefs’ free AI visibility checker — batch prompts against your brand automatically
HubSpot’s AI Search Grader — automated brand visibility scan

Both are useful for a one-time baseline. They typically lack country-level collection, competitor SOV over time, and scheduled refresh — the gaps that matter for Asian markets.

Paid continuous monitoring (when manual cost exceeds tool cost):

Global platforms: Profound, Peec, Otterly
Asia-focused: Quratic — browser collection across ChatGPT, Perplexity, Google AI Mode, Gemini, and Google Rankings in SG, JP, KR, MY, ID, HK

The decision rule: if you are still acting on spreadsheet data and leadership asks for weekly SOV, upgrade. If you are not logging consistently yet, a paid tool will not fix the discipline problem.

90-day improvement loop

Phase	Weeks	Focus
Baseline	1–4	Build prompt library, log weekly, no content changes yet
Diagnose	5–6	Identify top 5 invisible category prompts and top 3 cited competitor domains
Fix	7–10	Content refresh, comparison pages, third-party outreach, schema
Measure	11–12	Compare mention rate and SOV to baseline; report delta to leadership

Discovered Labs cites 40–60% improvement in citation frequency within 3–6 months for teams executing systematically — realistic if you fix sources AI already trusts, not only your homepage.
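When reporting the delta in the Measure phase, state whether it is a change in percentage points or a relative change, because the two read very differently. The quoted 40–60% figure does not specify which it is, so this sketch simply computes both. The rates used are invented.

```python
def improvement(baseline_rate, current_rate):
    """Return (percentage-point change, relative change in percent)."""
    points = round((current_rate - baseline_rate) * 100, 1)
    if baseline_rate == 0:
        return points, None  # relative change is undefined from a zero baseline
    relative = round((current_rate - baseline_rate) / baseline_rate * 100, 1)
    return points, relative

# Invented example: mention rate moves from 10% at baseline to 15% in weeks 11-12.
points, relative = improvement(0.10, 0.15)
print(points, relative)  # 5.0 50.0
```

A move from 10% to 15% is 5 percentage points but a 50% relative improvement, so label the number explicitly in the report.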

FAQ

How many prompts do I really need?

20–30 to start — enough coverage without drowning in manual work. Expand when the first set drives decisions. Passionfruit’s benchmark protocols often use ~30 prompts across platforms for comparable audits.

ChatGPT, Perplexity, Gemini — is that enough?

Yes for a baseline. Add Google AI Mode / Overviews if your buyers use Google for category research. Add Copilot for enterprise B2B. See our platform guide for details.

Should prompts be in English or local language?

Match your buyer. English for Singapore business queries; Japanese for Tokyo B2B; Bahasa Indonesia for Indonesian consumer queries. The same intent in a different language produces different answers.

Is mention rate the same as citation rate?

No. Report both. Mention rate = brand visibility. Citation rate = linked source visibility. Perplexity-heavy strategies skew citation; ChatGPT-heavy strategies skew mention.

Can I improve GEO without creating new content?

Partially. Fixing structured data, refreshing existing pages, earning third-party mentions, and correcting entity profiles can move numbers without a content sprint. Category discovery prompts usually require new or substantially updated comparison and listicle-aligned content.

How does this connect to Google rank?

GEO and SEO overlap but differ. Track both in a split-screen report — organic rank, AI Overview ownership, and AI mention rate on the same intents.

Skip the spreadsheet setup — start a free Quratic trial with scheduled prompt runs across six Asian markets, or use this playbook to baseline manually first.

