When ChatGPT searches the web to answer a prompt, it doesn’t just run one search. It breaks the prompt into 2–4 “fan-out queries” — smaller, more specific searches it uses to pull information from across the web.
In the marketing community, there’s a growing belief that if you rank for these fan-out queries, you’ll consistently get cited by LLMs.
But we wanted to verify and quantify this: How much overlap is there between fan-out query rankings and the sources ChatGPT actually cites? Do the sources cited in a ChatGPT response rank on Google for the fan-out queries? If so, how highly do they rank?
We focused on ChatGPT because, across dozens of clients, it drives the most referred traffic.
We looked at 100 different buying-intent prompts, each generating 2–4 fan-out queries, and compared the sources ChatGPT cited against the Google and Bing search results for each fan-out query to see how much overlap there actually was.
In total, only 27% of cited sources ranked on Google for the fan-out queries used by ChatGPT. Bing was slightly lower at 23%. Accounting for the ~10% overlap between Google and Bing results, only about 40% of cited sources appeared somewhere in the first 10 pages of either search engine for those queries.
That means, on average, more than half (60%) of cited sources in ChatGPT answers don’t rank in the first 10 pages of Google or Bing for the fan-out queries.
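The combined figure follows from simple inclusion-exclusion over the two result sets. A quick sketch using the study’s rounded percentages (assuming “~10% overlap” means 10% of cited sources ranked on both engines):

```python
# Inclusion-exclusion on the study's rounded percentages (assumption:
# "~10% overlap" means 10% of cited sources ranked on BOTH engines).
google = 0.27  # share of cited sources ranking on Google
bing = 0.23    # share of cited sources ranking on Bing
both = 0.10    # share ranking on both engines

either = google + bing - both  # ranked on at least one engine
neither = 1 - either           # ranked on neither

print(f"ranked on either engine: {either:.0%}")   # -> 40%
print(f"ranked on neither:       {neither:.0%}")  # -> 60%
```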
We also observed a wide range: for some prompts, there was 100% overlap between citations and fan-out query rankings. For others, none of the cited sources ranked within the first 10 pages for those queries.
This data doesn’t invalidate the connection between traditional search rankings and AI search visibility. According to our study, ChatGPT still searches the web for 80%+ of product-related prompts. And as we’ve written about before, these bottom-of-funnel prompts are where AI visibility matters most.
But it adds to a growing truth about AI search: LLMs are fundamentally less predictable and less “hackable” than traditional SEO.
More specifically, trying to produce single pieces of content to “rank” for specific AI search prompts, in this case by targeting fan-out queries, doesn’t make sense for AI search. It’s fuzzier and less direct than that.
As a result, brands are better off doing authentic marketing — including publishing extremely detailed and thorough content about their products, use cases, features, and benefits — rather than trying to “hack” their way into ChatGPT by targeting specific fan out queries, or as we’ve talked about in previous articles, with tactics like adding FAQ pages, bullet point summaries, or llms.txt files.
You don’t know the true prompts your users are entering. And even if you did, this study shows that targeting fan-out queries gets you cited less than half of the time. So instead of a content strategy aimed at specific prompts, focusing on product-centric content that covers relevant topic areas makes more sense.
Below, we’ll explore these implications further, along with the obvious next question: if only 40% of citations come from the fan-out query SERPs, where does ChatGPT get the remaining 60%?
A Note on Methodology
If you’ve been following us, you’ll know that we believe bottom-of-funnel, product-centric prompts are the only ones worth worrying about for AI search. Top-of-funnel queries are typically answered by LLMs without mentioning or linking to brands or websites.
All 100 prompts we tested had buying intent (e.g., “best help desk software for small teams,” “top accounts payable automation tools,” or “best standing desk for home office”). The prompts covered B2B, B2C, software, services, and physical products.
We used the browser UI, not the API, so the data would reflect what a real user sees.
Of the 100 prompts, 83 triggered a web search, meaning 17% of the time ChatGPT answered without searching the web at all. When a web search was triggered, each prompt generated 2–4 fan-out queries.
To pull Google’s SERPs for the fan-out queries, we used SerpAPI. For Bing, we ran the searches manually because the API returned unreliable results, so the dataset for Bing is smaller.
Finally, we compared ChatGPT’s cited sources against Google and Bing SERPs at the prompt level, by category, and by domain match (vs exact URL match).
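The matching logic itself is simple to sketch. Below is an illustrative version of the exact-URL vs. domain-level comparison, not our exact pipeline; the normalization choices (lowercasing, dropping `www.`, query strings, and trailing slashes) are assumptions:

```python
from urllib.parse import urlparse

def normalize(url: str) -> str:
    """Normalize a URL for exact-match comparison: lowercase host,
    drop 'www.', query strings, fragments, and trailing slashes."""
    p = urlparse(url)
    host = p.netloc.lower().removeprefix("www.")
    return f"{host}{p.path.rstrip('/')}"

def domain(url: str) -> str:
    """Extract just the normalized domain for domain-level matching."""
    return urlparse(url).netloc.lower().removeprefix("www.")

def overlap(cited: list[str], serp: list[str], by: str = "url") -> float:
    """Fraction of cited URLs found in the SERP, by exact URL or by domain."""
    key = normalize if by == "url" else domain
    serp_keys = {key(u) for u in serp}
    return sum(1 for u in cited if key(u) in serp_keys) / len(cited)

# Illustrative inputs, not real study data:
serp = ["https://www.zapier.com/blog/best-video-conferencing"]
cited = ["https://zapier.com/blog/best-video-conferencing/"]
print(overlap(cited, serp, by="url"))  # -> 1.0
```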
We also analyzed the sources ChatGPT cited that didn’t appear in Google or Bing SERPs to see what we could learn about where those citations come from — and your chances of being one of them.
The Overlap Between ChatGPT Sources and Google’s SERP Varies Wildly from Prompt to Prompt
20 of the prompts had zero cited sources found in the fan-out SERPs — despite the fact that ChatGPT chose to search the web. Only one prompt out of the 100 we tested had 100% overlap.

A logical follow-up question: is this wide range of overlap happening because ChatGPT relies more heavily on search for some types of prompts than others (where its training data is enough)?
To find out, we grouped the prompts by business type:

While there is some variability, there’s less variation across categories than you might expect. Physical products had the lowest overlap, which is what we’d expect since ChatGPT is likely tapping into features like Google Shopping that weren’t considered in this study.
B2C SaaS also showed weaker overlap between cited sources and the SERP, but, again, this is to be expected: these products are typically surfaced through app stores rather than directly from their websites.
This tells us that the variability comes more from how ChatGPT weighs search results versus training data than from what the user is looking to buy.
We Went 10 Pages Deep, But ChatGPT Doesn’t for Most of Its Sources
Of the sources in our study that did appear in Google’s results for the fan-out queries, page 1 alone accounted for a third of all matches. The top three pages of the SERP contained ~66% of the matches, with a significant decline after that.
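Bucketing ranking positions into SERP pages the conventional way (10 results per page) is all this aggregation requires. A simplified sketch; the positions below are illustrative, not our raw data:

```python
from collections import Counter

RESULTS_PER_PAGE = 10  # standard Google SERP page size

def serp_page(position: int) -> int:
    """Map an absolute ranking position (1-based) to its SERP page."""
    return (position - 1) // RESULTS_PER_PAGE + 1

# Illustrative positions for matched citations (not the study's dataset).
match_positions = [2, 5, 7, 9, 11, 14, 23, 28, 35, 47, 82, 95]

pages = Counter(serp_page(p) for p in match_positions)
total = len(match_positions)
for page in sorted(pages):
    print(f"page {page}: {pages[page]} matches ({pages[page] / total:.0%})")
```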

This confirms what SEOs (including us) have observed, and frankly hoped for: ranking higher improves your chances of getting cited.
But this raises the next important question: if ChatGPT is only pulling from the live web about 40% of the time on average, where is it getting the rest of its sources?
Where Does ChatGPT Get the Other 60% of Sources It Cites?
At the end of the day, where ChatGPT is pulling its sources from is a black box. But we do have a few theories based on what we observed in the data.
ChatGPT May Be Running Additional Queries Behind the Scenes
We can only see the fan-out queries that run in our browser. It’s likely, however, that ChatGPT is running additional searches on OpenAI’s servers that we never see.
As QueryBurst has documented, it would make sense for OpenAI to distribute searches across multiple servers simultaneously — using different search engines, running different query variations, etc. If it did all of this sequentially in the browser, responses would be painfully slow.
Response Caching
ChatGPT may also be remembering or caching results from past users’ queries to save resources. OpenAI offers developers discounts when they use cached responses in their builds, so it makes sense that OpenAI would use the same approach internally. If millions of users are asking similar buying-intent questions, it would be cheaper and faster to cache frequently retrieved sources than to run a fresh web search every time.
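Conceptually, this kind of response caching is just a keyed store with a time-to-live. The sketch below is purely illustrative and says nothing about OpenAI’s actual implementation:

```python
import time

class TTLCache:
    """Minimal time-to-live cache: entries expire after ttl seconds.
    Purely illustrative; not OpenAI's implementation."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:  # stale entry: evict and miss
            del self._store[key]
            return None
        return value

# If many users ask similar buying-intent questions, cached retrieval
# results can be reused instead of re-running the web search:
cache = TTLCache(ttl=600)  # e.g., a 10-minute lifetime
cache.set("best video conferencing software", ["zoom.us", "meet.google.com"])
```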
Training Data
There’s evidence that ChatGPT retains specific details from its training data verbatim. Research presented at NeurIPS has demonstrated that large language models can memorize and reproduce training text word for word. If ChatGPT can remember exact text snippets, it’s conceivable it could also remember exact URLs.
In fact, we found several cases in our data where ChatGPT cited a real, live URL but gave it an outdated title — one that no longer matches the current page. For example:

In another example, ChatGPT cited the exact same URL twice, with different titles, in the same source list.
Here are the two titles ChatGPT showed:
The best video conferencing software in 2026 | Zapier
The best video conferencing software in 2025 | Zapier
And here’s the title that appeared in Google’s SERP for the fan-out query, which is also the actual title of the article as of this writing:
The best video conferencing software for teams in 2026
One explanation for both cases is that ChatGPT is pulling the URL and its old title from training data, not from a live web search. In the second example, the duplicate may have come from pulling the URL once from training data and once from a live web search.
Caching doesn’t explain this, since OpenAI’s own documentation shows caches typically last 5–10 minutes, and sometimes up to an hour, not the months or years separating these outdated titles from the current ones.
Hallucinations
In our dataset, we found roughly 10% of citations led to error pages. Recent academic research puts the number higher. A February 2026 study found hallucination rates above 14% for cited sources, and an earlier 2023 study published in Nature documented similar patterns. These are citations that look real — plausible URLs with plausible titles — but point to pages that don’t exist.
So where does all of this leave us?
GEO ≠ SEO…
The core takeaway is that you can’t apply the same mindset or strategies to GEO or AIO that you use in traditional SEO. In SEO, we’re used to a repeatable, predictable playbook: write a relevant, helpful, quality article targeting keyword X, get your on-page SEO right, build backlinks, and rank for X. But LLM answers are far too unpredictable for that approach to carry over.
SparkToro’s recent research reinforces this. They found that the brand names that LLMs recommend are wildly inconsistent. Different users can enter the same exact prompt and not only get a different list of recommendations, but also get those recommendations in different orders.
Add to that the fact that in AI tools, unlike traditional Google search, users ask essentially the same question in wildly different ways, and that LLMs factor a ton of deeply personal context into every user’s prompt, and you can see how GEO is less controllable and predictable than SEO.
On top of that, you get different fan-out queries for the same exact prompt. The LLM won’t always search the web even for the same prompts. If it does search the web, the amount it relies on that web search versus training data can differ wildly. Also, the model’s capabilities change based on whether the user is paying or using a free account. I could go on, but you get the point.
And, all of that unpredictability only accounts for roughly 40% of the cited sources.
That is why a content strategy built around ranking for specific queries in order to be cited is impractical and ultimately ineffective.
… But SEO is Necessary for GEO
While the nebulous nature of LLMs is uncomfortable for most marketers and business owners, there are patterns to how LLMs operate, and actions you can take to get mentioned more often.
Be Available for Training Data
OpenAI admits to using web crawlers both for live web search and for training data, which means you have a chance of getting into ChatGPT’s training data just by putting content on your website. (More on what this content should be below. Hint: It’s not just “easy to read content with digestible chunks”.)
Increase Your Topical Authority
In the SparkToro piece we mentioned earlier, they note that certain brands are mentioned repeatedly across responses. The top three most-mentioned brands were cited 64% of the time for the same prompt on ChatGPT.
And our own data suggests ChatGPT may be recognizing that certain domains have more topical authority than others.
If we count all sources that had some page on their domain rank for a fan-out query — instead of requiring the exact URL to match — the percentage of cited sources appearing in Google’s SERPs jumps from 27% to about 50%.

This suggests that there’s some correlation between what ChatGPT and Google consider to be an authoritative site. And, ChatGPT may be favoring domains it recognizes as topically authoritative, even when it doesn’t cite the exact page that ranks. Your domain’s broader presence across a topic may matter more than whether one specific page ranks for one specific query.
One of the most effective ways to build that kind of topical authority is through SEO. Not the “optimize your meta tags and add FAQ schema” kind of SEO, but the kind where you’re consistently publishing detailed, substantive content that addresses the real pain points your customers have. The more comprehensively your domain covers a topic area, the more likely ChatGPT is to recognize you as a relevant source when someone asks about that topic, regardless of which specific fan-out query it runs.
78% of Cited Sources Were Typical SEO-Style Content, Whether or Not They Ranked
One of the big questions floating around marketing circles right now, in various forms, is whether content type needs to change. Do LLMs even cite traditional marketing or SEO-style blog posts, such as list posts? And, importantly for this study, if you produce a traditional SEO-style post like a list post but don’t yet rank for your target keyword (or related keywords), do you still have a shot at being cited by ChatGPT for topically similar prompts? Our data says yes.
For example, here’s a post by Calendly that was ranking for the fan-out query “tools for scheduling meetings with clients without back and forth emails scheduling tool”.

It looks very much like a typical SEO-style post that we would write for that keyword. And they put themselves first on the list, which we always recommend.
In fact, 78% of all cited sources for these high buying-intent keywords were listicles, product pages, homepages, pricing pages, and the like: essentially webpages selling products or services, i.e., typical marketing or SEO content.

While this makes sense for the 40% of sources that were in the fan-out query SERPs, what may surprise some marketers is that it was also true of the 60% of sources not found in Google or Bing’s SERPs. The number only dropped slightly, to ~74%.
Said another way, even most of the content ChatGPT cited that wasn’t ranking for the fan-out queries looked and felt like content you would publish to rank for those queries: list posts, landing pages, etc.
This aligns with the theories discussed above about where else ChatGPT could get these citations, if not from the fan-out queries: caching past searches, or encountering similar content in web crawls when it was last trained.
This reinforces what we’ve been saying: the best way to influence LLM responses is to produce bottom-of-funnel, product-centric content aimed at ranking for similar queries in traditional Google search.
What we can add to that understanding with this data is that you don’t need to target and rank for fan-out queries that you can’t even truly find. Instead, creating typical SEO content is the best way to be available for training data and build topical authority, which, again, are your best bets for showing up in LLM answers.
So, focus on building domain-level topical authority over time. What we recommend and do for clients is to identify your strengths and find Google keywords that map to those strengths. Then, write helpful, relevant, detailed content for those keywords.
That’s an actual content strategy.
How This Looks for a Real Client, Toro TMS
Toro TMS is a good example of how publishing content builds AI search visibility. They’re a relatively young brand that hadn’t done much content marketing before we started working with them. Once we began producing product-centric content targeting relevant Google keywords, their ChatGPT visibility followed.



They published thorough, detailed content about their product and the problems it solves, targeted at real Google keywords where their product is a genuine answer. The AI visibility came as a byproduct of high-quality content marketing.