{"id":144,"date":"2026-06-11T06:54:01","date_gmt":"2026-06-11T06:54:01","guid":{"rendered":"https:\/\/maxaeo.ai\/blog\/?p=144"},"modified":"2026-06-11T07:18:38","modified_gmt":"2026-06-11T07:18:38","slug":"sources-chatgpt-cites","status":"publish","type":"post","link":"https:\/\/maxaeo.ai\/blog\/sources-chatgpt-cites\/","title":{"rendered":"What Sources Does ChatGPT Cite? Data From 184,212 Citations"},"content":{"rendered":"<p>What sources does ChatGPT cite when it answers a buying question \u2014 and are they the same ones Perplexity and Gemini lean on? No. Each platform pulls from a measurably different mix of editorial sites, vendor docs, Reddit threads, review platforms and reference pages. Optimize for the wrong mix and you stay invisible on the platform your buyers actually use.<\/p>\n<p>(Looking for how to cite ChatGPT in an academic paper? Different question \u2014 this page is a data study of the citations AI answers themselves contain.) We analyzed <strong>184,212 citations from 40,950 answer snapshots<\/strong> collected by MaxAEO&#39;s citation tracing between March 1 and May 31, 2026, and broke down exactly which sources ChatGPT, Perplexity and Gemini cite \u2014 by domain, by source type, and by query intent.<\/p>\n<h2>The short answer: what ChatGPT cites most<\/h2>\n<p><strong>ChatGPT cites editorial and news sites most \u2014 24% of citations in our dataset \u2014 followed by vendor-owned docs and blogs (21%), reference sites like Wikipedia (13%), review platforms (12%) and community forums such as Reddit (11%).<\/strong> No single domain dominates: the long tail does most of the work.<\/p>\n<p>That last point matters more than any top-10 list. In our data, the ten most-cited domains account for only <strong>14% of all ChatGPT citations<\/strong>. 
<a href=\"https:\/\/www.tryprofound.com\/blog\/chatgpt-citation-sources\" target=\"_blank\" rel=\"noopener\">Profound&#39;s study of ~730,000 cited ChatGPT conversations<\/a> (October\u2013December 2025) found nearly the same thing: the top 10 domains captured just 12% of citations, with Wikipedia \u2014 the single biggest domain \u2014 at only about 5%, even though it appeared in 18% of cited conversations.<\/p>\n<h3>The 10 domains ChatGPT cites most<\/h3>\n<p>From our 2026 B2B-weighted dataset, ChatGPT&#39;s most-cited individual domains:<\/p>\n<table>\n<thead>\n<tr>\n<th>Rank<\/th>\n<th>Domain<\/th>\n<th>Share of all ChatGPT citations<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<td>en.wikipedia.org<\/td>\n<td>4.8%<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>reddit.com<\/td>\n<td>2.1%<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>g2.com<\/td>\n<td>1.6%<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>forbes.com<\/td>\n<td>1.3%<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>youtube.com<\/td>\n<td>1.1%<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>techradar.com<\/td>\n<td>0.8%<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>capterra.com<\/td>\n<td>0.7%<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>gartner.com<\/td>\n<td>0.6%<\/td>\n<\/tr>\n<tr>\n<td>9<\/td>\n<td>linkedin.com<\/td>\n<td>0.5%<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>medium.com<\/td>\n<td>0.5%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Together: 14% \u2014 meaning <strong>86% of ChatGPT&#39;s citations go to everyone else<\/strong>. 
So the practical question is not &quot;how do I get onto the three domains ChatGPT loves.&quot; It is &quot;which <strong>source types<\/strong> does each platform trust for my category, and am I present on them?&quot; That is what the rest of this study answers.<\/p>\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" style=\"max-width:100%;height:auto\" loading=\"lazy\"  src=\"https:\/\/maxaeo.ai\/blog\/wp-content\/uploads\/2026\/06\/1781104689129-5-89134-1-1.png\" alt=\"Chart of what sources ChatGPT cites most compared with Perplexity and Gemini, broken down by source type from MaxAEO citation tracing data\"><\/figure>\n<h2>How we measured it: MaxAEO&#39;s citation tracing methodology<\/h2>\n<p>MaxAEO is an <a href=\"\/ai-visibility-metrics\">AI visibility tool<\/a> that runs tracked prompts against ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok and Google&#39;s AI surfaces every day, recording which brands get mentioned and which URLs get cited. For this study we isolated one slice:<\/p>\n<ul>\n<li><strong>Prompt set:<\/strong> 1,050 prompts across 14 B2B software categories (CRM, support, analytics, security, HR tech and others), mixing informational (&quot;what is\u2026&quot;, &quot;how does\u2026&quot;) and commercial (&quot;best\u2026&quot;, &quot;X vs Y&quot;, &quot;alternatives to\u2026&quot;) intent.<\/li>\n<li><strong>Platforms:<\/strong> ChatGPT (search enabled), Perplexity, and Gemini (with grounding), each prompt run weekly.<\/li>\n<li><strong>Window:<\/strong> March 1 \u2013 May 31, 2026 \u2014 40,950 answer snapshots, 184,212 extracted citations.<\/li>\n<li><strong>Classification:<\/strong> every cited URL bucketed into one of eight source types by domain rules plus manual review of ambiguous domains.<\/li>\n<\/ul>\n<p>One honest caveat: this is a <strong>B2B-software-weighted prompt set<\/strong>. 
Consumer queries about health, travel or news produce a different mix \u2014 Profound&#39;s consumer-population numbers above are the right reference point there. Percentages below are shares of all citations on each platform, not shares of answers.<\/p>\n<h2>ChatGPT vs. Perplexity vs. Gemini: the source-type mix<\/h2>\n<p>The same question gets sourced three different ways. The full breakdown from our 184,212 traced citations:<\/p>\n<table>\n<thead>\n<tr>\n<th>Source type<\/th>\n<th>ChatGPT<\/th>\n<th>Perplexity<\/th>\n<th>Gemini<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Editorial &amp; news (Forbes, TechCrunch, trade press)<\/td>\n<td><strong>24%<\/strong><\/td>\n<td>19%<\/td>\n<td>22%<\/td>\n<\/tr>\n<tr>\n<td>Vendor-owned (docs, product pages, company blogs)<\/td>\n<td>21%<\/td>\n<td>17%<\/td>\n<td><strong>26%<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Community &amp; forums (Reddit, Quora, Stack Overflow)<\/td>\n<td>11%<\/td>\n<td><strong>21%<\/strong><\/td>\n<td>9%<\/td>\n<\/tr>\n<tr>\n<td>Review &amp; comparison sites (G2, Capterra, Gartner)<\/td>\n<td>12%<\/td>\n<td>14%<\/td>\n<td>8%<\/td>\n<\/tr>\n<tr>\n<td>Reference (Wikipedia and other encyclopedic sites)<\/td>\n<td>13%<\/td>\n<td>7%<\/td>\n<td>6%<\/td>\n<\/tr>\n<tr>\n<td>Social &amp; video (LinkedIn, YouTube)<\/td>\n<td>8%<\/td>\n<td>6%<\/td>\n<td><strong>17%<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Academic &amp; government (.edu, .gov, research)<\/td>\n<td>5%<\/td>\n<td>11%<\/td>\n<td>5%<\/td>\n<\/tr>\n<tr>\n<td>Other \/ long tail<\/td>\n<td>6%<\/td>\n<td>5%<\/td>\n<td>7%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Three headline patterns:<\/p>\n<ol>\n<li><strong>ChatGPT is an editorial-and-reference engine.<\/strong> It over-indexes on journalism and Wikipedia relative to the other two.<\/li>\n<li><strong>Perplexity is the UGC engine.<\/strong> Community content takes roughly one citation in five \u2014 Reddit alone is its single biggest domain, consistent with <a 
href=\"https:\/\/www.semrush.com\/blog\/most-cited-domains-ai\/\" target=\"_blank\" rel=\"noopener\">Semrush&#39;s three-month, 100-million-citation study<\/a>, which found Reddit among the top sources on every platform but most concentrated on Perplexity.<\/li>\n<li><strong>Gemini trusts you to describe yourself.<\/strong> Vendor-owned content (26%) plus the social-and-video bucket (17%), which includes YouTube (a Google property), make up <strong>43% of its citations<\/strong> \u2014 the highest owned-plus-Google share of any platform.<\/li>\n<\/ol>\n<p>Citation slots differ too. Median distinct domains per answer: <strong>Perplexity 5, ChatGPT 4, Gemini 3<\/strong>. Fewer slots means each Gemini citation is harder to win.<\/p>\n<h2>What ChatGPT cites: editorial first, Wikipedia still over-weighted<\/h2>\n<p><strong>ChatGPT&#39;s mix rewards earned media.<\/strong> Editorial and news (24%) plus reference pages (13%) means more than a third of its sourcing is content you don&#39;t control \u2014 you earn your way in. On our commercial-intent prompts (&quot;best X for Y&quot;), an editorial roundup or trade-press comparison appeared in the citation set of <strong>71% of ChatGPT answers<\/strong>.<\/p>\n<p>Part of this tilt is contractual. OpenAI has signed content-licensing deals with the Associated Press, Axel Springer, the Financial Times, News Corp, Cond\u00e9 Nast, Hearst and Reddit, among others \u2014 licensed publishers&#39; content flows into retrieval with cleaner access than the open web. News content punches above its weight partly because OpenAI pays for it.<\/p>\n<p>Two behaviors change what gets cited:<\/p>\n<ul>\n<li><strong>Citations cluster early in conversations.<\/strong> Profound found a first-turn message is about <strong>2.5\u00d7 more likely<\/strong> to trigger web search (and therefore citations) than a tenth-turn message. 
Your brand&#39;s first impression is decided in turn one.<\/li>\n<li><strong>The mix is volatile.<\/strong> Semrush documented a sharp drop in Reddit and Wikipedia citation rates on ChatGPT after a retrieval-pipeline change in mid-September 2025, with professional sources like Forbes, Medium and LinkedIn gaining share within weeks. Whatever mix you measured last quarter is not this quarter&#39;s mix.<\/li>\n<\/ul>\n<p>The lever this points to: if ChatGPT is your priority platform, <a href=\"\/digital-pr-ai-citations\">digital PR aimed at the publications AI already trusts<\/a> buys more visibility per dollar than another on-site blog post.<\/p>\n<h2>What Perplexity cites: the Reddit and community engine<\/h2>\n<p><strong>Perplexity is the platform where user-generated content decides who gets recommended.<\/strong> Community and forum citations hit 21% in our dataset \u2014 nearly double ChatGPT&#39;s share \u2014 and academic\/government sources (11%) run at more than double the 5% seen on ChatGPT and Gemini, reflecting Perplexity&#39;s own index, which ranks discussion threads and primary sources aggressively.<\/p>\n<p>The commercial-intent numbers are starker. On &quot;best\/vs\/alternatives&quot; prompts, review sites plus community threads together supplied <strong>41% of Perplexity&#39;s citations<\/strong>. A skeptical Reddit thread from 2024 can outrank your polished comparison page in Perplexity&#39;s sourcing for years.<\/p>\n<p>A worked example from May 14, 2026 \u2014 the prompt <em>&quot;best ticketing system for B2B SaaS support teams&quot;<\/em>, run the same hour on all three platforms:<\/p>\n<ul>\n<li><strong>Perplexity<\/strong> cited two Reddit threads, G2, Capterra and one vendor blog.<\/li>\n<li><strong>ChatGPT<\/strong> cited a Forbes Advisor roundup, G2, a vendor doc and one Reddit thread.<\/li>\n<li><strong>Gemini<\/strong> cited two vendor comparison pages, a YouTube review and one trade-press article.<\/li>\n<\/ul>\n<p>Same question, three different juries. 
If AI-driven recommendations matter to your pipeline, <a href=\"\/reddit-ai-recommendations\">Reddit is upstream infrastructure for AI answers<\/a> \u2014 not optional.<\/p>\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" style=\"max-width:100%;height:auto\" loading=\"lazy\"  src=\"https:\/\/maxaeo.ai\/blog\/wp-content\/uploads\/2026\/06\/1781104689129-5-89134-2-1.png\" alt=\"Side-by-side comparison of ChatGPT, Perplexity and Gemini answers citing different source types for the same B2B software prompt\"><\/figure>\n<h2>What Gemini cites: Google&#39;s ecosystem and your own site<\/h2>\n<p><strong>Gemini gives the highest weight to content brands control: vendor docs, product pages and company blogs make up 26% of its citations, and YouTube pushes the social-and-video bucket to 17%.<\/strong> Both numbers lead the three platforms, while review sites (8%) and Wikipedia (6%) trail.<\/p>\n<p>The most actionable stat in our dataset sits here. On <strong>brand-direct prompts<\/strong> (&quot;what is [brand]&quot;, &quot;[brand] pricing&quot;, &quot;is [brand] secure&quot;), Gemini cited the brand&#39;s own domain in <strong>61% of answers<\/strong> \u2014 versus 38% for ChatGPT. If your documentation is thin, outdated or blocked from crawling, Gemini fills the gap with whatever third-party content exists, accurate or not. That makes documentation quality an <a href=\"\/ai-reputation-management\">AI reputation management<\/a> issue, not just a support issue.<\/p>\n<p>Gemini&#39;s behavior also correlates strongly with classic Google standing: pages ranking in the top organic results get cited far more often \u2014 the same pattern we document for <a href=\"\/google-ai-overviews-visibility\">Google AI Overviews inclusion<\/a>. Semrush adds a related signal from Google&#39;s AI Mode: LinkedIn appeared in roughly 15% of responses there, while Wikipedia fell to about 2%. 
Google&#39;s surfaces simply don&#39;t need an encyclopedia layer the way ChatGPT does.<\/p>\n<h2>Why the same question gets different sources<\/h2>\n<p>The differences are architectural, not random. Each platform retrieves from a different index with different ranking incentives:<\/p>\n<ul>\n<li><strong>ChatGPT<\/strong> retrieves through Bing&#39;s index plus OpenAI&#39;s own crawler (OAI-SearchBot) and the licensing layer above \u2014 which is why news and reference content punch above their weight.<\/li>\n<li><strong>Perplexity<\/strong> built its own retrieval index and explicitly boosts discussions and primary sources \u2014 which is why Reddit and .gov content surface so often.<\/li>\n<li><strong>Gemini<\/strong> grounds against Google Search, inheriting Google&#39;s ranking judgments and its preference for YouTube and authoritative first-party pages.<\/li>\n<\/ul>\n<p>One distinction worth keeping straight: <strong>training data is not citations<\/strong>. What a model &quot;knows&quot; from pre-training shapes unsourced answers; citations only appear when the platform retrieves live web pages at answer time. You can influence retrieval this quarter \u2014 training data, only over years.<\/p>\n<p>The consequence for marketers: <strong>&quot;AI visibility&quot; is not one channel.<\/strong> It is three or more retrieval systems with different source diets. A strategy that only feeds one diet \u2014 say, blog posts on your own domain \u2014 can win Gemini while staying invisible on ChatGPT and Perplexity. Answer engine optimization starts with knowing which diet your target platform eats.<\/p>\n<h2>Why public studies disagree (and how to read them)<\/h2>\n<p>If you&#39;ve seen claims that &quot;Wikipedia is 47% of ChatGPT citations&quot; and also &quot;Wikipedia is 5%,&quot; neither is necessarily wrong \u2014 they measure different things. 
Before taking any number into a budget meeting, check three things:<\/p>\n<ol>\n<li><strong>The metric.<\/strong> <em>Share of all citations<\/em> (our method, and Profound&#39;s) produces small-looking numbers because of the long tail. <em>Share of responses citing a domain at least once<\/em> (Semrush&#39;s weekly view) produces big numbers \u2014 Reddit touched ~60% of ChatGPT responses in early August 2025 by that measure. Same reality, different denominator.<\/li>\n<li><strong>The prompt set.<\/strong> Consumer health prompts pull .gov and medical sources; B2B software prompts pull G2 and trade press. A study&#39;s mix reflects its questions.<\/li>\n<li><strong>The date.<\/strong> The September 2025 retrieval shift moved domain-level numbers by tens of points within weeks. Citation data has a shelf life of about a quarter.<\/li>\n<\/ol>\n<p>This is why one-off screenshots are weak evidence for <a href=\"\/ai-visibility-metrics\">LLM brand tracking<\/a>: you need the same prompts, re-run on a schedule, with the metric definition held constant.<\/p>\n<h2>What this means for your AI visibility strategy<\/h2>\n<p>Map effort to each platform&#39;s actual source diet instead of spreading evenly:<\/p>\n<ol>\n<li><strong>Audit where you stand today.<\/strong> Run your 20 highest-intent category prompts on ChatGPT, Perplexity and Gemini. Log every cited URL and bucket it by the taxonomy above. The gaps tell you which source type to fix first.<\/li>\n<li><strong>For ChatGPT: earn editorial citations.<\/strong> Pitch data studies and expert commentary to the trade publications already appearing in your category&#39;s answers. 
Reference-style content \u2014 clear definitions, comparison tables \u2014 also travels well.<\/li>\n<li><strong>For Perplexity: invest in community presence.<\/strong> Founder-level participation in relevant subreddits, honest answers on Stack Overflow and Quora, and review velocity on G2\/Capterra move citation share faster than anything on your own domain.<\/li>\n<li><strong>For Gemini: harden your owned content.<\/strong> Complete, crawlable docs, a maintained pricing page, structured data and YouTube product content target the vendor-owned and social-and-video buckets that together make up 43% of its citation diet.<\/li>\n<li><strong>Re-measure monthly and report AI share of voice, not anecdotes.<\/strong> Citation mixes shift; September 2025 proved they can shift fast.<\/li>\n<\/ol>\n<p>Generative engine optimization, done this way, stops being a hype line and becomes an allocation decision: which source types, in which order, for which platform.<\/p>\n<h2>How to track which sources AI cites for your brand<\/h2>\n<p>The manual version: fixed prompt list, weekly runs, paste every citation into a spreadsheet, classify by type. It works \u2014 for about two weeks, one platform and one product category. Past that, daily sampling across six platforms is what reveals the volatility that actually changes decisions.<\/p>\n<p>That continuous version is what MaxAEO automates: it monitors brand mentions in ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok and Google&#39;s AI surfaces daily, traces every citation behind those answers, and tells you which specific source gap \u2014 a missing G2 profile, a stale Reddit thread, thin docs \u2014 is suppressing your recommendations on each platform. The fix list, not just the dashboard, is the point: that&#39;s how you <a href=\"\/digital-pr-ai-citations\">get recommended by ChatGPT<\/a> more often instead of just watching a score move.<\/p>\n<h2>Frequently asked questions<\/h2>\n<h3>Does ChatGPT cite sources in every answer?<\/h3>\n<p>No. 
ChatGPT only cites sources when it decides to search the web (or when search is forced). Profound&#39;s data shows search triggers most often on a conversation&#39;s first message \u2014 about 2.5\u00d7 more likely than by turn ten. Pure model-memory answers carry no citations at all, which is why brand perception there depends on training data, not links.<\/p>\n<h3>How do I see which sources ChatGPT used in an answer?<\/h3>\n<p>Search-enabled answers show inline link icons after the sentences they support, plus a sources panel listing every cited page at the end of the answer. Click any chip to open the underlying URL. If an answer shows no links, ChatGPT answered from model memory and used no retrievable sources.<\/p>\n<h3>Are the sources ChatGPT cites always real?<\/h3>\n<p>Citations in search-enabled answers are real, clickable URLs retrieved at answer time. The notorious fabricated references come from non-browsing mode, when users ask the model to &quot;list sources&quot; and it generates plausible-looking but sometimes nonexistent ones. For brand monitoring, only retrieval-backed citations are worth tracking.<\/p>\n<h3>Which platform is easiest to get cited on?<\/h3>\n<p>Usually Gemini for your own branded queries (61% own-domain citation rate in our data, if your docs are solid) and Perplexity for category queries, because community content and fresh primary sources can enter its index within days. ChatGPT&#39;s editorial-heavy mix takes longest \u2014 earned media has lead time.<\/p>\n<h3>How often does the AI citation mix change?<\/h3>\n<p>Treat any snapshot as valid for roughly one quarter. Semrush&#39;s 13-week study captured Reddit&#39;s share of ChatGPT responses falling from about 60% to about 10% within weeks of a single retrieval-pipeline change in September 2025. 
Continuous AI search monitoring exists precisely because these shifts arrive unannounced.<\/p>\n<h3>Does llms.txt help me get cited?<\/h3>\n<p>The evidence so far is thin \u2014 major platforms haven&#39;t confirmed honoring it, and we see no citation lift attributable to it in our tracking. Basics matter more: don&#39;t block <a href=\"https:\/\/platform.openai.com\/docs\/bots\" target=\"_blank\" rel=\"noopener\">OAI-SearchBot<\/a>, PerplexityBot or Google-Extended in robots.txt, keep docs crawlable, and build presence on the source types each platform already trusts.<\/p>\n<hr>\n<blockquote>\n<p>This article was created with AI assistance from original MaxAEO tracking data and reviewed by a human editor.<\/p>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>What sources does ChatGPT cite? We traced 184,212 citations across ChatGPT, Perplexity and Gemini to rank the domains and source types each trusts \u2014 and what to fix first.<\/p>\n","protected":false},"author":1,"featured_media":201,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-144","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/posts\/144","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/comments?post=144"}],"version-history":[{"count":2,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/posts\/144\/revisions"}],"predecessor-version":[{"id":249,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/posts\/144\/revisions\/249"}],"wp:featuredm
edia":[{"embeddable":true,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/media\/201"}],"wp:attachment":[{"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/media?parent=144"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/categories?post=144"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/maxaeo.ai\/blog\/wp-json\/wp\/v2\/tags?post=144"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}