How to Produce Content Optimised for LLM Citation: Evidence-Based Recommendations

A follow-up to "How People Actually Use ChatGPT." Practical advice for creators and publishers who want to make their work more likely to be retrieved, summarised, or cited by ChatGPT and similar LLMs.


If you've read what the data says about how people use ChatGPT, you know that nearly 80% of usage falls into three buckets: Practical Guidance, Seeking Information, and Writing. Users are either asking for facts, asking for advice they can adapt, or asking for help producing or refining text. When ChatGPT (or similar systems) answer those queries, they draw on training data and, when available, web search and external retrieval. Content that fits how people actually ask—and that is easy for models to parse and reuse—is more likely to show up in answers and get cited.

This post turns those findings into concrete recommendations for producing content that is optimised for LLM retrieval and citation.


1. Prioritise the two use cases where "sources" matter most

The NBER paper separates Seeking Information (factual lookups: "same for all users") from Practical Guidance (customised advice: tutoring, how-to, ideation). Both drive a large share of queries; both are where your content can be used as a source.

  • Seeking Information (~14–24% of messages and growing)
    People ask for facts about people, events, products, recipes, definitions, and current affairs. The paper describes this as "a very close substitute for web search." When ChatGPT has search or retrieval, it's this kind of query that pulls in and cites external pages.
    Implication: Invest in reference-style content: clear, factual, canonical answers to questions that many people ask the same way.

  • Practical Guidance (~29% of messages)
    How-to advice, tutoring, creative ideation, health/fitness/beauty. The model often synthesises multiple sources into custom advice. Content that states principles, steps, or frameworks in a reusable way is more likely to be reflected in those answers (and, where systems support it, cited).
    Implication: Produce structured guidance: how-tos, tutorials, checklists, and decision frameworks that can be summarised or adapted.

Recommendation: Focus your "citation optimisation" efforts on (a) factual, search-style content and (b) practical guidance (how-to, teaching, frameworks). These align with the two biggest non-writing use cases in the data.


2. Write for the way people ask: direct questions and clear intents

Roughly 49% of messages are Asking—users want information or advice to inform a decision. Asking is growing faster than Doing and gets higher satisfaction. So a lot of value comes from answering questions and supporting decisions.

  • Front-load answers. Put the direct answer or definition near the top (e.g. in the first paragraph or under a clear subheading). Models (and users) tend to latch onto the first coherent, on-topic block.
  • Use question-shaped headings. Headings that mirror real queries ("What is X?", "How do I…?", "When did…?") match how people phrase prompts and improve the chance your section is retrieved for that intent.
  • Provide decision support. Comparisons, criteria, pros/cons, and "when to use X vs Y" are exactly the kind of content that supports Asking. Structure them so they can be extracted as lists or short paragraphs.

Recommendation: Structure each piece around one or more explicit questions, with concise, quotable answers early in the section. This matches Asking-dominated usage and makes your content easier to cite.
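
To make the pattern concrete, here is a minimal, illustrative Python sketch (a toy, not any real retrieval pipeline). It splits a page into heading-anchored chunks and takes the first paragraph under each question-shaped heading; the structure recommended above is exactly what lets such a naive extractor recover a complete, quotable answer.

```python
import re

# A toy page in the recommended shape: question-shaped headings
# with the direct answer front-loaded in the first paragraph.
PAGE = """\
## What is retrieval-augmented generation?
Retrieval-augmented generation (RAG) is a technique in which a
language model fetches relevant documents at query time and uses
them as context for its answer.

It is now common in search-backed assistants.

## How do I make content easy to retrieve?
Use one clear question per section and state the answer in the
first paragraph under the heading.
"""

def first_answers(page: str) -> dict[str, str]:
    """Map each question-shaped heading to the first paragraph below it."""
    answers = {}
    for chunk in re.split(r"(?m)^## ", page)[1:]:
        heading, _, body = chunk.partition("\n")
        if heading.rstrip().endswith("?"):  # question-shaped heading
            first_para = body.strip().split("\n\n")[0].replace("\n", " ")
            answers[heading.strip()] = first_para
    return answers

for question, answer in first_answers(PAGE).items():
    print(f"Q: {question}\nA: {answer}\n")
```

If the direct answer were buried three paragraphs down, the same extractor (and, plausibly, a chunk-by-heading retrieval pipeline) would surface preamble instead of the answer.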


3. Make factual content "one right answer" friendly

The paper defines Seeking Information as "factual information that should be the same for all users" (e.g. Boston Marathon qualifying times by age and gender). Content that states facts unambiguously and in one place is easier for retrieval systems to surface and for the model to attribute.

  • One canonical formulation per concept or fact. Avoid scattering the same fact across many pages or wording it differently each time. Prefer a single "source of truth" page or section per topic.
  • Use lists, tables, and definitions. Structured data (dates, numbers, criteria, steps) is easier for models to extract and quote. Tables and numbered lists also improve scannability for both humans and systems.
  • Name entities and events clearly. Proper nouns, standardised terms, and consistent phrasing ("X is…", "Y refers to…") help retrieval match user queries and help the model point back to you.

Recommendation: For reference and factual content, aim for one clear, structured answer per question, with consistent terminology and machine-friendly formatting (headings, lists, tables).
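
Where your platform allows it, structured data markup is one optional way to make "one answer per question" explicit to machines. The sketch below is a minimal example, not a guarantee that any given LLM pipeline reads it: it emits a schema.org FAQPage block as JSON-LD, markup that also serves traditional search.

```python
import json

def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> str:
    """Render question/answer pairs as a schema.org FAQPage JSON-LD block."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    # Wrapped in the script tag a page template would embed in the page head.
    return ('<script type="application/ld+json">\n'
            + json.dumps(data, indent=2)
            + "\n</script>")

print(faq_jsonld([
    ("What makes factual content easy to cite?",
     "One clear, canonical answer per question, stated with consistent "
     "terminology and machine-friendly formatting."),
]))
```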


4. Support "Writing" use cases with reusable patterns and examples

Writing is the single largest work use (~40% of work-related messages). About two-thirds of Writing messages ask the model to modify user text (edit, critique, translate, summarise) rather than create it from scratch. So people mostly use ChatGPT to improve or adapt something they already have.

  • Provide templates, examples, and principles. Content that describes formats, tone, structure, or "how to write X" gives the model patterns to suggest. When the model helps with emails, reports, or summaries, it's drawing on such patterns—often from training or retrieval.
  • Explain conventions and criteria. "What makes a good X", "how to structure a Y", "common mistakes in Z" are decision-support content that feeds into both Asking and Doing (e.g. "improve this email").

Recommendation: If your audience is professionals or students, create writing guidance: templates, before/after examples, and clear criteria. That aligns with the dominant work use and increases the chance your conventions and examples shape (and, where possible, get cited in) writing-related answers.


5. Align topics with high-volume and high-satisfaction use

The data shows where usage and satisfaction are concentrated:

  • Practical Guidance and Seeking Information together account for the majority of non-writing usage; Asking is growing and rated higher quality.
  • Tutoring/teaching is a major slice of Practical Guidance (~10% of all messages).
  • Work use is heavily about obtaining/documenting/interpreting information and making decisions, giving advice, solving problems, and thinking creatively.

So content that serves learning, decision-making, and professional tasks (especially information and writing) is well aligned with how people use ChatGPT.

Recommendation: Prioritise topics that support education (explanations, tutorials, concept breakdowns), decision support (comparisons, criteria, frameworks), and professional writing/information (how-tos, reference, templates). These match the highest-volume and highest-satisfaction intents.


6. Optimise for retrieval as well as "reading"

When ChatGPT uses web search or retrieval-augmented generation (RAG), it's matching user queries to documents or passages. Content that is discoverable and parseable is more likely to be retrieved and then cited.

  • Semantic clarity. Use full sentences and clear paragraphs; avoid jargon without explanation. Models (and search) do better when the meaning of a passage is self-contained.
  • Stable, descriptive structure. Consistent heading hierarchy (H1 → H2 → H3) and predictable sections (e.g. "Overview", "Steps", "Examples") help retrieval systems and models identify "this block answers that question."
  • Technical basics. Fast loading, crawlable URLs, and valid markup improve the chance your content is indexed and available to retrieval. These are the same basics that help traditional search.

Recommendation: Treat "LLM citation" as an extension of search and structure: write for clarity and intent, use consistent structure, and keep technical SEO and crawlability in order so your content can be retrieved when users ask.
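
As a toy illustration of the matching step (real systems use embeddings or ranking functions such as BM25, not raw word overlap), the sketch below scores two passages against a query. The self-contained passage, which names its entities instead of leaning on pronouns, is the one a retriever can match.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "it", "of", "for", "in", "and", "to", "on"}

def tokens(text: str) -> set[str]:
    """Lowercased word set with common stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

# Two passages making the same point; only one is self-contained.
chunks = {
    "self-contained": (
        "The Boston Marathon qualifying time depends on age and gender; "
        "the Boston Athletic Association publishes the standards each year."
    ),
    "context-dependent": (
        "It depends on how old you are, and they publish the numbers "
        "every year on their site."
    ),
}

query = "What is the Boston Marathon qualifying time?"
q = tokens(query)

for name, passage in chunks.items():
    print(f"{name}: shared query terms = {len(q & tokens(passage))}")
# Prints 4 shared terms for the self-contained passage and 0 for the
# vague one: pronouns give a lexical retriever nothing to match.
```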


7. Be the canonical source for a narrow slice

The paper shows that Getting Information, Documenting/Recording Information, and Making Decisions and Solving Problems are among the top activities across many occupations. So there's broad demand for authoritative, well-structured information in many domains.

  • Own a clear niche. One definitive guide, one well-maintained reference, or one clearly explained framework is more useful than many shallow pages. Models and retrieval prefer a single strong match over many weak ones.
  • Update and maintain. Stale or contradictory content is harder to trust and cite. Regular updates and a clear "last updated" or version signal help both users and systems.

Recommendation: Aim to be the (or a) go-to source for a specific set of questions or tasks. Depth and accuracy in a narrow area beat breadth and vagueness for retrieval and citation.


Summary: a short checklist

  • Use case: Focus on Seeking Information (factual reference) and Practical Guidance (how-to, teaching, frameworks).
  • Intent: Write for Asking, with direct answers, question-shaped headings, and decision support (comparisons, criteria, pros/cons).
  • Factual content: One clear answer per question; lists, tables, definitions; consistent terminology.
  • Writing support: Provide templates, examples, and criteria for professional and educational writing.
  • Topics: Emphasise education, decision support, and professional information/writing.
  • Structure: Front-load answers; clear headings and hierarchy; semantic clarity; crawlable, indexable pages.
  • Authority: Be the canonical source for a defined slice; keep content updated and consistent.

The goal isn't to "game" algorithms—it's to produce content that matches how hundreds of millions of people already use ChatGPT: asking for facts, asking for advice, and asking for help with writing and decisions. Content that answers those intents clearly, in a form that retrieval systems and LLMs can find and reuse, is content that is more likely to be cited when it matters.

Based on: Chatterji, A., Cunningham, T., Deming, D. J., Hitzig, Z., Ong, C., Shan, C. Y., & Wadman, K. (2025). "How People Use ChatGPT." NBER Working Paper 34255.