How do LLMs decide what to recommend?
The most misleading thing about the current conversation around AI search is how mystical it sounds. People talk about "training data" and "model intuition" as if recommendation inside tools like ChatGPT is some unknowable black box. In practice, it is far more mechanical than that. If you understand how answers are assembled, you can influence whether your product appears in them. If you do not, you are left hoping the model somehow stumbles into you on its own.
For most commercially relevant questions, large language models do not rely on memory alone. When a user asks about tools, products, comparisons, or anything that smells like a decision, the system runs retrieval. It pulls in documents from the web, evaluates them for relevance and trust, and then synthesizes an answer that cites multiple sources. The model is not picking a winner. It is aggregating signals. That distinction changes everything.
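To make that concrete, here is a rough sketch of the aggregation step in Python. The retrieval call, the trust scores, the URLs, and the brand names are all made up; the point is the shape of the computation, not the specifics of any real engine.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    trust: float                 # stand-in for whatever source-quality signal the engine uses
    brands_mentioned: list[str]

def search_web(question: str) -> list[Source]:
    # Hypothetical retrieval step. A real engine queries a live index here.
    return [
        Source("reddit.com/r/saas/thread", 0.8, ["AcmeAPI", "FooTool"]),
        Source("reviews.example.com/acme", 0.6, ["AcmeAPI"]),
        Source("youtube.com/watch?v=demo", 0.7, ["FooTool", "AcmeAPI"]),
    ]

def aggregate_signals(sources: list[Source]) -> Counter:
    # No single winner is picked: every mention adds weight, scaled by source trust.
    scores: Counter = Counter()
    for src in sources:
        for brand in src.brands_mentioned:
            scores[brand] += src.trust
    return scores

if __name__ == "__main__":
    print(aggregate_signals(search_web("best API for payments?")).most_common())
    # AcmeAPI shows up in all three sources and accumulates the most weight,
    # even though no single page "ranks first" for it.
```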
Traditional SEO trained people to think in terms of ranking. One page, one keyword, one position. Answer engines do not work that way. They reward coverage and repetition across independent sources. A company that appears consistently across forums, reviews, videos, and articles will usually outrank a technically "better" page that exists in isolation. Frequency matters more than precision. Breadth matters more than polish.
This is why people looking for some secret AEO trick tend to get frustrated. There is no markup you add that suddenly unlocks ChatGPT distribution. The system is closer to a research assistant than a search engine. It asks, implicitly, "What does the internet seem to agree on?" Your job is to make sure your name shows up in that answer often enough that it becomes hard to ignore.
What complicates this is that answer engines do not behave deterministically. Ask the same question twice and you may get slightly different citations. Rephrase the prompt and the composition shifts again. This randomness leads people to chase ghosts. In reality, the variance smooths out when you look at enough prompts. Patterns emerge. Brands with strong citation density keep showing up. Everyone else flickers in and out.
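You can verify that smoothing yourself with nothing fancier than a loop. The sketch below assumes a hypothetical ask_engine function that sends a prompt to whatever answer engine you care about and returns the brands cited in that one answer; everything else is just counting.

```python
from collections import Counter

def ask_engine(prompt: str) -> list[str]:
    # Hypothetical: call your answer engine of choice and return the brands it cited.
    raise NotImplementedError

def citation_frequency(prompt: str, runs: int = 20) -> dict[str, float]:
    # Ask the same question many times; run-to-run flicker averages into a stable rate.
    counts: Counter = Counter()
    for _ in range(runs):
        for brand in set(ask_engine(prompt)):
            counts[brand] += 1
    return {brand: n / runs for brand, n in counts.most_common()}
```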
Once you accept that premise, the work becomes much more concrete.
The first half of that work still looks like publishing, but the intent is different. Pages that perform well inside answer engines tend to be exhaustive rather than optimized. They do not exist to rank for a phrase. They exist to answer a cluster of questions thoroughly enough that the model does not need to go elsewhere mid-answer. That means writing content that anticipates follow-ups. Not in a checklist way, but in a "what would someone reasonably ask next?" way.
A page about an API is not complete because it explains what the API does. It is complete when it also explains who should not use it, what breaks at scale, how pricing behaves under load, what integrations matter in practice, and what alternatives are chosen when certain constraints exist. Models reward sources that reduce uncertainty. Thin pages force the model to stitch together ten citations. Thick pages become anchors.
What surprises most teams is how quickly this can work. You do not need years of authority for a page to be retrieved. Retrieval systems are biased toward relevance and clarity, not domain age. A newly published article that cleanly answers a specific, awkward, long-tail question can start appearing in answers within days. This is one of the few areas of modern distribution where being early and specific beats being large.
The second half of the work happens off your site, and this is where most AEO efforts quietly succeed or fail.
Answer engines lean heavily on places where people speak freely and contradict each other. Reddit, YouTube comments, niche forums, review sites. These sources are messy, but that messiness is exactly why they are trusted. A model that sees the same product mentioned by a founder on Reddit, a customer in a forum thread, and a reviewer on YouTube gains confidence that the product actually exists in the world.
Trying to game this layer usually backfires. Astroturfing reads like astroturfing, even to machines. What works is boring and time-consuming. Show up where your users already ask questions. Be explicit about who you are. Answer honestly, including when your product is not the right fit. Counterintuitively, those admissions strengthen retrieval because they align with how humans talk when they are not selling.
Video deserves special mention here. Long-tail B2B video is still underproduced relative to demand, and answer engines cite it aggressively. A ten-minute walkthrough that explains a narrow use case will often outperform a polished blog post competing with a hundred near-clones. The model does not care about production value. It cares about whether the content resolves the question it is trying to answer.
Measurement is where many teams lose the plot. There is no stable notion of rank inside an answer engine. Looking for one leads to false conclusions. The metric that matters is share of voice across a defined set of prompts. How often you appear. Where you tend to appear within the answer. Whether you show up consistently across platforms or only in one environment.
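If you log results across a fixed prompt set, that metric is a few lines of arithmetic. A sketch, assuming each observation records the prompt, the platform, and the brands cited in the order they appeared; how you collect those observations is up to you.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    prompt: str
    platform: str            # e.g. "chatgpt", "perplexity"
    cited_brands: list[str]  # in the order they appear in the answer

def share_of_voice(observations: list[Observation], brand: str) -> dict:
    # Frequency, typical position within the answer, and spread across platforms.
    hits = [o for o in observations if brand in o.cited_brands]
    if not observations:
        return {"appearance_rate": 0.0, "avg_position": None, "platforms": []}
    return {
        "appearance_rate": len(hits) / len(observations),
        "avg_position": (
            sum(o.cited_brands.index(brand) + 1 for o in hits) / len(hits)
            if hits else None
        ),
        "platforms": sorted({o.platform for o in hits}),
    }
```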
The only way to know whether something works is to run controlled experiments. Pick a set of questions. Leave half untouched. Actively intervene on the other half by publishing, participating, or securing mentions. Track both over time. Most advice circulating right now has never been tested this way. It sounds plausible and feels modern, but collapses under comparison.
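Structurally, the experiment is just a random split plus a comparison. Here is a sketch of that skeleton, with the interventions and the data collection left to you:

```python
import random

def split_prompts(prompts: list[str], seed: int = 0) -> tuple[list[str], list[str]]:
    # Half the prompts become the control group (left alone), half the treatment
    # group (you publish, participate, and secure mentions around those topics).
    rng = random.Random(seed)
    shuffled = prompts[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def relative_lift(treatment_rate: float, control_rate: float) -> float:
    # The control group absorbs whatever the engine does on its own over the same
    # period, so the remaining difference is what your interventions earned.
    if control_rate == 0:
        return float("inf") if treatment_rate > 0 else 0.0
    return (treatment_rate - control_rate) / control_rate
```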
One of the more uncomfortable implications of all this is that attribution gets fuzzy. Users may never click. They may see your name, trust it, and search you directly later. Or they may sign up days afterward with no obvious referral path. If you are expecting clean dashboards, you will be disappointed. If you are willing to run surveys and look at directional lift, the signal is there.
The payoff, when it comes, is not incremental. Traffic from answer engines tends to convert unusually well because the persuasion already happened upstream. The model did the comparison. The user arrives to confirm, not to explore. That is a very different starting point than a cold search click.
None of this replaces SEO. It sits alongside it. The same fundamentals still apply. Clear writing. Real specificity. A willingness to commit to claims rather than hedging everything into mush. What has changed is how those signals are aggregated and presented.
Answer engines reward people who say useful things in public, repeatedly, across places that real humans already trust. That is not a growth hack. It is a distribution shift. And it favors teams who understand that being cited is now just as important as being ranked.
If you internalize that, the rest of AEO stops feeling mysterious and starts feeling like work you can actually do.