
Section 5 of 5 · 15 min read

Verification

All the prompting skill in the world doesn't make AI reliable on facts. Verification is not an optional step — it's the discipline that makes everything else safe to use in serious work.


Why verification is non-negotiable

LLMs generate plausible text. That's what they're built to do. They are not built to verify claims against reality — they are built to produce coherent, contextually appropriate completions. The result: they will state incorrect figures with confidence, cite papers that don't exist, fabricate quotes from real people, and get policy details wrong while sounding authoritative.

The problem is not that AI gets things wrong — humans do too. The problem is that AI errors are indistinguishable from correct outputs. There's no hesitation, no flagging of uncertainty, no visible strain. A hallucinated statistic and a correct one look identical. In climate communication, policy analysis, grant writing, or any context where a false claim does real damage, this is an operational risk that cannot be managed by trusting the model.

Better prompting reduces hallucination risk on well-defined tasks. Asking the model to flag uncertainty helps. Using reasoning modes helps. None of it eliminates the problem. Verification is the only reliable backstop — and it is always the human's responsibility.

AI can analyze documents you provide. It cannot reliably retrieve documents for analysis. Use it as the analyst; be the librarian yourself.

How hallucinations happen

A hallucination is not a glitch or a lie. It's the model doing exactly what it's designed to do — predicting the most plausible next token — in a situation where the plausible answer happens to be wrong. Several specific patterns are worth knowing.

Knowledge cutoff gaps

Models are trained on data up to a cutoff date. Anything after that — recent policy changes, updated emissions figures, new research findings — either doesn't exist in the model's knowledge or gets interpolated from outdated patterns. Climate data moves fast. Always verify statistics against primary sources.

Citation fabrication

When asked to cite sources, models will sometimes produce plausible-sounding but non-existent citations: real journal names, plausible author names, years that fit, titles that sound right. The paper doesn't exist. Check every citation before using it in any document that others will rely on.

Number interpolation

Statistical claims are high-risk. When a model doesn't have a specific figure, it often produces one that fits the context and scale of surrounding figures. The number will usually be in the right ballpark. It will sometimes be wrong in ways that matter. Treat every specific number as unverified until you've checked the primary source.

Confident extrapolation

Models are trained to sound confident. Uncertainty rarely surfaces unless you explicitly prompt for it. An AI discussing the projected cost of a 2°C overshoot scenario for sub-Saharan agriculture may be synthesizing real research well, or it may be plausibly extrapolating from adjacent data. You can't tell from the output's tone alone.

A practical verification workflow

Verification doesn't mean fact-checking every sentence. It means applying scrutiny proportional to the stakes and applying it systematically to the categories most likely to fail. Here is a practical approach for climate professionals:

1. Flag claims by type

Mentally sort AI output into: statistics/numbers (highest risk), citations (high risk), policy details (medium risk), general explanations (lower risk, but not zero). Apply verification effort proportionally.

2. Ask the AI to flag its own uncertainty

Add to your prompt: "Where you are uncertain or where information may have changed, flag it explicitly." This doesn't catch everything, but it surfaces the cases the model itself has low confidence on.

3. Verify numbers at primary source

For any specific statistic you plan to use, verify it at the original source — IPCC report, IEA database, peer-reviewed paper, official government statistics. Don't accept AI's source citation at face value; look it up directly.

4. Check citations exist

Before including any AI-generated citation in any document, search for it. If the paper doesn't exist, it doesn't exist. If the paper exists but says something different from what the AI attributed to it, that's equally problematic.

5. Use web search mode for current events

For anything time-sensitive, use a model with web search enabled rather than relying on training data. But note: web search reduces hallucination risk on recent facts; it doesn't eliminate it.
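The triage in step 1 can be sketched in code: sort the claims you've extracted from AI output into risk tiers, then verify from the top down. The tier ordering follows the workflow above; the regex patterns for spotting numbers and citation-like strings are illustrative assumptions, not a reliable detector — a human still reads every claim.

```python
import re

# Risk tiers from step 1, highest risk first.
RISK_ORDER = ["statistic", "citation", "policy_detail", "general"]

def classify_claim(sentence: str) -> str:
    """Rough heuristic triage of one sentence of AI output.

    These patterns are illustrative assumptions: they catch obvious
    cases only and are no substitute for reading the claim yourself.
    """
    if re.search(r"\(\d{4}\)|et al\.|doi:", sentence, re.IGNORECASE):
        return "citation"
    if re.search(r"\d+(\.\d+)?\s*(%|°C|GtCO2|billion|million)", sentence):
        return "statistic"
    if re.search(r"\b(act|directive|regulation|policy|treaty)\b", sentence, re.IGNORECASE):
        return "policy_detail"
    return "general"

def triage(sentences: list[str]) -> list[tuple[str, str]]:
    """Return (risk_type, sentence) pairs, highest-risk first."""
    tagged = [(classify_claim(s), s) for s in sentences]
    return sorted(tagged, key=lambda pair: RISK_ORDER.index(pair[0]))
```

Running `triage` over a draft puts the statistics and citations at the top of your checking list, which is exactly where steps 3 and 4 say your verification effort should go first.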

The Prompt Clinic: diagnosing what went wrong

Verification extends beyond fact-checking. It includes recognizing when a prompt is structurally broken — when the output is useless not because the AI hallucinated facts, but because the prompt set it up to fail. A prompt can fail in several distinct ways: it's too vague to produce a specific result, it lacks the context the AI needs, it has no role to anchor the response, it has no constraints to prevent sprawl, or it actively invites hallucination by asking the model to retrieve information it shouldn't be trusted to retrieve.
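Some of those failure modes can be caught mechanically. The sketch below encodes a few of them as crude string heuristics — the keyword lists and the word-count threshold are illustrative assumptions, and "lacks context" is deliberately omitted because it can't be detected from the prompt text alone. This is a practice aid, not a linter you should trust.

```python
def diagnose_prompt(prompt: str) -> list[str]:
    """Flag likely structural failure modes in a prompt.

    All thresholds and keyword lists are illustrative assumptions;
    they catch obvious cases only.
    """
    issues = []
    lowered = prompt.lower()
    # Too vague: very short, open-ended asks with no specifics.
    if len(prompt.split()) < 15:
        issues.append("too_vague")
    # No role: nothing anchoring who the model should answer as.
    if not any(k in lowered for k in ("you are", "act as", "as a")):
        issues.append("no_role")
    # No constraints: no format, length, or audience guidance.
    if not any(k in lowered for k in ("format", "words", "bullet", "audience")):
        issues.append("no_constraints")
    # Invites hallucination: asks the model to retrieve sources itself.
    if any(k in lowered for k in ("cite sources", "find papers", "list references")):
        issues.append("invites_hallucination")
    return issues
```

Run it on a broken prompt and on your rewrite: if the rewrite still trips `no_role` or `no_constraints`, you've probably fixed the wrong thing.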

Prompt diagnosis is a learnable skill. Once you can look at a bad prompt and name what's wrong, you can fix it efficiently — rather than iterating blindly. The exercise below gives you practice on real broken climate prompts. Each one has a specific failure mode; your job is to name it and fix it.

Interactive exercise

The Prompt Clinic

Three broken climate AI prompts. For each: identify what's wrong, rewrite it, and get AI feedback on your fix.

Prompt 1 of 3

Situation

A policy analyst preparing a briefing for legislators on voluntary carbon market reform.

Broken prompt

Tell me about carbon offsets and if they're good or bad for the climate.

What's the primary problem?