How to Catch LLM Hallucinations Before They Ship to Production

Last week I saw the Reddit thread about arXiv reportedly cracking down on papers full of hallucinated references and fabricated results. I haven't dug into the official arXiv announcement yet, but the underlying problem is one I've been fighting in code reviews for two years now.

The thing is, this isn't just an academic problem. If you're shipping LLM output into docs, code, or pipelines, you've almost certainly merged a hallucination at some point. I have. Twice in the last month, actually. Let's talk about how to catch them before they reach main.

The Problem: Plausible Garbage

Hallucinations from modern LLMs aren't obviously wrong. That's what makes them dangerous. A model will confidently cite numpy.linalg.fast_inverse() (doesn't exist), reference RFC 8954 in a way that sounds reasonable (check it yourself), or generate a requirements.txt with a package version that was never published.

I hit this hard on a project last month. I asked an assistant to scaffold an OAuth flow. It produced a clean, well-commented module that called requests_oauthlib.OAuth2Session.fetch_token_async. There is no fetch_token_async. The code passed lint. It passed type checks (because we hadn't configured strict mode for that lib). It crashed in staging.

Root Cause: Why This Happens

LLMs are next-token predictors trained on patterns. When asked about a function, they generate something that looks like a function call in that library's idiom. They have no runtime notion of "this symbol exists." What they have is closer to "tokens that frequently follow requests_oauthlib.OAuth2Session.", and fetch_token_async is a plausible continuation even if no one ever wrote it.

The pattern shows up in three flavors:

Phantom APIs: methods, flags, or modules that look right but don't exist
Wrong-version APIs: methods that exist in v3 but not v2 (or vice versa)
Fabricated citations: URLs, RFC numbers, paper DOIs, GitHub repos that 404

The fix isn't "use a better model." The fix is to stop trusting unverified output.

Step 1: Run Imports and Symbols Through a Real Interpreter

The cheapest hallucination detector is the language itself. Before merging any LLM-generated Python, I shove the imports through a quick check:

python

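import importlib
import inspect
import sys

# Symbols claimed by the LLM-generated module
claims = [
    ('requests_oauthlib', 'OAuth2Session.fetch_token_async'),
    ('httpx', 'AsyncClient.stream'),
]

_MISSING = object()  # sentinel, so attributes whose value is None still count as present
failed = False

for module_name, dotted_path in claims:
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        # The module itself is a phantom (or simply not installed here)
        print(f'MISSING MODULE: {module_name}')
        failed = True
        continue
    for part in dotted_path.split('.'):
        # With a default, getattr returns the sentinel instead of raising AttributeError
        obj = getattr(obj, part, _MISSING)
        if obj is _MISSING:
            print(f'MISSING: {module_name}.{dotted_path}')
            failed = True
            break
    else:
        try:
            # Show the real signature when there is one; some builtins don't expose it
            detail = inspect.signature(obj) if callable(obj) else type(obj)
        except (TypeError, ValueError):
            detail = type(obj)
        print(f'OK: {module_name}.{dotted_path} -> {detail}')

# Non-zero exit so a CI step running this script fails on any phantom
sys.exit(1 if failed else 0)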

For JS/TS, the equivalent is tsc --noEmit with strict on, plus a quick node --check for plain JS files (that one only checks syntax). The type check catches most phantom APIs because an LLM-invented method has no declaration in the library's .d.ts file.
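
Writing the claims list by hand gets old fast. A minimal sketch of collecting it from the generated source with the standard-library ast module is below. The function name extract_claims is hypothetical, and it only understands `from x import y` names and attribute chains rooted at an imported module, so treat it as a starting point rather than a complete resolver.

```python
import ast

def extract_claims(source: str) -> list[tuple[str, str]]:
    """Return (module, dotted_path) pairs for names the source expects to exist."""
    tree = ast.parse(source)  # parses only; nothing in the source is executed
    aliases: dict[str, str] = {}          # local name -> real module name
    claims: set[tuple[str, str]] = set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for a in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c` to a.b
                if a.asname:
                    aliases[a.asname] = a.name
                else:
                    top = a.name.split('.')[0]
                    aliases[top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for a in node.names:
                if a.name != '*':
                    claims.add((node.module, a.name))

    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute):
            # Unwind chains like requests_oauthlib.OAuth2Session.fetch_token_async
            parts = []
            cur = node
            while isinstance(cur, ast.Attribute):
                parts.append(cur.attr)
                cur = cur.value
            if isinstance(cur, ast.Name) and cur.id in aliases:
                claims.add((aliases[cur.id], '.'.join(reversed(parts))))
    return sorted(claims)
```

The result has the same shape as the claims list in the checker above, so it can be passed straight to that loop. Attribute access on local variables (for example a session object created earlier) is not traced here; that is where a type checker does better.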

Step 2: Validate Every URL and Reference

I keep a tiny script that scans markdown for links and hits each one. It's saved me from at least a dozen embarrassing PRs.

python

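import asyncio
import re
import sys
from pathlib import Path

import httpx

LINK_RE = re.compile(r'\[(?P<text>[^\]]+)\]\((?P<url>https?://[^\)]+)\)')

async def check(url: str, client: httpx.AsyncClient) -> tuple[str, int]:
    try:
        # HEAD first; some sites reject it, so fall back to GET
        r = await client.head(url, follow_redirects=True, timeout=10)
        if r.status_code >= 400:
            r = await client.get(url, follow_redirects=True, timeout=10)
        return url, r.status_code
    except (httpx.HTTPError, httpx.InvalidURL):
        return url, -1  # treat network errors and malformed URLs as suspect

async def main(path: str) -> int:
    root = Path(path)
    # Accept a single file or a directory of markdown files
    files = sorted(root.rglob('*.md')) if root.is_dir() else [root]
    urls = sorted({m.group('url') for f in files
                   for m in LINK_RE.finditer(f.read_text(encoding='utf-8'))})
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*[check(u, client) for u in urls])
    # Any non-2xx final status counts as broken
    broken = [(url, status) for url, status in results if not 200 <= status < 300]
    for url, status in broken:
        print(f'BROKEN ({status}): {url}')
    return 1 if broken else 0

if __name__ == '__main__':
    # Takes a file or directory, e.g. `python tools/check_links.py docs/`
    target = sys.argv[1] if len(sys.argv) > 1 else 'docs/post.md'
    sys.exit(asyncio.run(main(target)))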

Run this on every doc PR and most fabricated citations die immediately. The model will sometimes hallucinate a real-looking GitHub URL pointing at a repo that doesn't exist — this catches it.
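
Markdown links are only one citation shape. For bare RFC numbers and DOIs, a rough sketch is to turn each mention into a resolver URL and feed it to the same check function. The regexes and the helper name reference_urls are assumptions for illustration, and the DOI pattern is deliberately loose; the rfc-editor.org and doi.org forms are the public resolvers for each.

```python
import re

RFC_RE = re.compile(r'\bRFC\s?(\d{1,5})\b')
# Loose DOI shape: "10.", a registrant code, a slash, then a suffix without spaces
DOI_RE = re.compile(r'\b(10\.\d{4,9}/[^\s"<>]+)')

def reference_urls(text: str) -> list[str]:
    """Turn RFC and DOI mentions into URLs that a link checker can request."""
    urls = set()
    for m in RFC_RE.finditer(text):
        urls.add(f'https://www.rfc-editor.org/rfc/rfc{int(m.group(1))}')
    for m in DOI_RE.finditer(text):
        doi = m.group(1).rstrip('.,;)')  # drop trailing sentence punctuation
        urls.add(f'https://doi.org/{doi}')
    return sorted(urls)
```

A resolving URL only shows that the RFC or DOI exists, not that it says what the model claims, so a human still has to read the reference. Some publishers also reject automated requests, so expect a few false alarms on DOIs.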

Step 3: Pin the Model to Reality with Retrieval

The deeper fix is to stop letting the model generate from memory alone. Retrieval-augmented generation works because you force the model to ground its answer in actual source material you fetched.

The simplest version: before asking the LLM to write code against a library, pull the real docs or source and put them in the prompt.

python

import httpx

def fetch_docs_for(package: str, version: str) -> str:
    # Pull the published metadata for this exact release from PyPI instead of trusting memory
    url = f'https://pypi.org/pypi/{package}/{version}/json'
    resp = httpx.get(url, timeout=10, follow_redirects=True)
    resp.raise_for_status()  # a 404 here means that version was never published
    # The long description is usually the README; it may cover only part of the API
    return resp.json()['info']['description']

context = fetch_docs_for('requests-oauthlib', '2.0.0')
prompt = f"Using ONLY the API documented below, write a token refresh helper.\n\n{context}\n\nTask: ..."

Is this airtight? No. Models still drift even with context. But in my experience the hallucination rate drops sharply when the actual function signatures sit in the prompt window.
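
PyPI's long description is usually a README, so the signatures themselves may not be in it. A possible complement is to build the context from the installed package with inspect, which puts the real signatures for the exact version in your environment into the prompt. This is a sketch under assumptions: the helper name api_surface is hypothetical, and string.Template is only a stand-in class so the example runs without third-party installs.

```python
import inspect
import string

def api_surface(cls: type) -> str:
    """List the public callables of a class with their real signatures."""
    lines = []
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith('_'):
            continue  # skip private and dunder members
        try:
            sig = str(inspect.signature(member))
        except (TypeError, ValueError):
            sig = '(...)'  # some C-implemented callables expose no signature
        lines.append(f'{cls.__name__}.{name}{sig}')
    return '\n'.join(lines)

# Stand-in class from the standard library; swap in the class you are prompting about
context = api_surface(string.Template)
prompt = f"Using ONLY the methods listed below, write a helper.\n\n{context}\n\nTask: ..."
```

The trade-off is that signatures alone carry no explanation of behavior, so this works best alongside the prose docs rather than instead of them.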

Step 4: Add a CI Gate

The verification only helps if it runs every time. Wire the import check and link check into CI so a forgetful afternoon doesn't slip a phantom past you:

yaml

name: verify-llm-output
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install -r requirements.txt
      - run: python tools/check_symbols.py
      - run: python tools/check_links.py docs/

I made this mistake early on: I ran the checks manually "when I remembered." I remembered roughly half the time. Automation removes the willpower tax.
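
One gap in the gate above is the wrong-version flavor: the symbol check only proves a method exists in whatever version CI happened to install. A small guard, sketched here with importlib.metadata from the standard library, is to confirm the installed versions match the ones the prompts named. The EXPECTED mapping and the function name version_mismatches are illustrative assumptions, not part of any existing tool.

```python
from importlib import metadata
from typing import Callable

# Versions the prompts were pinned to (illustrative values)
EXPECTED = {'httpx': '0.27.0', 'requests-oauthlib': '2.0.0'}

def version_mismatches(
    expected: dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> list[str]:
    """Return one message per package whose installed version differs from the pin."""
    problems = []
    for dist, wanted in expected.items():
        try:
            found = get_version(dist)
        except metadata.PackageNotFoundError:
            problems.append(f'{dist}: not installed (expected {wanted})')
            continue
        if found != wanted:
            problems.append(f'{dist}: installed {found}, prompt assumed {wanted}')
    return problems
```

In a CI script, print the returned messages and exit non-zero when the list is not empty, so the step fails the same way the other checks do.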

Prevention Tips That Actually Work

A few habits that have dramatically cut how many hallucinations reach my PRs:

Always request citations in the same response. "Generate the function AND quote the exact line from the docs that defines the method you used." If the model can't quote it, the method probably isn't real.
Pin the library version in the prompt. "Using httpx==0.27.0" gives the model fewer plausible-but-wrong APIs to draw from.
Be suspicious of anything that ends in _async, _v2, or _safe. These are common hallucination shapes — the model invents a parallel method because it pattern-matches API naming.
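Run generated code in a sandbox first. Even a quick python -c 'import generated_module' catches phantom imports and broken top-level code before review. Phantoms inside function bodies only surface when that code actually runs, so call the functions too.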
Treat LLM output like a junior PR. You wouldn't merge a junior dev's first patch without running it. Same standard.
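
Two of these habits are easy to mechanize. A rough sketch, with helper names that are my own invention: quote_is_verbatim checks that the line the model quoted really appears in the docs text you fetched (ignoring whitespace differences), and suspicious_names flags identifiers ending in the suffixes listed above so a reviewer looks at them first. A flagged name is not proof of a hallucination, since plenty of real APIs use these suffixes.

```python
import re

def _squash(s: str) -> str:
    """Collapse every run of whitespace into a single space."""
    return ' '.join(s.split())

def quote_is_verbatim(quote: str, docs: str) -> bool:
    """True if the quoted text occurs in docs once whitespace is normalized."""
    q = _squash(quote)
    return bool(q) and q in _squash(docs)

SUSPICIOUS = re.compile(r'\b\w+(?:_async|_v2|_safe)\b')

def suspicious_names(source: str) -> list[str]:
    """Return identifiers ending in _async, _v2, or _safe, in sorted order."""
    return sorted(set(SUSPICIOUS.findall(source)))
```

Pass quote_is_verbatim the same docs string you put in the prompt; a quote that fails the check is a signal to open the real documentation before merging.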

The Takeaway

arXiv's reported policy isn't surprising — it's the natural response to a flood of unverified text. The same pressure is going to hit codebases, technical docs, and internal wikis. The teams that handle it well aren't the ones that ban LLMs; they're the ones that build cheap, automatic verification into the normal review loop.

The checks above take maybe two hours to set up. They've caught more bugs in my last quarter than any linter I've ever configured.