Authon Blog

Thoughts on authentication, developer tools, and building secure applications.

debugging · May 27, 2026

Why your quantized LLM loses its MTP heads and how to keep them

Quantizing a model with multi-token prediction heads? Here's why standard conversion pipelines drop them silently, and how to preserve and calibrate them.

machinelearning llm python quantization

debugging · May 27, 2026

How to build reliable geo-restrictions that actually hold up in production

Geo-restrictions look simple until you ship them. Here's how to build jurisdiction-based access controls that survive VPNs, mobile carriers, and CDN caching.

webdev security backend

comparison · May 26, 2026

Privacy-Focused Analytics Compared: Umami vs Plausible vs Fathom

An honest comparison of Plausible, Fathom, and self-hosted Umami after migrating four production projects off Google Analytics 4.

webdev privacy analytics

debugging · May 26, 2026

Why your VPS might be part of a botnet — and how to find out

How to detect when your servers have been compromised into attack infrastructure, with a step-by-step debugging walkthrough using ss, auditd, and nftables.

security devops linux

debugging · May 26, 2026

How to Fix Tool-Use Loops in Autonomous Coding Agents

Autonomous coding agents love getting stuck in tool-use loops. Here's why it happens and four concrete fixes that stop the bleeding.

ai agents python

debugging · May 26, 2026

How to Work Around MySQL's View Subquery Limitation (Bug #11472)

MySQL's 20-year-old view subquery restriction (Bug #11472) finally has a reported fix. Here's how to refactor views with CTEs and nested views today.

mysql database sql

debugging · May 26, 2026

Why your browser multitrack audio drifts out of sync (and how to fix it)

Multitrack audio playback in the browser drifts because <audio> elements don't share a clock. Here's how to use the Web Audio API to fix it.

webaudio javascript webdev

debugging · May 25, 2026

Why LLM Coding Agents Drift on Long Back-End Tasks (and How to Fix It)

LLM coding agents quietly drop constraints as tasks get longer. Here's why it happens and a concrete pattern for keeping back-end code generation honest.

ai llm backend

debugging · May 25, 2026

How to Fix Context Loss in Multi-Step AI Agent Workflows

Why AI agents lose context across multi-step tool calls and a concrete scratchpad pattern to fix it, with code examples.

ai agents python

debugging · May 25, 2026

How to do partial page updates without shipping a framework

Native partial DOM updates are surprisingly hard. Here's why libraries like HTMX exist, what Chrome is reportedly exploring, and how to handle it cleanly today.

webdev html javascript

comparison · May 25, 2026

Migrating off Google Analytics: Umami vs Plausible vs Fathom

A practical comparison of Umami, Plausible, and Fathom for teams migrating off Google Analytics, with code examples and self-hosting notes.

webdev privacy analytics

debugging · May 25, 2026

Why self-hosted ebook servers choke at 150k books (and how to fix it)

Self-hosted ebook servers often break past 50k books. Here's why the database is usually the bottleneck and how to fix indexing, search, and metadata at scale.

selfhosted database performance

debugging · May 25, 2026

Why your ML inference is memory-bound (and how to actually fix it)

Your ML inference isn't slow because of compute — it's memory-bound. Here's how to diagnose it with profilers and fix it with kernel fusion and quantization.

machinelearning performance python