Authon Blog

#performance

27 articles tagged with “performance”

Why self-hosted ebook servers choke at 150k books (and how to fix it)
debugging

Self-hosted ebook servers often break past 50k books. Here's why the database is usually the bottleneck and how to fix indexing, search, and metadata at scale.

selfhosted database performance
Why your ML inference is memory-bound (and how to actually fix it)
debugging

Your ML inference isn't slow because of compute — it's memory-bound. Here's how to diagnose it with profilers and fix it with kernel fusion and quantization.

machinelearning performance python
Why Your PyTorch Training Crawls on a Beefy GPU (And How to Fix It)
debugging

Your GPU sits at 15% utilization and bigger batches don't help? Here's how to diagnose whether you're compute, memory, or overhead bound — and fix it.

pytorch performance machinelearning
How to cut Node.js memory usage by 40% in self-hosted apps
debugging

A walkthrough of debugging high memory usage in a Node.js service running on a small VPS, with three concrete fixes that added up to a 40% RSS reduction.

nodejs performance selfhosted
How to fix OOM crashes when running large open-source LLMs locally
debugging

Why local LLM inference hits OOM errors even when the model 'fits' in VRAM — and how to fix it with quantization, KV cache tuning, and allocator config.

llm python machinelearning
How to Fix Slow Page Loads Caused by Third-Party Scripts
debugging

When third-party scripts wreck your Core Web Vitals, here's how to find the worst offenders and fix the slowdown without rewriting your app.

webperf javascript frontend
Why 'x time ago' is broken everywhere and how to actually fix it
debugging

Relative timestamps like '2 hours ago' have been quietly breaking across the web. Here's the root cause and a step-by-step fix using Intl.RelativeTimeFormat.

webdev javascript frontend
Why your 27B model won't fit on 24GB VRAM (and how to actually fix it)
debugging

Why 4-bit 27B models still OOM on 24GB cards, and the quant + KV cache + backend settings that actually let them fit.

llm machinelearning performance
Why MTP doesn't speed up your llama.cpp inference (and how to actually fix it)
debugging

Why MTP often fails to speed up llama.cpp inference, and how to debug acceptance rate, VRAM pressure, and CUDA graph capture issues.

llm performance machinelearning
Why your IDE crawls on huge codebases (and how to actually fix it)
debugging

Why language servers and file watchers fall over on large codebases, and the concrete tuning steps that bring your IDE back to life.

productivity devops tooling
How to fix slow JavaScript builds before reaching for a Rust rewrite
debugging

A practical guide to debugging slow JavaScript builds before rewriting your toolchain. Profile first, find the real bottleneck, then fix it.

javascript performance webdev
Why your React Three Fiber gallery drops to 5 FPS and how to fix it
debugging

A practical guide to debugging FPS drops and memory leaks in React Three Fiber galleries — covering draw calls, instancing, and proper disposal.

react threejs webgl
How to fix native module errors when switching JavaScript runtimes
debugging

Native modules silently break when you switch JavaScript runtimes. Here's how to diagnose ABI mismatches and rebuild safely without losing a weekend.

javascript nodejs debugging
Why your Node.js memory keeps climbing in production (and how to find the leak)
debugging

A practical guide to diagnosing and fixing memory leaks in Node.js production services using heap snapshots, with concrete code examples.

node debugging performance
How I cut a 282-hour exact solver down to 22 minutes
debugging

Walking through the formulation, branching, and symmetry fixes that took a minimum line cover ILP from 282 hours down to 22 minutes.

algorithms optimization python
Why Google reCAPTCHA is breaking your site (and how to actually replace it)
debugging

Google reCAPTCHA can silently break your signup flow. Here's how to diagnose the failure and replace it with a proof-of-work challenge you control.

webdev security javascript
TokenSpeed and the Quiet Race to Make LLM Inference Boring
tutorial

A grounded look at TokenSpeed, the new LLM inference engine trending on GitHub, plus a practical benchmark you can actually run yourself.

llm machinelearning performance
Why cross-platform desktop apps balloon to 200MB and how to slim them down
debugging

Bundled-runtime desktop apps pay for a full browser per install. Here's why that happens and how to replace it with the OS's native webview.

webdev desktop performance
Why local LLM inference stalls on Apple Silicon (and how to fix it)
debugging

Local LLM inference on Apple Silicon often runs at a fraction of what the hardware can do. Here's why — and how to fix it with kernel fusion, KV cache layout, and the right quantization.

machinelearning performance metal
Why Your Site Is Slow on Shared Hosting and How to Fix It with a VPS Migration
debugging

How to migrate from shared hosting to a VPS — a step-by-step guide covering server setup, data migration, Nginx config, and the performance gains you can expect.

webdev devops linux
We Moved Our API from Node to Bun. Here's What Broke (and What Got 3x Faster).
tutorial

We moved our production API from Node.js to Bun. Some things broke, some got 3x faster. Here's the honest breakdown.

bun nodejs javascript
Your Node.js App Uses 1,000,000x More RAM Than Voyager 1. Fix It.
debugging

Debug and fix Node.js memory leaks with heap snapshots, bounded caches, and proper listener cleanup — inspired by Voyager 1's 69 KB constraint.

node javascript performance
Why Your Measurement Tools Might Be Corrupting Your Data
debugging

How measurement tools can contaminate the data they collect — lessons from microplastics research applied to software observability and benchmarking.

datascience python performance
Why Your Video Player Is Bloating Your Bundle (and How to Fix It)
debugging

Video.js v10 beta dropped an 88% size reduction. Here's why the old version was bloated and how to migrate to the leaner rewrite.

javascript webdev videojs
Why Windows Games Stutter on Linux and How Wine 11 Finally Fixes It
debugging

Wine 11 rewrites syscall dispatching and WoW64 handling to eliminate PE/ELF boundary overhead. Here's why Windows games stuttered on Linux and how the fix works.

linux wine gaming
Cursor Just Made ripgrep Look Slow. Here's How.
tutorial

I've been using `ripgrep` for years. It's the kind of tool that makes you feel smug about your workflow -- blazing fast, zero complaints. Then Cursor'

cursor regex performance
Why Your Object Storage Is Slow (And How Parallelism Over HDDs Fixes It)
debugging

How large-scale object stores serve petabytes per second from slow HDDs using erasure coding, massive parallelism, and smart data placement.

distributed-systems storage performance