Authon Blog

#gpu

6 articles tagged with “gpu”

Why your ML inference is memory-bound (and how to actually fix it)
debugging

Your ML inference isn't slow because of compute — it's memory-bound. Here's how to diagnose it with profilers and fix it with kernel fusion and quantization.

#machinelearning #performance #python
Why Your PyTorch Training Crawls on a Beefy GPU (And How to Fix It)
debugging

Your GPU sits at 15% utilization and bigger batches don't help? Here's how to diagnose whether you're compute-, memory-, or overhead-bound, and how to fix it.

#pytorch #performance #machinelearning
Why MTP doesn't speed up your llama.cpp inference (and how to actually fix it)
debugging

Why MTP often fails to speed up llama.cpp inference, and how to debug acceptance rate, VRAM pressure, and CUDA graph capture issues.

#llm #performance #machinelearning
Why CUDA kernels silently corrupt memory and how to catch the bug
debugging

A practical guide to debugging silent memory corruption in CUDA kernels, with compute-sanitizer workflows and a look at Rust-on-GPU tooling.

#cuda #rust #debugging
How to Train a 100B+ Parameter Model When You Can't Afford a GPU Cluster
debugging

Learn how CPU offloading, activation checkpointing, and smart memory management enable training 100B+ parameter LLMs on a single GPU.

#machinelearning #deeplearning #python
Hackers Can Now Root Your Machine Through Your GPU. No, Really.
tutorial

Two independent research teams disclosed GDDRHammer and GeForge attacks that exploit Rowhammer-style bit flips in GDDR6 GPU memory to break page table isolation and gain full root access to the host machine.

#security #gpu #hardware