Why your IDE crawls on huge codebases (and how to actually fix it)

Why language servers and file watchers fall over on large codebases, and the concrete tuning steps that bring your IDE back to life.

Alan West
Authon Team

The problem: you open the project and your laptop starts levitating

We've all been there. You git clone some hefty repo, open it in your editor, and within ninety seconds your fans are spinning like a small turbine. The language server is pegged at 100% CPU, file search takes ten seconds, and autocomplete shows up about four words after you've already moved on.

I hit this last month on a codebase north of 2 million lines. The first time I opened it cold, my editor crashed twice before it finished indexing. Reading the old post on Google's IDE history reminded me that this is a very old problem — and most of us are still solving it the wrong way.

Let me walk through what's actually happening, why the defaults betray you, and the handful of fixes that genuinely move the needle.

Root cause: three things are fighting for your CPU

When you open a project, three subsystems boot up roughly in parallel:

  • The file watcher — registers a kernel-level watch on every directory so it can react to changes
  • The indexer — walks the tree, parses files, builds a symbol table for go-to-definition and search
  • The language server — spins up per-language (TypeScript, gopls, rust-analyzer, etc.) and does its own analysis on top

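Each of these scales poorly with repo size, and they don't coordinate well. The file watcher alone can hit OS limits: on Linux, the default fs.inotify.max_user_watches has historically been 8192 (some distributions ship 65536, and as far as I know kernels from 5.11 on scale the default with available memory), and large monorepos blow through the smaller values easily. When a watch can't be registered, the kernel returns ENOSPC, but plenty of tools only log that error, so what you actually see is a quiet degradation: stale files, missing change events, weird ghost diagnostics.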

Here's the quickest way to check if you've already hit the inotify ceiling:

# See current limit
cat /proc/sys/fs/inotify/max_user_watches

# Count watches currently in use (each watch is one "inotify" line in fdinfo;
# run with sudo to include processes owned by other users)
grep -rhs '^inotify' /proc/[0-9]*/fdinfo | wc -l

If you're at or near the limit, your editor is probably already failing to register new watches and missing change events. That's bug #1.
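
A total doesn't tell you who is holding the watches, so a per-process breakdown can help. This is a rough sketch that assumes a Linux /proc filesystem where every watch shows up as a line starting with "inotify" in /proc/<pid>/fdinfo/<fd>. The function name is made up for illustration.

```shell
# List processes by number of inotify watches held, largest first.
# Output columns: watch count, PID, process name (tab-separated).
# The body runs in a subshell so its variables don't leak into your session.
inotify_watches_by_process() (
  for dir in /proc/[0-9]*/fdinfo; do
    pid=${dir#/proc/}
    pid=${pid%/fdinfo}
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
    count=$(cat "$dir"/* 2>/dev/null | grep -c '^inotify' || true)
    [ "$count" -gt 0 ] || continue
    name=$(cat "/proc/$pid/comm" 2>/dev/null || echo '?')
    printf '%s\t%s\t%s\n' "$count" "$pid" "$name"
  done | sort -rn
)

# Usage: inotify_watches_by_process | head
```

Without root you only see your own processes, which is usually enough to spot an editor or dev server at the top of the list.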

Step 1: stop watching things you don't care about

The single biggest win is telling the indexer and watcher to ignore directories that contain a lot of files but almost no useful symbols. Build artifacts, node_modules, generated protos, vendored dependencies — these inflate the watch count and slow every search.

Most editors honor a .gitignore-style file or have their own exclusion config. The pattern matters more than the syntax. For a typical polyglot project, I exclude roughly this set:

# Build outputs
/dist/
/build/
/target/
/.next/

# Dependency trees
/node_modules/
/vendor/
/.venv/

# Generated code (still indexable on demand)
**/*.pb.go
**/*_pb2.py
/generated/

# VCS and caches
/.git/
/.cache/

One nuance: excluding generated code from the indexer is fine as long as your language server still resolves symbols against it on demand. For Go, gopls will follow imports into excluded directories. For TypeScript, you usually need to keep node_modules reachable to the LSP even if you hide it from search. Test this — don't assume.
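
If you aren't sure which directories deserve an exclusion, counting files per top-level directory is a quick way to find the heavy ones. The helper below is a sketch with a made-up name. It only looks one level deep and includes hidden directories such as .git.

```shell
# Print each immediate subdirectory of a root with its file count, largest first.
# Output columns: file count, directory path (tab-separated).
files_per_dir() (
  root=${1:-.}
  for dir in "$root"/*/ "$root"/.[!.]*/; do
    [ -d "$dir" ] || continue
    # The arithmetic expansion strips the padding some wc implementations add
    count=$(( $(find "$dir" -type f | wc -l) ))
    printf '%s\t%s\n' "$count" "${dir%/}"
  done | sort -rn
)

# Usage: files_per_dir . | head
```

Anything near the top that you never navigate into by hand is a good candidate for the exclusion list.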

Step 2: raise the file watcher limit (Linux)

This is a one-liner, but I see people skip it for years and just live with broken hot-reload:

# Temporary (until reboot)
sudo sysctl fs.inotify.max_user_watches=524288

# Permanent
echo 'fs.inotify.max_user_watches=524288' \
| sudo tee -a /etc/sysctl.d/99-inotify.conf
sudo sysctl --system

524288 is overkill for most people, but watches are cheap (about 1 KB of kernel memory each, and only for watches that are actually in use), and the headroom prevents the silent-failure mode. On macOS the closest knobs are kern.maxfiles and kern.maxfilesperproc, and the failure mode there usually shows up as "too many open files" errors instead of silent drops.
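
To put a number on "cheap", here is the worst-case arithmetic as a tiny helper. It assumes the rough 1 KiB-per-watch figure from above, and the function name is hypothetical. The result is an upper bound that only applies if every allowed watch is in use.

```shell
# Worst-case kernel memory, in MiB, if every allowed watch were registered.
# Second argument is the assumed cost per watch in bytes (default 1024).
watch_memory_mib() {
  limit=$1
  bytes_per_watch=${2:-1024}
  echo $(( limit * bytes_per_watch / 1024 / 1024 ))
}

watch_memory_mib 524288   # prints 512
```

So even the generous limit caps out at roughly half a gigabyte, and a typical session uses a small fraction of that.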

Step 3: tune the language server itself

This is where it gets language-specific, and where the biggest hidden wins live. A few that I've personally validated:

  • gopls — set GOFLAGS=-mod=readonly to skip module graph mutation, and check GOMEMLIMIT if you're on Go 1.19+ to keep the server from thrashing GC
  • rust-analyzer — disable cargo check on save for large workspaces and rely on clippy runs in CI instead; the savings are dramatic
  • TypeScript — split a monorepo into TypeScript project references so the server doesn't reparse the world on every edit
  • Pyright/pylance — use pyrightconfig.json with explicit include and exclude rather than letting it crawl from the root

For TypeScript specifically, project references are the closest thing to a free lunch I've found. A minimal tsconfig.json at the root looks like this:

{
  "files": [],
  "references": [
    { "path": "./packages/core" },
    { "path": "./packages/api" },
    { "path": "./packages/web" }
  ]
}

Each referenced package gets its own tsconfig.json with "composite": true. The compiler caches per-project output, so editing a file in web doesn't force a full reanalysis of core. After migrating a medium-sized monorepo to this setup, our cold-start time dropped from around 45 seconds to under 10.
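
For the per-package side, here is a minimal sketch of what each referenced package's config might contain, written as a small shell helper so it matches the other snippets in this post. The helper name, the outDir, and the src layout are my assumptions. The only setting the setup above depends on is composite.

```shell
# Write a minimal tsconfig.json that opts a package into project references.
# "outDir", "rootDir" and "include" are illustrative; adjust to your layout.
write_package_tsconfig() {
  pkg_dir=$1
  mkdir -p "$pkg_dir"
  cat > "$pkg_dir/tsconfig.json" <<'EOF'
{
  "compilerOptions": {
    "composite": true,
    "declaration": true,
    "outDir": "./dist",
    "rootDir": "./src"
  },
  "include": ["src"]
}
EOF
}

# Usage: write_package_tsconfig packages/core
```

A package that imports another also needs its own references entry pointing at it (for example, web listing ../core), and running tsc --build at the root then builds the projects in dependency order.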

Step 4: when all else fails, narrow the workspace

If you're working in one corner of a massive repo, just... don't open the whole thing. Most editors support multi-root workspaces or git sparse-checkout. Sparse checkout in particular is underused:

# Initialize sparse mode in cone mode (faster pattern matching)
git sparse-checkout init --cone

# Pick the directories you actually need
git sparse-checkout set services/auth shared/proto

Now your working tree only contains those paths. The indexer has less to chew on, the watcher registers fewer paths, and your editor stops trying to be helpful about a million files you'll never touch.
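
It's worth confirming how much the sparse checkout actually trimmed. The sketch below relies on git marking sparse-excluded files with the skip-worktree bit, which git ls-files -v shows as a leading "S". The function name is hypothetical, and it assumes you haven't enabled the sparse index.

```shell
# Report how many tracked files are materialized in the working tree.
# Must be run from inside a git repository.
sparse_counts() {
  total=$(( $(git ls-files | wc -l) ))
  # Entries with the skip-worktree bit are listed with an "S" tag by -v
  skipped=$(( $(git ls-files -v | grep -c '^S' || true) ))
  echo "$(( total - skipped )) of $total tracked files are checked out"
}

# Related commands once you're in sparse mode:
#   git sparse-checkout list       # show the directories currently included
#   git sparse-checkout add <dir>  # pull in one more directory
#   git sparse-checkout disable    # restore the full working tree
```

If the first number isn't dramatically smaller than the second, the directories you picked probably contain most of the repo and a multi-root workspace may serve you better.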

Prevention: a few habits that compound

After doing this dance on a handful of projects, here's what I now do up front on any new repo:

  • Add an editor exclusion file (.vscode/settings.json, .idea/, whatever) to the repo, not just my dotfiles, so teammates get the same baseline
  • Audit the watch count after the first index pass — if it's over 50k, something's wrong
  • Treat "my editor is slow" as a bug with a root cause, not a fact of life — it almost always traces back to one specific subsystem
  • Keep generated code in a clearly-named directory so it's trivial to exclude
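
As one way to ship that shared baseline, here is a sketch for VS Code that writes a starter .vscode/settings.json using its files.watcherExclude and search.exclude settings. The helper name and the directory list are my assumptions. Other editors have their own equivalents, and the function refuses to touch an existing settings file.

```shell
# Create a starter .vscode/settings.json with watcher and search exclusions.
# Refuses to overwrite an existing file; merge by hand in that case.
write_vscode_excludes() {
  repo_root=${1:-.}
  settings="$repo_root/.vscode/settings.json"
  if [ -e "$settings" ]; then
    echo "refusing to overwrite $settings" >&2
    return 1
  fi
  mkdir -p "$repo_root/.vscode"
  cat > "$settings" <<'EOF'
{
  "files.watcherExclude": {
    "**/node_modules/**": true,
    "**/dist/**": true,
    "**/target/**": true,
    "**/generated/**": true
  },
  "search.exclude": {
    "**/node_modules": true,
    "**/dist": true,
    "**/target": true,
    "**/generated": true
  }
}
EOF
}

# Usage: write_vscode_excludes . && git add .vscode/settings.json
```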

None of this is exotic. It's mostly just refusing to accept the defaults. The defaults assume a small project; your project isn't small. Tell the tools the truth and they'll usually behave.
