
How Cursor Scaled Its AI Code Editor: Lessons from Indexing to Object Storage

Cursor, the AI‑powered code editor, grew to handle billions of document queries and more than a hundred million model calls per day. That growth forced a multi‑stage infrastructure overhaul: from an unstable YugaByte deployment to PostgreSQL on Amazon RDS, and then to object‑storage‑backed databases, while the team also tackled indexing, inference scaling, and cold‑start problems.

ITPUB

Scale Overview

Cursor processes roughly 1 billion model calls per day and indexes about 10 billion documents daily, accumulating trillions of documents since launch. To support this load, the team rebuilt the indexing pipeline, the model‑serving fleet, and the data‑storage layers.

Core Infrastructure Components

Indexing system: Continuously crawls GitHub and other sources to build a searchable representation of user codebases.

Model serving: Handles ~20 000 autocomplete requests per second using a cluster of ~2 000 H100‑class GPUs distributed across US East, US West, London, and Tokyo.

Streaming infrastructure: Persists incoming data for background processing and daily model improvements, separate from the real‑time query path.

Indexing System Evolution

The original prototype required a manual button press, which proved impractical at scale. Indexing is now triggered automatically when the editor opens, unless the repository exceeds a size threshold.

Indexing relies on a Merkle tree representation:

Each file is hashed.

Folder hashes are derived from the hashes of their children.

The root hash represents the entire repository.

When the root hash changes, the system walks the tree to locate modified files, enabling efficient incremental updates without re‑processing unchanged content.
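The hashing and tree walk described above can be sketched in a few lines of Python. This is a minimal illustration, not Cursor's implementation: the `Node`, `build`, and `changed_files` names, the use of SHA‑256, and the nested‑dict model of a repository are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# Hypothetical repository model: a folder is a dict, a file is its bytes.
Tree = Dict[str, Union[bytes, "Tree"]]


@dataclass
class Node:
    digest: str
    children: Optional[Dict[str, "Node"]] = None  # None marks a file


def build(entry: Union[bytes, Tree]) -> Node:
    """Hash a file, or derive a folder hash from its children's hashes."""
    if isinstance(entry, bytes):
        return Node(hashlib.sha256(b"file\0" + entry).hexdigest())
    children = {name: build(entry[name]) for name in sorted(entry)}
    h = hashlib.sha256(b"dir\0")
    for name, child in children.items():
        h.update(name.encode() + b"\0" + child.digest.encode() + b"\n")
    return Node(h.hexdigest(), children)


def changed_files(old: Optional[Node], new: Node, path: str = "") -> List[str]:
    """List files that are new or modified in `new` relative to `old`.

    Subtrees whose hashes match are skipped without being visited.
    Deleted files are not reported in this sketch.
    """
    if old is not None and old.digest == new.digest:
        return []
    if new.children is None:
        return [path]
    old_children = old.children if old is not None and old.children else {}
    out: List[str] = []
    for name, child in new.children.items():
        child_path = f"{path}/{name}" if path else name
        out.extend(changed_files(old_children.get(name), child, child_path))
    return out
```

Comparing two snapshots that differ in a single file visits only the folders on the path to that file; every sibling subtree is dismissed by one hash comparison.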

Figure: Cursor architecture overview

Database Evolution: YugaByte → PostgreSQL → Object Storage

Initially the team chose YugaByte for its theoretically unlimited scalability, but the service remained unstable despite heavy investment.

Switching to PostgreSQL on Amazon RDS provided immediate stability and performance, and it supported the workload for several months.

As the data grew to ~22 TB (against an RDS storage limit of 64 TB), PostgreSQL’s MVCC‑based update model became the bottleneck: every update writes a new row version and leaves the old one behind as a dead tuple (a “tombstone”). The built‑in VACUUM process could not keep up with the volume of dead tuples, leading to transaction‑ID wraparound failures and service outages.
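To see why a lagging VACUUM eventually becomes an outage, it helps to work through the arithmetic. PostgreSQL transaction IDs are 32‑bit, so a row version can only be compared against transactions within roughly 2^31 (about 2.1 billion) IDs of it; VACUUM has to freeze older rows before that horizon is reached. The sketch below is illustrative only: the function names and the example figures are assumptions, not numbers from Cursor. In practice the age value would come from a query such as `SELECT age(datfrozenxid) FROM pg_database`.

```python
XID_HORIZON = 2**31  # about 2.1 billion transaction IDs


def xid_headroom(oldest_unfrozen_age: int) -> int:
    """Transaction IDs left before the oldest unfrozen row hits the horizon."""
    return max(0, XID_HORIZON - oldest_unfrozen_age)


def seconds_until_exhaustion(oldest_unfrozen_age: int, xids_per_second: float) -> float:
    """Time left if VACUUM makes no progress and writes continue at a steady rate."""
    if xids_per_second <= 0:
        return float("inf")
    return xid_headroom(oldest_unfrozen_age) / xids_per_second


# Hypothetical figures: age already at 1.5 billion, 5 000 write transactions per second.
hours_left = seconds_until_exhaustion(1_500_000_000, 5_000) / 3600
```

The point of the calculation is that headroom is consumed by write volume, not by data size, so a table with a very high update rate can exhaust it even while disk usage is far from its limit.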

Emergency actions (dropping foreign keys, manually deleting large tables, and contacting AWS support) did not resolve the bottleneck. The decisive fix was migrating the largest table, which held code blocks, to object storage (AWS S3). This eliminated the vacuum pressure and restored performance.
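The source does not describe how the code blocks were laid out in S3. One common pattern that avoids in‑place updates altogether is content addressing, where each block is written once under a key derived from its hash. The sketch below assumes that pattern; `ObjectStore`, `InMemoryStore`, `block_key`, and the `code-blocks/` key prefix are hypothetical names, and the in‑memory store merely stands in for a real bucket so the example runs on its own.

```python
import hashlib
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """The two operations this sketch needs from any object store."""

    def put(self, key: str, body: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a bucket; a real deployment would call the S3 API instead."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self._objects[key] = body

    def get(self, key: str) -> bytes:
        return self._objects[key]


def block_key(content: bytes) -> str:
    """Derive an immutable key from the block's content (assumed layout)."""
    digest = hashlib.sha256(content).hexdigest()
    return f"code-blocks/{digest[:2]}/{digest}"


def save_block(store: ObjectStore, content: bytes) -> str:
    """Write the block and return its key; rewriting identical content is a no-op."""
    key = block_key(content)
    store.put(key, content)
    return key
```

Because a changed block gets a new key rather than overwriting an old row, there are no dead tuples to vacuum; the relational database only needs to hold the small key, if anything at all.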

The experience illustrates a broader trend: building database‑like services on top of highly reliable, scalable object storage (S3, R2, Azure Blob) can avoid many traditional storage complexities.

Figure: Database evolution diagram

Inference Service Challenges

Serving AI models at global scale introduces a classic cold‑start problem. When many nodes restart simultaneously, the few nodes that come up first receive the full request traffic, become overloaded, and crash, creating a vicious recovery loop.

Cursor mitigates this by:

Implementing a user‑priority throttling system that limits requests during node recovery.

Adopting a prefix‑based warm‑up strategy (similar to WhatsApp) that gradually restores service for prioritized user groups instead of all users at once.
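The two mitigations above can be combined in a small admission gate. The sketch below is one possible reading, not Cursor's design: it assumes users are bucketed by a prefix of their hashed ID, that buckets are opened one step at a time as capacity recovers, and that priority users bypass the gate. `WarmupGate` and its methods are hypothetical names.

```python
import hashlib


class WarmupGate:
    """Admit traffic in stages during recovery, one hash-prefix bucket at a time."""

    def __init__(self, buckets: int = 16) -> None:
        self.buckets = buckets
        self.open_buckets = 0  # nothing admitted until capacity is confirmed

    def bucket_of(self, user_id: str) -> int:
        # The leading 8 bytes of the hash act as the "prefix" that groups users.
        digest = hashlib.sha256(user_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def widen(self, step: int = 1) -> None:
        """Open more buckets once the recovered nodes are keeping up."""
        self.open_buckets = min(self.buckets, self.open_buckets + step)

    def admit(self, user_id: str, priority: bool = False) -> bool:
        """Priority users always pass (an assumption); others wait for their bucket."""
        if priority:
            return True
        return self.bucket_of(user_id) < self.open_buckets
```

A recovery loop would call `widen()` each time healthy capacity increases and reject or queue requests for which `admit()` returns `False`. Since a user's bucket never changes, anyone admitted at one stage stays admitted at every later stage, so load grows in predictable steps instead of arriving all at once.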

Conclusion

The scaling journey of Cursor’s infrastructure demonstrates the importance of pragmatic technology choices, continuous monitoring, and willingness to replace theoretically perfect solutions with simpler, more reliable ones. Key lessons include automating indexing with Merkle‑tree incremental updates, migrating from an unstable distributed database to a managed relational store and finally to object storage, and designing inference services that can survive cold‑start scenarios through priority‑based throttling.
