Scale11 min read

One server or a cluster: when to add nodes

The temptation to build the "proper" cluster on day one is strong — and almost always harmful. You pay in complexity for load that isn't there yet: orchestration, network latency, distributed bugs, costlier operations.

Start with metrics, not hunches

A single well-tuned server handles surprisingly much. Before scaling, set measurable metrics: p95 latency, CPU load, queue depth, DB response time. Decisions come from graphs, not from a feeling that "it's probably time".

Often it isn't the server that's slow but one heavy query without an index, or an N+1 in the ORM. Profiling can save you an entire cluster.

The order of scaling

First, go vertical (more CPU/RAM) and grab cheap wins: cache (Redis), indexes, a connection pool (PgBouncer), static on a CDN. Then a read replica. And only when you genuinely hit the ceiling — horizontal scaling and sharding.

Every new node must solve a measured problem, not a hypothetical one.

Horizontal scaling adds not just capacity but classes of problems: consistency, idempotency, distributed transactions. Take them on deliberately.

A pragmatic default for most products: one server in Docker, a DB replica, Redis and backups. That carries you to serious volumes — and by then you'll know exactly what hit the limit.

← All posts Start a project →