Consistent Hashing and More

16 May 2026

Consistent Hashing, the Karger Paper, and Modern Distributed Systems — Q&A Notes

Original Paper


Q: The origin server will always be a bottleneck if mostly distinct documents are requested, won't it?

Yes.

Caching only helps when there is reuse.

If requests are mostly unique:

cache hit rate -> very low

so nearly every request ends up at the origin.

Caching optimizes repetition, not uniqueness.
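
A rough way to see this is to run a small LRU cache over two request streams, one with no reuse and one with heavy reuse. The cache capacity, stream contents, and the helper name hit_rate are illustrative assumptions, not anything from the paper.

```python
from collections import OrderedDict

def hit_rate(requests, capacity):
    """Fraction of requests served from a simple LRU cache."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / len(requests)

unique_stream = [f"doc-{i}" for i in range(1000)]        # every request is distinct
reused_stream = [f"doc-{i % 10}" for i in range(1000)]   # 10 documents, requested repeatedly

unique_rate = hit_rate(unique_stream, capacity=100)
reused_rate = hit_rate(reused_stream, capacity=100)
print(unique_rate, reused_rate)
```

With no reuse the cache never gets a hit, however large it is; with reuse only the first request per document misses.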


Q: How does consistent hashing solve this?

It does not solve uniqueness itself.

It solves:

how to distribute ownership/routing cleanly across many nodes

Instead of hierarchical cache trees:

Client -> Local -> Regional -> Root

consistent hashing distributes requests horizontally:

key -> hash -> responsible node

This removes the single root that every miss funnels through in a hierarchy. The origin still has to serve true misses.
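
A minimal sketch of key -> hash -> responsible node, using the ring layout that is one common way to implement it. The class name HashRing, the node names, and the choice of MD5 as a stable hash are assumptions for illustration.

```python
import bisect
import hashlib

def h(value: str) -> int:
    """Stable 32-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, nodes):
        # Sorted (position, node) pairs around the ring.
        self._points = sorted((h(n), n) for n in nodes)
        self._positions = [p for p, _ in self._points]

    def lookup(self, key: str) -> str:
        # First node at or after the key's position, wrapping past the end.
        i = bisect.bisect_left(self._positions, h(key))
        return self._points[i % len(self._points)][1]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.lookup("/index.html")
print(owner)
```

Any client that builds the ring from the same node list computes the same owner, so no central directory is needed.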


Q: What about the Plaxton/Rajaraman algorithm?

Plaxton/Rajaraman introduced distributed routing through a namespace instead of a strict hierarchy.

Requests progressively move closer to the object’s “home.”

Along the path:

This influenced:


Q: What are spread and load in consistent hashing?

Spread

Spread measures:

how many different nodes different clients may map the SAME key to

under inconsistent membership views.

Load

Load measures:

how many distinct keys end up assigned to a single node across all client views

Low load means fewer hotspots.
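
Both quantities can be computed directly once a few client views are written down. In this sketch, spread is the number of distinct nodes a key maps to across views, and load is the number of distinct keys a node is assigned by at least one view. The three views, the node names, and the ring-style owner function are assumptions for illustration.

```python
import bisect
import hashlib

def h(value: str) -> int:
    """Stable hash so results do not depend on Python's per-process salt."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")

def owner(key, view):
    """Node responsible for key according to one client's membership view."""
    points = sorted((h(n), n) for n in view)
    positions = [p for p, _ in points]
    i = bisect.bisect_left(positions, h(key)) % len(points)
    return points[i][1]

views = [
    {"n1", "n2", "n3", "n4"},   # fully up-to-date client
    {"n1", "n2", "n3"},         # has not learned about n4 yet
    {"n2", "n3", "n4"},         # believes n1 is down
]
keys = [f"key-{i}" for i in range(200)]

# Spread: distinct nodes each key is mapped to across all views.
spread = {k: len({owner(k, v) for v in views}) for k in keys}

# Load: distinct keys each node is assigned by at least one view.
assigned = {}
for v in views:
    for k in keys:
        assigned.setdefault(owner(k, v), set()).add(k)
load = {node: len(ks) for node, ks in assigned.items()}

print(max(spread.values()), load)
```

If every view were identical, every key would have spread 1 and the loads would sum to exactly the number of keys.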


Q: What is the client/server in this context?

Client

The machine/library making routing decisions.

Server

The storage/cache nodes.

Example in Cassandra:

Application + driver = client
Cassandra nodes = servers

Q: How does Aerospike differ from Cassandra?

Aerospike uses:

key -> fixed partition
partition -> node

instead of classic ring-style consistent hashing.

The partition count is fixed at 4096 per namespace.

Ownership metadata is explicit.
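
The two-step mapping can be sketched as a pure function plus a lookup table. This is only an illustration of the idea: the SHA-1 digest, the round-robin table, and the node names are assumptions and are not Aerospike's actual digest or assignment algorithm.

```python
import hashlib

N_PARTITIONS = 4096  # fixed partition count

def partition_id(key: str) -> int:
    """key -> fixed partition. Never changes when cluster membership changes."""
    digest = hashlib.sha1(key.encode()).digest()   # stand-in digest (assumption)
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS

# partition -> node: explicit ownership metadata, here a plain dict
# filled round-robin over three hypothetical nodes.
partition_map = {pid: f"node-{pid % 3}" for pid in range(N_PARTITIONS)}

def route(key: str) -> str:
    return partition_map[partition_id(key)]

print(partition_id("user:42"), route("user:42"))
```

The design choice is that membership changes only ever touch partition_map; partition_id stays a fixed function of the key.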


Q: Why did industry move away from pure hash rings?

Because operational issues emerged:

Modern systems often prefer:


Q: Does the client look up multiple nodes?

Usually no.

Normally:

key -> deterministic node

Replication may add fallback nodes, but routing is deterministic.
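
One common way to get deterministic fallback nodes is to keep walking the ring from the key's position and take the next distinct nodes. A sketch, assuming one ring position per node; the function name preference_list and the node names are hypothetical.

```python
import bisect
import hashlib

def h(value: str) -> int:
    """Stable hash, independent of Python's per-process hash salt."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")

def preference_list(key, nodes, replicas=2):
    """Primary node followed by fallback nodes, in ring order."""
    points = sorted((h(n), n) for n in nodes)
    positions = [p for p, _ in points]
    start = bisect.bisect_left(positions, h(key))
    count = min(replicas, len(points))
    # One position per node, so consecutive points are distinct nodes.
    return [points[(start + j) % len(points)][1] for j in range(count)]

nodes = ["n1", "n2", "n3", "n4"]
prefs = preference_list("user:42", nodes, replicas=2)
print(prefs)
```

The client still contacts the first entry by default; the rest of the list is only used when that node is unreachable or when a replicated read or write is requested.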


Q: What if the chosen node doesn’t have the key?

Possible reasons:

Systems may:


Q: What happens if client metadata is stale during rebalance?

Example:

x moved from C3 -> C4

but client still thinks:

x -> C3

Possible behaviors:


Q: How does replication complicate things?

Now the question becomes:

Which replicas own x right now?

instead of:

Where does x live?

This introduces:


Q: How does Cassandra solve this?

Cassandra chooses:

availability > immediate consistency (temporary inconsistency is tolerated)

Uses:


Q: How does Aerospike rebalance?

Aerospike migrates at:

partition granularity

not per-key granularity.

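The key -> partition mapping never changes. Only the partition -> node ownership metadata is updated, and the data of the affected partitions is migrated as a unit.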
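
A toy illustration of rebalancing at partition granularity when a fourth node joins. The reassignment rule used here (hand every fourth partition to the new node) is an assumption chosen to keep the example short and is not Aerospike's actual rebalance algorithm.

```python
N_PARTITIONS = 4096

# Ownership before the change: three hypothetical nodes, round-robin.
old_map = {pid: f"node-{pid % 3}" for pid in range(N_PARTITIONS)}

# "node-3" joins and takes over every fourth partition (toy rule).
new_map = {
    pid: ("node-3" if pid % 4 == 3 else owner)
    for pid, owner in old_map.items()
}

# Partitions whose data has to be migrated as a unit.
moved = [pid for pid in old_map if old_map[pid] != new_map[pid]]
print(len(moved), "of", N_PARTITIONS, "partitions change owner")
```

No key is rehashed: keys in the moved partitions follow their partition to the new node, and keys in the other partitions are not touched.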


Q: Can Aerospike get hot partitions?

Yes.

Hashing balances:

keys

not necessarily:

request traffic per key


Q: Can partitions become unevenly large?

Yes.

Hashing balances approximately:

number of keys

not record sizes.

Large objects can create storage skew.


Q: Is the ring even mentioned in the Karger paper?

Not really.

The paper is more abstract and probabilistic.

The modern “hash ring” became popular later as an implementation and visualization strategy.


Q: Why did the paper use multiple hash functions?

The hash family mainly supported:

The final mapping still resolved to:

one key -> one selected bucket

Q: If we want low spread, why not use a single hash?

A single hash works well only if:

all clients see identical membership

Under inconsistent views:

Hash families create probabilistic overlap between client views.


Q: Are the hash functions tuned dynamically?

Usually no.

They are generally:

The system relies on probabilistic guarantees.


Q: How does rebalance work when a new node is added?

Only the range adjacent to the new node's position changes owner.

Example:

(100,300] moves from C2 -> C4

All other keys remain untouched.
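
The example can be checked with a small ring where positions are plain integers. The positions (C1 at 100, C2 at 500, C3 at 900, new node C4 at 300) and the ring size of 1000 are assumptions chosen so that the moved interval comes out as (100,300].

```python
import bisect

def owner(position, ring):
    """ring maps node -> position; a node owns (predecessor, own position]."""
    points = sorted((p, n) for n, p in ring.items())
    positions = [p for p, _ in points]
    i = bisect.bisect_left(positions, position) % len(points)
    return points[i][1]

RING_SIZE = 1000
before = {"C1": 100, "C2": 500, "C3": 900}
after = {**before, "C4": 300}          # C4 joins at position 300

# Treat every integer position as one key and see which ones change owner.
moved = [x for x in range(RING_SIZE) if owner(x, before) != owner(x, after)]
print(moved[0], moved[-1], owner(200, before), owner(200, after))
```

Every position outside 101..300 keeps its owner, and everything that moves goes from C2 to C4.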


Q: What happens when a node goes down?

Its ownership range is inherited by the next node clockwise.

Example:

C2 dies
C3 inherits C2's interval
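
The same integer ring shows the inheritance on node loss. The positions (C1 at 100, C2 at 500, C3 at 900) and the ring size are assumptions for illustration.

```python
import bisect

def owner(position, ring):
    """ring maps node -> position; a node owns (predecessor, own position]."""
    points = sorted((p, n) for n, p in ring.items())
    positions = [p for p, _ in points]
    i = bisect.bisect_left(positions, position) % len(points)
    return points[i][1]

RING_SIZE = 1000
before = {"C1": 100, "C2": 500, "C3": 900}
after = {n: p for n, p in before.items() if n != "C2"}   # C2 dies

changed = [x for x in range(RING_SIZE) if owner(x, before) != owner(x, after)]
print(changed[0], changed[-1], owner(400, after))
```

Only C2's old interval (100,500] changes hands, and all of it lands on C3, the next node clockwise. This reassigns ownership only; it does not bring back whatever data C2 held.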

Q: If there is no replication and a node dies, is there data loss?

Yes.

Consistent hashing solves:

placement and routing (which node owns a key)

It does NOT solve:

durability

Without replication:

node death = permanent loss of the data that node held

Q: What are virtual nodes (vnodes)?

Instead of:

1 physical node = 1 position

use:

1 physical node = many virtual positions

This smooths distribution and reduces imbalance.
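
A sketch for measuring the effect: place each physical node at one position versus many, then compare how many keys the busiest node receives relative to an even split. The vnode naming scheme (node#index), the counts, and the helper names are assumptions for illustration.

```python
import bisect
import hashlib
from collections import Counter

def h(value: str) -> int:
    """Stable 64-bit hash, independent of Python's per-process hash salt."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

def key_counts(nodes, vnodes, keys):
    """Keys per physical node when each node has `vnodes` ring positions."""
    points = sorted((h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
    positions = [p for p, _ in points]
    counts = Counter({n: 0 for n in nodes})
    for k in keys:
        i = bisect.bisect_left(positions, h(k)) % len(points)
        counts[points[i][1]] += 1
    return counts

def imbalance(counts):
    """Busiest node's key count divided by the even share (1.0 = perfect)."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean

nodes = ["n1", "n2", "n3", "n4"]
keys = [f"key-{i}" for i in range(5000)]

single = key_counts(nodes, vnodes=1, keys=keys)
many = key_counts(nodes, vnodes=200, keys=keys)
print(imbalance(single), imbalance(many))
```

With many positions per node, each node's share is the sum of many small arcs, so the shares tend to average out toward the even split.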


Deepest Takeaway

The real core idea is:

small topology changes should cause small ownership changes.

The ring is only one implementation strategy.
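
To make the takeaway concrete, this sketch grows a cluster from 4 to 5 nodes and counts how many keys change owner under naive hash-mod-N placement versus a ring with virtual nodes. The node names, key set, and vnode count are assumptions for illustration.

```python
import bisect
import hashlib

def h(value: str) -> int:
    """Stable 64-bit hash, independent of Python's per-process hash salt."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

def mod_owner_fn(nodes):
    # Naive placement: the owner index depends directly on the node count.
    return lambda key: nodes[h(key) % len(nodes)]

def ring_owner_fn(nodes, vnodes=100):
    # Ring placement with `vnodes` positions per physical node.
    points = sorted((h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
    positions = [p for p, _ in points]
    def owner(key):
        i = bisect.bisect_left(positions, h(key)) % len(points)
        return points[i][1]
    return owner

old_nodes = [f"node-{i}" for i in range(4)]
new_nodes = old_nodes + ["node-4"]
keys = [f"key-{i}" for i in range(2000)]

def moved_keys(make_owner):
    before, after = make_owner(old_nodes), make_owner(new_nodes)
    return [k for k in keys if before(k) != after(k)]

mod_moved = moved_keys(mod_owner_fn)
ring_moved = moved_keys(ring_owner_fn)
ring_after = ring_owner_fn(new_nodes)

print(len(mod_moved) / len(keys), len(ring_moved) / len(keys))
```

Under mod-N most keys are reshuffled among all nodes. On the ring, the only keys that move are the ones the new node takes over.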
