The Thundering Herd Problem


Imagine you are sitting on a farm and suddenly a bull starts running towards you.

What would you do? “Easy, I would just get out of the way,” you would say. Well, how about now?

If it’s not clear, that is a herd of raging bulls. Unless you are extremely lucky, you would be trampled. A server faces the same situation when a hot key expires in the cache: thousands of requests suddenly miss the cache and hit the database directly. The database becomes overwhelmed and may crash, bringing down the entire system. This is commonly known as the “thundering herd” problem.

To prevent this issue, we use several caching strategies:

  1. Jitter on TTL

  2. Mutex

  3. Stale while revalidating

  4. Cache Warming

Jitter on TTL

In the context of caching, jitter means adding a small random variation to each item's expiration time. Instead of all cached items expiring at the same moment, they expire at slightly different times, which spreads the resulting database load.
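As a minimal sketch of the idea, the helper below computes a jittered TTL. The function name `ttl_with_jitter` and the 10% default spread are assumptions made for illustration, not values from any particular cache library.

```python
import random
from typing import Optional


def ttl_with_jitter(
    base_ttl: float,
    jitter_fraction: float = 0.1,
    rng: Optional[random.Random] = None,
) -> float:
    """Return base_ttl shifted by a random offset of at most +/- jitter_fraction."""
    source = rng if rng is not None else random
    spread = base_ttl * jitter_fraction
    # uniform() picks a value anywhere in the closed range [-spread, spread].
    return base_ttl + source.uniform(-spread, spread)


# Hypothetical usage: keys written together get slightly different TTLs,
# so they no longer expire in the same instant.
ttls = [ttl_with_jitter(300) for _ in range(5)]
```

With a base TTL of 300 seconds and a 10% spread, each key expires somewhere between 270 and 330 seconds after it is written.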

Mutex

When a cached item expires and multiple requests try to fetch it at the same time, a mutex lock ensures that only one of those requests goes to the database to recompute its value and refill the cache. All the other requests either wait for that result or are served the stale value.
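The sketch below illustrates the waiting variant with a per-key lock inside a single process. The in-memory dictionary stands in for the real cache, and `get_or_compute` and `loader` are hypothetical names. A multi-server deployment would need a distributed lock instead, which is not shown here.

```python
import threading
from typing import Any, Callable, Dict

_MISSING = object()                 # sentinel so that None can be a cached value
_cache: Dict[str, Any] = {}         # stand-in for the real cache
_key_locks: Dict[str, Any] = {}     # one lock per cache key
_registry_guard = threading.Lock()  # protects _key_locks itself


def _lock_for(key: str):
    """Return the lock for this key, creating it on first use."""
    with _registry_guard:
        return _key_locks.setdefault(key, threading.Lock())


def get_or_compute(key: str, loader: Callable[[], Any]) -> Any:
    """Return the cached value, letting only one caller run loader() on a miss."""
    value = _cache.get(key, _MISSING)
    if value is not _MISSING:
        return value
    with _lock_for(key):
        # Re-check after acquiring the lock: another thread may have
        # filled the cache while this one was waiting.
        value = _cache.get(key, _MISSING)
        if value is _MISSING:
            value = loader()        # the single trip to the database
            _cache[key] = value
        return value
```

The second lookup inside the lock is what keeps the waiting requests from repeating the expensive load once the first one has finished.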

Stale while revalidating

This strategy serves cached content to users immediately (even if it has just expired) while updating the cache with new data in the background.

It introduces three stages for cached content:

  • Fresh: content that has been cached recently and is good to serve to users.

  • Stale: content that is expired but still acceptable temporarily.

  • Rotten: content too old to be trusted.

Two variables define the boundaries between these states:

  • max-age: this sets the limit between fresh and stale. Content older than this age is stale, and a request for it triggers a cache update.

  • stale-while-revalidate: this sets how long after max-age stale content may still be served, so the limit between stale and rotten is max-age plus this value. Requests for content older than that are fulfilled directly from the database or origin.

Cache-Control: max-age=60, stale-while-revalidate=300

This means that:

  • If the cached content is less than 60 seconds old, serve it with no further action.

  • If the content is between 60 and 360 seconds old, serve the cached content but quietly update the cache in the background for the next visit.

  • If the content is older than 360 seconds, ignore the cache, serve content directly from the origin, and update the cache as well.
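The three cases above can be sketched as a small classification function. The name `classify` is hypothetical, and treating an entry that is exactly 60 or 360 seconds old as belonging to the later state is an assumption of this sketch.

```python
def classify(
    age_seconds: float,
    max_age: float = 60,
    stale_while_revalidate: float = 300,
) -> str:
    """Map the age of a cached entry to 'fresh', 'stale', or 'rotten'."""
    if age_seconds < max_age:
        return "fresh"   # serve from cache, nothing else to do
    if age_seconds < max_age + stale_while_revalidate:
        return "stale"   # serve from cache, refresh in the background
    return "rotten"      # bypass the cache and fetch from the origin


# Hypothetical usage with the header values from the example above.
state = classify(age_seconds=120)
```

For the header shown earlier, an entry that is 120 seconds old falls into the stale window, so it would be served immediately while a refresh runs in the background.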

Cache Warming

Cache warming means preloading data into the cache before real users request it. This avoids a sudden database spike when many users request the same uncached data at the same time.
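A minimal sketch of a warming step might look like the following. The `warm_cache` function, the plain dictionary used as the cache, and the `load` callback are all assumptions for illustration. In practice the list of hot keys would come from something like access logs or a known catalogue of popular items.

```python
from typing import Any, Callable, Dict, Iterable


def warm_cache(
    cache: Dict[str, Any],
    hot_keys: Iterable[str],
    load: Callable[[str], Any],
) -> int:
    """Preload every hot key that is not cached yet; return how many were loaded."""
    warmed = 0
    for key in hot_keys:
        if key not in cache:
            # One controlled load per key, done before user traffic arrives.
            cache[key] = load(key)
            warmed += 1
    return warmed
```

Running such a step at deploy time or on a schedule means the first wave of real requests finds the hot keys already in place.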
