Recently I’ve been hard at work building a distributed in-memory cache for our runtime systems. One of the most important decisions I’ve had to make in this project is how to size the cache.
Our use case involves holding objects for a specific time-to-live (TTL), which means the cache must be sized so that it can store all live objects for that duration without running out of space.
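At steady state, a TTL-bound cache holds roughly (write rate × TTL) objects, so the required capacity follows directly from those two numbers plus the average object size. A back-of-envelope sketch (the figures below are illustrative, not our production numbers):

```python
def required_cache_bytes(writes_per_sec: float, ttl_sec: float,
                         avg_obj_bytes: float, overhead: float = 1.2) -> float:
    """Back-of-envelope cache sizing: at steady state the cache holds
    roughly writes_per_sec * ttl_sec live objects of avg_obj_bytes each,
    plus a safety factor for per-key metadata and fragmentation."""
    return writes_per_sec * ttl_sec * avg_obj_bytes * overhead

# e.g. 1,000 writes/s, 1-hour TTL, 2 KiB objects (hypothetical numbers)
size = required_cache_bytes(1_000, 3600, 2048)
print(f"{size / 2**30:.1f} GiB")  # prints "8.2 GiB"
```

Undersizing relative to this estimate is exactly what forces evictions before the TTL is reached.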
While testing the cache, I was monitoring several metrics, and one in particular caught my eye: the rate of expired keys (yellow) versus the rate of evicted keys (green), stacked.
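Caches in this family typically expose these as cumulative counters (for example, Redis reports `expired_keys` and `evicted_keys` in its `INFO stats` section); the plotted rates come from differencing two scrapes. A minimal sketch, assuming cumulative counters sampled at a fixed interval:

```python
def per_second_rates(prev: dict, curr: dict, interval_sec: float) -> dict:
    """Turn cumulative counters (e.g. expired_keys / evicted_keys as
    reported by a Redis-like cache) into per-second rates between
    two scrapes taken interval_sec apart."""
    return {k: (curr[k] - prev[k]) / interval_sec for k in prev}

# two hypothetical scrapes taken 60 s apart
prev = {"expired_keys": 10_000, "evicted_keys": 50_000}
curr = {"expired_keys": 10_600, "evicted_keys": 59_000}
rates = per_second_rates(prev, curr, 60)
# evictions dominating expirations is the symptom described above
print(rates)  # {'expired_keys': 10.0, 'evicted_keys': 150.0}
```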
As you can tell, far more keys are being evicted than expired. This shows that the cache is running out of memory and is forced to evict keys before their TTL is reached. We confirmed this by taking one cache instance, raising its maximum memory to a much larger value, and then observing the memory usage of that instance (orange) compared to the rest of the instances (purple):
So, confirmed: when we increase the maximum memory, the instance holds more keys. This means we hadn’t sized the maximum memory correctly.
Once we adjusted the maximum memory to the new level, the number of evicted keys (green) dropped to zero and every key removal was due to TTL expiration (which is exactly what we want).
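For reference, once the right ceiling is known it can be pinned in the instance configuration. A sketch, assuming the cache is Redis (the expired/evicted counters above match Redis’s metric names) and an illustrative limit, not our actual figure:

```
# redis.conf fragment (illustrative values)
maxmemory 8gb

# With correct sizing, rely on TTL expiration alone: noeviction makes
# writes fail loudly when full instead of silently dropping keys.
maxmemory-policy noeviction
```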