Caching — Part 1 — The Theory

Caching needs little introduction, but for those freshly embarking on the path of enlightenment: caching is the process of storing data in a short-term, temporary location (a memory buffer) called a “cache”. Why do we need it? Caching is your first go-to option when you are looking to enhance the performance of your application, be it a web, mobile, or desktop application.

Some Basics of Caching

Data in a cache is generally stored on fast-access hardware, typically the RAM of the system on which your application is running. Caching can be divided into two types: 1) in-process and 2) distributed.

Caching on a single JVM Environment

This means your cached data is stored on the same machine where your single-node application runs. In many scenarios the cache shares the same memory space as your application (in-process caching), so running into an “Out Of Memory” error is not that rare. In these cases, you have little option other than to scale your application “vertically”. That means more RAM and a beefed-up hard disk, which is expensive and has limits.
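To make the idea concrete, here is a minimal sketch of an in-process cache: just a map living on the application's own heap. The class and method names are illustrative, not from any particular library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A deliberately tiny in-process cache: entries live on the same heap
// as the application, so cache growth competes with application memory.
class InProcessCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    // Returns the cached value, computing and storing it on a miss.
    V getOrCompute(K key, java.util.function.Function<K, V> loader) {
        return store.computeIfAbsent(key, loader);
    }

    int size() {
        return store.size();
    }
}
```

Because the second lookup for the same key never invokes the loader, the expensive computation runs only once per key.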

Caching on a multi JVM Environment

Here you host your application on multi-node, “horizontally” scaled infrastructure. You have multiple VMs hosting multiple instances of your application, and each node runs its own copy of the cache. This brings in a problem of eventual consistency: at some point you will want to persist your data to the storage layer, and you may have to come up with a strategy for syncing your cache data before storage writes, without compromising the quality of the data.
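One simple strategy for keeping cache and storage in agreement is write-through: every write goes to the storage layer first, then to the cache, so an acknowledged write is never lost with the node. A minimal sketch, where `BackingStore` is a hypothetical stand-in for whatever storage layer you use:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the storage layer (database, disk, etc.).
interface BackingStore<K, V> {
    void persist(K key, V value);
}

// Write-through sketch: persist first, then cache, so the cache
// never holds data the storage layer does not have.
class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final BackingStore<K, V> store;

    WriteThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    void put(K key, V value) {
        store.persist(key, value); // durable first
        cache.put(key, value);     // then visible in the cache
    }

    V get(K key) {
        return cache.get(key);
    }
}
```

Write-through trades write latency for safety; the write-behind variant (persist periodically, as discussed later for distributed caches) flips that trade-off.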

A flow chart of a basic caching strategy and an in-process cache on single and multi-node systems

Before we move on to the distributed cache, here are some quick points to remember when choosing in-process caching for a single-node or distributed environment:

  • Works best when you have a single-node application and it stays within the limits of viable vertical scaling options
  • Suits cases where you only have to cache static content, metadata, or compute results
  • Since the cache shares memory with the application, it triggers frequent GC cycles, which can be a constraint on machines with restricted resources
  • Modern cache frameworks, like EHCache, have an off-heap feature, which means objects are stored and managed “off” (outside) the heap, so they do not come under the purview of garbage collection. An off-heap cache is slower than an on-heap cache, but faster than disk storage.
  • The in-process cache can also be your go-to option in a distributed environment if all you are storing is static content, metadata, or application compute results, and you do not face the mammoth task of synchronizing your cache writes to disk to protect data integrity
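As an illustration of the off-heap feature mentioned above, here is a configuration sketch assuming Ehcache 3; the cache alias, types, and resource sizes are purely illustrative:

```xml
<config xmlns="http://www.ehcache.org/v3">
  <cache alias="metadataCache">
    <key-type>java.lang.String</key-type>
    <value-type>java.lang.String</value-type>
    <resources>
      <!-- hot entries stay on-heap for speed -->
      <heap unit="entries">1000</heap>
      <!-- overflow goes off-heap, outside the GC's purview -->
      <offheap unit="MB">64</offheap>
    </resources>
  </cache>
</config>
```

Entries overflowing the on-heap tier are serialized into the off-heap store, which is why off-heap access is slower than on-heap but still much faster than disk.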

In-process caching comes with its baggage of issues, which can either be solved or lived with. But the limitation that hits the option hardest is the inability to cache a substantial volume of data. Even in a multi-node scenario, you are creating the same initial copy of the cache on multiple systems, each of which has its own limits on dividing hardware (memory) between the application and the cache it hosts.

Distributed Caching

By definition, a distributed cache is a system that pools RAM from multiple networked computers into a single in-memory data store. This gives you the ability to scale out and grow beyond the memory limits of a single computer by linking multiple systems.

What makes distributed caching your go-to option?

  • A high volume of data must be cached to meet transaction performance requirements. You might want to hold data in the cache to speed up your transactions, and persist it periodically, bypassing the need to hit the storage layer to complete every transaction.
  • Storing session data, like cart details and recommendations, for a retail engine handling a large number of concurrent web sessions.
  • You want your cache to act as a temporary fail-safe, in case of events like the storage layer abruptly going down.
  • Extreme scaling.
Distributed cache

A distributed cache derives its foundation from distributed architecture. At its center is a caching layer that pools memory from the RAM of the connected machines, and cached data is distributed across this pool. As the volume of data grows, adding machines to the cluster lets you scale out. The cache uses hashing to identify which node holds the data being looked up. A distributed cache does not share memory with your application, so you don’t have to worry about it triggering frequent GC cycles.
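The “hashing to identify the node” step is often done with consistent hashing: nodes are placed on a hash ring, and a key is owned by the first node clockwise from the key’s hash. A minimal sketch (node names and the hash function are illustrative, not any product’s API):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Consistent-hashing sketch: map each key to the first node at or
// after the key's position on the ring, wrapping around at the end.
class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node) {
        // Real implementations add many virtual points per node to
        // balance load; one point per node keeps the sketch short.
        ring.put(hash(node), node);
    }

    String nodeFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        return s.hashCode() & 0x7fffffff; // keep it non-negative
    }
}
```

The appeal of this scheme is that adding or removing a node only remaps the keys adjacent to it on the ring, rather than rehashing the entire cache.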

Distributed cache overhead:

  • Distributed caches come with the overhead of network latency, plus the (usually negligible) cost of the algorithm that identifies which node holds the queried data
  • You typically use a distributed cache for a higher volume of cached data, and even though that data seems transient, you still want strategies for keeping the caching layer available and resilient (though most of the time the cache provider you are using takes care of this)

By now, if you have made up your mind on which cache to use and what data to cache, it’s time to consider the eviction policy governing your cache.

A cache is not a database service; the data it stores has a transient lifespan. So you have to come up with strategies for identifying which data should be evicted from the cache, so that new data can be pushed in and memory is used efficiently.

Some of the widely used cache eviction policies are listed below:

  • FIFO (First In, First Out) and LIFO (Last In, First Out)
  • LRU (Least Recently Used)
  • TLRU (Time-aware LRU): LRU with a lifetime per entry
  • MRU (Most Recently Used): discards the most recently used items first
  • RR (Random Replacement)
  • SLRU (Segmented LRU): the cache has two segments, probationary and protected; a new item is promoted from probationary to protected based on its access count
  • LFU (Least Frequently Used)
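LRU is simple enough to sketch in a few lines: in Java, `LinkedHashMap` in access order already tracks recency, so eviction is a one-method override. The capacity and types here are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU eviction: a LinkedHashMap in access order evicts the
// least recently used entry once the capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: reads reorder entries too
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the stalest entry on overflow
    }
}
```

With capacity 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, since `b` is the least recently touched entry at that point.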

To implement a cache eviction strategy, you have to keep track of things like what goes into the cache, how many times an entry has been accessed, and when a cached object was last accessed.

In the second part of this series, I will work on implementing an in-process EHCache example using Spring.

Stay tuned !!

On the path of binary enlightenment, amateur photographer, and now trying my hand at writing !!