• Always design distributed systems to be ‘two mistakes high’: handle failures at two levels, so that a single mistake leaves at least one chance to recover instead of bringing the system down right away.
  • Place the web cache container in a sidecar arrangement alongside each instance of your server/web service container. Because the two are decoupled, the cache container can be modified without affecting the service.
  • Alternatively, place the cache above the service containers (or app replicas) so that all of them share the same cache replicas, and the cache calls the service only on a miss.
  • Both approaches above work for stateless services. If state is a significant factor for your app and there are many concurrent connections, a sharded cache is the better fit.
  • Use consistent hashing to distribute the load across multiple cache shards, which appear to the user as a single cache proxy.
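
The consistent-hashing step in the last bullet can be sketched roughly as follows. This is a minimal illustration, not a production proxy: the `HashRing` class, its `shard_for` method, the `vnodes` parameter, and the shard names are all hypothetical, and SHA-256 is just one reasonable choice of stable hash.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash. Python's built-in hash() is randomized per
    # process for strings, so it is not suitable for routing decisions.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping request keys to cache shards."""

    def __init__(self, shards=(), vnodes=100):
        self._vnodes = vnodes  # virtual nodes per shard (assumed value)
        self._points = []      # sorted hash points on the ring
        self._owner = {}       # hash point -> shard name
        for shard in shards:
            self.add(shard)

    def add(self, shard):
        # Each shard is placed at several points to even out the load.
        for i in range(self._vnodes):
            point = _hash(f"{shard}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = shard

    def remove(self, shard):
        self._points = [p for p in self._points if self._owner[p] != shard]
        self._owner = {p: s for p, s in self._owner.items() if s != shard}

    def shard_for(self, key):
        if not self._points:
            raise LookupError("no cache shards registered")
        # First point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["cache-0", "cache-1", "cache-2"])  # assumed shard names
    for url in ("/index.html", "/users/42", "/static/app.js"):
        print(url, "->", ring.shard_for(url))
```

The point of the ring, as opposed to `hash(key) % n`, is that removing or adding a shard only remaps the keys that shard owned; keys on the other shards keep their cached entries.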

