Designing a microservices architecture for failure

  • 70% of the outages are caused by changes in code. Reverting code is not a bad thing. Implementing change management strategies and automatic rollouts become crucial.
  • Architectural patterns and techniques like caching, bulkheads, circuit breakers, and rate-limiters can help build reliable microservices.
  • Self-healing can help recover an application. You should add extra logic to your application to handle edge cases.
  • Failover caching can help during glitches and provide the necessary data to your application.
  • You should use a unique idempotency key for each of your transactions to help with retries.
  • You can protect resources and help them to recover with circuit breakers. Circuit breakers usually close after a certain amount of time, giving enough space for underlying services to recover.
  • Use chaos engineering methods to test.

Full post here, 11 mins read