With the continued growth of microservices and the ongoing transition to them, it’s important to ensure that the time and money spent re-engineering systems into modern, cloud-based solutions lead to tangible benefits for the organization. In this multi-part series, we’ll look at different components and pitfalls that need to be considered when modernizing to microservices.
In this blog, we’ll look at why it’s important to consider using a shadow release strategy.
The Value of a Shadow Release Strategy
Does your test environment simulate the complete variance in production? Does it simulate the total traffic volume, the volume spikes, and the diversity of all requests? If you answered yes to all of these questions, you can already replay a realistic production load in a test environment. If not, a shadow release strategy will be invaluable. Done properly, a shadow release will catch the majority of the problems you would otherwise find in production. It delays turning on the lights, but it lowers the risk tremendously and helps ensure that once the lights are on, they stay on.
In a shadow release, the new system is deployed alongside the existing legacy application, and the legacy application continues to serve the production responses to client requests. The ingress traffic is duplicated so that it reaches both the legacy system and the new system. Responses from the new system are never returned to clients, but they can be stored and compared offline with the responses that went out to production.
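To make the traffic duplication concrete, here is a minimal in-process sketch in Python. Everything in it is an assumption for illustration: the `ShadowMirror` class, the `Handler` type, and the in-memory `records` list are hypothetical stand-ins. In practice, mirroring is often done at the load balancer, proxy, or service mesh rather than in application code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical handler shape: a request dict in, a response dict out.
Handler = Callable[[dict], dict]


class ShadowMirror:
    """Serve every request from the legacy system and copy it to the new one."""

    def __init__(self, legacy: Handler, shadow: Handler, max_workers: int = 4):
        self.legacy = legacy
        self.shadow = shadow
        self.pool = ThreadPoolExecutor(max_workers=max_workers)
        # Stand-in for a log, queue, or table that is read offline.
        self.records: List[Dict] = []

    def handle(self, request: dict) -> dict:
        # The legacy response is the only one the client ever sees.
        response = self.legacy(request)
        # Send a shallow copy to the shadow system in the background so its
        # latency is not added to the production response time.
        self.pool.submit(self._call_shadow, dict(request), response)
        return response

    def _call_shadow(self, request: dict, legacy_response: dict) -> None:
        try:
            shadow_response, error = self.shadow(request), None
        except Exception as exc:
            # A failure in the shadow system is recorded, never propagated.
            shadow_response, error = None, repr(exc)
        self.records.append({
            "request": request,
            "legacy": legacy_response,
            "shadow": shadow_response,
            "error": error,
        })

    def close(self) -> None:
        # Wait for in-flight shadow calls to finish before shutting down.
        self.pool.shutdown(wait=True)
```

The design choice to note is that the shadow call happens after the production response has been computed and runs off the request thread, so a slow or failing new system cannot change what clients receive.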
Two broad aspects need verification in a shadow release environment: 1) functionality (am I getting the correct responses?), and 2) reliability and performance (response time, the system’s horizontal scaling and elasticity, error rate, etc.).
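Both aspects can be checked offline from recorded traffic. The sketch below is one possible quick-and-dirty summary in Python. The record layout (`legacy`, `shadow`, and `shadow_ms` keys, with `shadow` set to `None` when the new system failed) is a hypothetical format chosen for this example, and exact equality is assumed to be a good enough comparison. Real responses often need timestamps or generated IDs stripped before comparing.

```python
import math
from typing import Dict, List, Optional


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records: List[Dict]) -> Dict:
    """Summarize functional and performance results for shadow traffic."""
    total = len(records)
    # Reliability: how often did the new system fail to answer at all?
    answered = [r for r in records if r["shadow"] is not None]
    errors = total - len(answered)
    # Functionality: of the answered requests, how many differ from legacy?
    mismatches = [r for r in answered if r["shadow"] != r["legacy"]]
    # Performance: response time of the new system for answered requests.
    latencies = [r["shadow_ms"] for r in answered]
    p95: Optional[float] = percentile(latencies, 95) if latencies else None
    return {
        "requests": total,
        "error_rate": errors / total if total else 0.0,
        "mismatch_rate": len(mismatches) / len(answered) if answered else 0.0,
        "shadow_p95_ms": p95,
        # Keep a few examples so someone can inspect them by hand.
        "sample_mismatches": mismatches[:5],
    }
```

Keeping a handful of sample mismatches alongside the rates makes it easier to tell a real functional bug from a harmless difference such as field ordering or a generated timestamp.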
How to Make Your Shadow Release Useful
For a shadow release to be truly useful, we recommend:
- Work up to mirroring 100% of the traffic. Start with a lower sampling rate, especially when the old and new systems share databases. Ultimately, a 100% sampling rate is what will give you confidence that your solution can scale and handle the traffic. To avoid side effects, the new microservices should not be able to modify the database backends.
- If concerns about affecting production through shared databases keep you from increasing the sampling rate, start by making the sampling dial configurable up/down/off live in production (please don’t inject the properties at build time; if you do, have a separate way to turn the dial in production). Then, involve the database team and monitor the database load and response times as you turn up the sampling dial. If the new microservices are read-only, consider using separate read replicas of the database to isolate the new microservices’ traffic so you can be completely confident that you’re not impacting production.
- Monitor. This is a great opportunity to make sure your teams have the proper tools to troubleshoot production problems before the new system actually serves production traffic, reducing risk and speeding up the resolution of future issues.
- Compare production and shadow outputs. Either write quick-and-dirty comparison tools to see if the two flows return the same results, or manually sample the results and compare them.
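The live sampling dial recommended in the list above could be sketched as follows in Python. The `SamplingDial` class and its method names are hypothetical. How `set_rate` gets called at runtime (an admin endpoint, a feature-flag service, a watched config file) is left open and depends on your platform.

```python
import random
import threading
from typing import Optional


class SamplingDial:
    """A mirror rate between 0.0 (off) and 1.0 (all traffic), changeable live."""

    def __init__(self, rate: float = 0.0, rng: Optional[random.Random] = None):
        self._lock = threading.Lock()
        self._rng = rng or random.Random()
        self._rate = 0.0
        self.set_rate(rate)

    def set_rate(self, rate: float) -> None:
        # Called at runtime, not baked in at build time.
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0.0 and 1.0")
        with self._lock:
            self._rate = rate

    def rate(self) -> float:
        with self._lock:
            return self._rate

    def should_mirror(self) -> bool:
        # random() returns a float in [0.0, 1.0), so a rate of 0.0 never
        # mirrors and a rate of 1.0 always mirrors.
        return self._rng.random() < self.rate()
```

A request handler would call `should_mirror()` once per incoming request and only copy the request to the new system when it returns `True`. Turning the dial to `0.0` then acts as an immediate off switch if the database team sees load climbing.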
Need to catch up? Lessons in this series include:
Part 1: The Importance of Starting with the Team
Part 2: Defining Ownership
Part 3: Process Management and Production Capacity
Part 4: Reserving Capacity for Innovation
Part 5: Microservices Communication Patterns
Part 6: Using Shadow Release Strategy
Part 7: Performance Testing Microservices
Part 8: Memory Configuration Between Java and Kubernetes
Part 9: Prioritizing Testing within Microservices
Part 10: Distributed Systems
Ready to modernize your organization’s microservices? Oteemo is a leader in cloud-native application development. Learn more: https://oteemo.com/cloud-native-application-development/