The Case for Replicant Production
Software development used to be treated as an engineering and planning challenge, like constructing a building or a bridge. It has since become a highly incremental, experimental process of constantly trying new things and seeing whether they work.
The key to this new approach is to get valid feedback whenever you try something new. This is fundamentally why agile processes call for the customer (or a proxy) to actually join the software development team and work with them every day.
This isn’t easy, particularly since many of today’s software products have very large numbers of users and customers, and a single product owner may have a difficult time representing the whole community. More and more real customer feedback is required, and that means changing production servers and presenting altered software directly to real customers. But there are ways to do that without always having to build new systems from scratch.
Developers don’t write more code or write code faster when they use agile processes, yet that style of development has become known for greater speed and efficiency. Why is that? More than anything, it comes from agile enabling developers to spend less time doing the wrong thing(s).
By putting extra effort into getting features into a releasable state and showing them to customers, we get valid feedback on our efforts and avoid working on things that don’t really matter. For most waterfall-type projects, an estimated 80% of functionality ends up not being used by customers. What we have learned is that feedback is the key, and the more frequently you can get it, the better. To this end, more and more applications are being released in a highly incremental fashion.
Modern applications that are incrementally released have a huge advantage
Why revamp software development to constantly deliver new code? Why not take the time to test more thoroughly and let our ideas come to full fruition before releasing them as features? As mentioned before, we might be on the wrong track altogether, and finding that out early saves us a lot of wasted effort.
Another reason is that conditions are changing at a faster and faster pace in just about every industry and every market. Turning on a dime (for a dime…) in today’s world is a truly strategic advantage. By constantly releasing new ideas and getting immediate feedback, organizations can beat their competition if they can be even a little bit faster.
Traditional applications are difficult-to-change monoliths
Most software that exists today was not built with high changeability and adaptability as key design goals. Most of it was created through a lot of hard work and some really painful lessons. Scalability and reliability often came over time, by fixing defects and tinkering with production environments, usually without much emphasis on documenting what went into the process. Organizations that need to take existing software and change it so that it can be released incrementally typically face three major tasks to complete their architectural transformation.
- Replicate the whole production environment for testing purposes.
- Automate tests so they can run fast enough and often enough.
- Break a traditional application into separately replaceable and deployable pieces.*
*Fully going to microservices and containers is typically pretty ambitious, but breaking the monolith into more manageable parts is usually possible.
Modern internet applications are complex…and getting them into a scalable production state is not easy
Back in the day, software was a single executable that resided on a single server. Today’s applications require a complex environment that includes firewalls, load balancing, message queuing, application servers, and a variety of databases and other persistence mechanisms.
Testing your applications properly means making sure all of these components still work together, yet realistic scalability and performance testing is often not feasible. For many organizations, the prospect of constantly pulling apart their carefully constructed production environment to introduce very small changes feels like a nightmare.
What does the cloud mean for most organizations?
When most organizations look at cloud technologies, they are trying to decide whether or not to take the leap of moving entire applications to the cloud. In some cases, organizations use the cloud only for development and test environments, and many of Skytap’s current customers fall into this category. But most often, cloud services like AWS and Azure are considered a more modern place to run production applications, typically the newer ones that have been designed from scratch to run in the cloud. Moving traditional applications over to these services tends to be difficult and often leads to disaster. The cloud’s approach to reliability and its focus on disposable servers run afoul of typical legacy architectures and assumptions.
Using a cloud to augment rather than replace your production environment
There is another approach to modernizing traditional applications in production: build a clone of an existing legacy application, with any needed connections to existing in-house systems, and run it in parallel with the original application hosted on-premises. The cloud “replicant” application becomes a safe space for making the changes needed to modernize the code and for quickly introducing new features and other changes.
The primary application still runs on-premises and may still deliver most of the application load while generating revenue or whatever other function the organization needs. This provides basic protections from the risks of change.
The cloud replicant can be the production test-bed. It lets the organization offer up new features in a modern way, give customers new choices (via A/B testing and other techniques), and work to break up the traditional application into smaller, more manageable pieces. It’s embracing the future with a safety net.
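As a rough illustration of the A/B idea, the sketch below sends a fixed share of users to the cloud replicant and everyone else to the original on-premises application. It is a minimal sketch under assumptions, not a description of any particular product: the names `choose_backend`, `bucket_for`, and `REPLICANT_PERCENT`, the backend labels, and the 10% default are all hypothetical. Hashing the user ID keeps each user on the same side of the experiment across requests.

```python
import hashlib

# Hypothetical share of users (0-100) routed to the cloud replicant.
REPLICANT_PERCENT = 10


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_backend(user_id: str, replicant_percent: int = REPLICANT_PERCENT) -> str:
    """Return the (hypothetical) label of the deployment that should serve this user."""
    if bucket_for(user_id) < replicant_percent:
        return "cloud-replicant"
    return "on-premises"
```

Calling `choose_backend("some-user-id")` always returns the same label for the same ID, and setting `replicant_percent` to 0 sends all traffic back to the on-premises application, which is the safety net in code form.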
Challenges with this approach
Of course, none of this is simple. The original traditional application will probably need to share data and transactions with the new replicant version. Connections from the cloud replicant to on-premises systems will probably have particular latency requirements, although another possible technique is to cache data in the cloud to improve performance and sync it back to the on-premises systems after the fact.
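The cache-then-sync idea can be sketched as a small write-behind cache. This is only an illustration under assumptions: the class name `WriteBehindCache` is hypothetical, and a plain dictionary stands in for the on-premises system, which in practice would sit behind a network call with its own failure and conflict handling.

```python
from typing import Dict, Set


class WriteBehindCache:
    """Cloud-side cache that defers writes to a slower backing store."""

    def __init__(self, backing_store: Dict[str, str]) -> None:
        self._store = backing_store        # stand-in for the on-premises system
        self._cache: Dict[str, str] = {}   # data held close to the cloud feature
        self._dirty: Set[str] = set()      # keys changed since the last sync

    def get(self, key: str) -> str:
        """Read from the cache, loading from the backing store on a miss.

        Raises KeyError if the key exists in neither place.
        """
        if key not in self._cache:
            self._cache[key] = self._store[key]
        return self._cache[key]

    def put(self, key: str, value: str) -> None:
        """Write to the cache only and remember that the key needs syncing."""
        self._cache[key] = value
        self._dirty.add(key)

    def sync(self) -> int:
        """Push pending writes to the backing store; return how many were sent."""
        count = len(self._dirty)
        for key in self._dirty:
            self._store[key] = self._cache[key]
        self._dirty.clear()
        return count
```

Reads and writes stay fast because they touch only the cloud-side copy. The trade-off is that the on-premises data is stale until `sync()` runs, so this pattern suits data where a short delay is acceptable.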
Data will need to be synced between existing applications and the new cloud-based features, and care must be taken to ensure that schema additions for new features do not conflict with existing data structures. Caching data close to the cloud-based feature improves performance and makes after-the-fact syncing and data transformation possible.
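One simple way to catch schema conflicts early is to compare the proposed additions against the existing schema before deploying a new feature. The sketch below assumes schemas can be represented as mappings from column name to type name, and it treats a conflict as an existing column being redefined with a different type. The function name `find_schema_conflicts` and the example columns are hypothetical.

```python
from typing import Dict, List


def find_schema_conflicts(existing: Dict[str, str], additions: Dict[str, str]) -> List[str]:
    """Return names of columns the additions would redefine with a different type."""
    return sorted(
        name
        for name, column_type in additions.items()
        if name in existing and existing[name] != column_type
    )


# Hypothetical schemas: the legacy table and the columns a new feature wants.
legacy_columns = {"id": "int", "email": "text"}
feature_columns = {"id": "int", "email": "varchar", "plan": "text"}

conflicts = find_schema_conflicts(legacy_columns, feature_columns)
```

Here `conflicts` contains only `"email"`, because `id` is unchanged and `plan` is a purely additive column. A stricter team might choose to flag any overlapping name at all, which is a one-line change to the condition.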
What makes replicant production worth doing?
Some 80-90% of IT spending goes toward the care and feeding of legacy applications, yet new requirements are arriving at an increasing rate. Many development groups feel they have to do new work from scratch in order to be fast enough. By using replicant production techniques, organizations can do modern, high-speed, high-change development alongside legacy code that continues to work. No approach is perfect, but this one balances risk and reward, using cloud technology to enable the experiments that just about every organization needs to conduct in today’s world.
There is no need to limit modern experimentation and feature-based A/B testing to brand-new greenfield applications; it’s worth extending traditional applications to the cloud to take advantage of these techniques as well.
Check out previous chapters of our ongoing technical series, “Scaling Modern Software Delivery” for more perspectives from Kelly Looney around cloud, DevOps, containerization, and more!