Is it ever a good idea to re-write from scratch?
Joel Spolsky wrote many years ago that a re-write from scratch was the single worst strategic mistake a company could make.
He pointed to a number of notable failures that were re-writes from scratch. I believe his most salient points are:
- You will have devoted a very large amount of development time to a product which will essentially have the same value as the product you are currently selling.
- You will not have the resources to improve the features of your existing product, and your competitors will overtake you.
- Even if you develop using the best practices available, such as TDD or BDD, and have a wonderful QA team, a live, field-tested product is still likely to be more robust than your rewrite.
In essence he implies that refactoring legacy code is always the best path to adopt. The evidence he provides, albeit from pre-2000, is somewhat compelling, and some very successful companies have stuck to his mantra of refactoring, sometimes in unique ways. By refactoring I mean merging and splitting existing classes, relatively minor reorganisation of code (i.e. no substantial architectural changes), and bringing the code in line with some form of coding standard to improve the readability and maintainability of the system.
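To make the scale of change I have in mind concrete, here is a minimal, hypothetical sketch in Python (the invoice example and every name in it are invented purely for illustration): a method that mixes several concerns is split into intention-revealing helpers without changing its behaviour.

```python
# Before: one method mixes validation, arithmetic and formatting.
class InvoiceReportBefore:
    def render(self, order):
        if not order.lines:
            raise ValueError("empty order")
        total = sum(l.unit_price * l.quantity for l in order.lines)
        return f"Invoice for {order.customer}: {total:.2f}"


# After: identical behaviour, reorganised into intention-revealing methods.
class InvoiceReportAfter:
    def render(self, order):
        self._validate(order)
        return self._format(order, self._total(order))

    def _validate(self, order):
        if not order.lines:
            raise ValueError("empty order")

    def _total(self, order):
        return sum(l.unit_price * l.quantity for l in order.lines)

    def _format(self, order, total):
        return f"Invoice for {order.customer}: {total:.2f}"
```

Nothing about the system's architecture changes; the code simply becomes easier to read and safer to modify.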
PHP is famous for its rapid development ability and for being easy to learn. It has also been derided for many years for its lack of full object support and, as a non-compiled language, for its performance. Facebook was developed in PHP, and while the number of users was small, rapid feature development and quick changes were extremely desirable. With such a massive user base, however, performance becomes a significant cost. Rather than re-write from scratch they took a different approach to the problem and developed HipHop, which started out as a compiler that translated PHP into C++ and later evolved into a JIT compiler. This allowed them to keep all of their production-ready, tested code while greatly reducing their operating costs. While I do not have access to the figures involved, I would be surprised if anyone suggested this approach was not substantially cheaper than re-writing their code base in a compiled language and re-hiring or re-training their workforce in the new language they intended to use.
The "never re-write" mantra is a little depressing for a developer who works on a legacy project with spaghetti code greeting them on a daily basis. If the developer has worked with the legacy code for a long enough time then they are not re-writing from scratch, but re-writing with years of business logic experience, knowledge of failed pathways and ideas of how the system architecture can be improved.
Refactoring is always quoted as the best way to deal with spaghetti code, but with a large code-base where do you start? How aggressively do you refactor? And if you end up having to refactor the entire code base, are you really saving time compared to reworking from the model upwards?
Additionally, refactoring does not help if you are:
- using a dead or dying language
- using unsupported technologies, software or hardware
- running specialist systems with few specialist staff capable of filling the roles
- paying 3rd party licensing costs that have become a substantial proportion of your budget
If you are encountering those problems then something more aggressive than refactoring can be required. A rewrite from scratch, twelve months of intensive coding to produce a new "perfect" design and a platform you are sure you can develop on rapidly and reliably, may sound great, but it is almost certainly not the best approach. Your sales staff won't be happy knowing they cannot sell anything new for a year, and even when the new version arrives it will only do what the last one did. The risk is huge, you are essentially giving your competitors a one-year head start, and that is assuming the rewrite does not overrun.
There are two approaches that I believe allow you to adopt your glorious rewrite and yet also deliver added business value to your products.
New Minimum Viable Product
Trying to match a several-year-old legacy product feature for feature is likely to take a very long time before you can ship a release. It can also lead to a compromised design, where you copy each feature's existing interaction design rather than taking a considered approach that could improve the functionality.
So why not use the new code base as a basic product offering? Instead of matching the existing product feature for feature, develop a "basic" version which has the core functionality you need to actually sell, and sell it as the cheaper version of your all-singing, all-dancing existing product.
This gives you a revenue stream relatively quickly, your new product gets field tested much earlier, and you have the base platform to build on until you can match and exceed your existing legacy product.
Modular Plug-in
The modular approach is to replace smaller chunks of existing functionality with your new system. Plug your new architecture in side-by-side with the existing architecture and re-route your existing code through the new pathways a module at a time. There are several ways to achieve this, but essentially you take an existing call, put a new routing method in front of it, and then send the call down either the existing route or the new one.
This approach offers one major benefit: risk reduction. If your new routes encounter a problem then you should be able to switch back to the legacy routes straight away and mitigate the impact. You can target the areas that need new functionality first, or those that are a nightmare to maintain, to achieve the greatest impact from your changes.
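As a rough sketch of what such a routing layer might look like (in Python, with invented names such as BillingRouter and calculate_invoice; this is illustrative rather than prescriptive), the router sends a call down the new path when a flag is enabled and falls straight back to the legacy path if the new code fails:

```python
class BillingRouter:
    """Sends calls to the new module when enabled, with a legacy fallback."""

    def __init__(self, legacy, new, use_new_route=False):
        self.legacy = legacy            # existing, field-tested implementation
        self.new = new                  # freshly written replacement module
        self.use_new_route = use_new_route

    def calculate_invoice(self, order):
        if self.use_new_route:
            try:
                return self.new.calculate_invoice(order)
            except Exception:
                # If the new route fails, fall back to the legacy route so
                # callers never see the difference. Real code would log the
                # error here so the team knows the new path misbehaved.
                pass
        return self.legacy.calculate_invoice(order)
```

Turning use_new_route off restores the old behaviour immediately, which is exactly where the risk reduction comes from.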
There are, however, four major disadvantages to this approach:
- For the period before the legacy code is retired you have essentially doubled the number of paths an object can take through the system. It is vital to plan a way to minimise this period or the sanity of your support staff will be compromised.
- Replacing in a modular fashion makes it difficult to address fundamental architecture problems. It is certainly tempting to maintain the same structure and just swap out components, but if one of the issues is an unduly complex architecture then this approach will struggle to fix that.
- If you are switching language or technology and lack significant experience in it, then modular replacement may lead to significant problems. Your new system may not work well with your existing architecture, and it may have its own idiosyncrasies which end up reducing the stability of the system rather than improving it.
- Without a strong enough impetus your legacy product may linger for a very long time. The main problem with this is the support cost of the legacy system. This cost is often obfuscated at a management level and is difficult to place a monetary value on. Without a monetary value it becomes difficult to make a business case for the change, hence there is commonly a lack of impetus to change.
The modular approach's costs can often be minimised by combining it with the minimum viable product methodology. If you make your modules too small then you are likely to exacerbate the costs above, though it is tempting to do so in order to deliver quickly. Instead it can be better to replace a larger chunk of functionality with a simpler solution: build the minimum viable version of that chunk and re-route as much as possible through it. You will hopefully see a big support cost saving from your initial change, and potentially performance and stability benefits as well. Build up the functionality over time and then retire a large chunk of your legacy product in one go.
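Continuing the hypothetical billing example from the earlier sketch (the module and operation names remain invented for illustration), the router can send just the operations the minimum viable module already supports down the new path, while everything else continues through the legacy code until the new module grows to cover it:

```python
class IncrementalRouter:
    """Routes only the operations the minimal new module supports so far."""

    def __init__(self, legacy, new, supported_ops=("calculate_invoice",)):
        self.legacy = legacy
        self.new = new
        self.supported_ops = set(supported_ops)  # grows as the new module grows

    def dispatch(self, op_name, *args, **kwargs):
        # Prefer the new module for operations it already covers;
        # everything else continues down the legacy route untouched.
        target = self.new if op_name in self.supported_ops else self.legacy
        return getattr(target, op_name)(*args, **kwargs)
```

As the new module's coverage grows you extend supported_ops, and once it covers the whole chunk the legacy implementation behind it can be retired in one go.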