My setup process for dealing with legacy code

By Steve Claridge on 2017-08-24.

When I work on an older/legacy/problematic codebase I have a number of things I try to do to make sure that changes, bug fixes and additions to the code go smoothly. By "older" I mean software that has been around for 2-3 years or more, has had a number of different developers work on it, has had many functionality and requirement changes applied to it and is using technology that may have been deprecated.

Companies with problematic in-house software and code often find it difficult to maintain them. An aging system can become problematic to update or fix bugs in for a number of reasons:

  • The software requirements were not properly documented or maintained so there's no official record of what it should do.
  • Developer(s) who worked on the system have left, taking knowledge with them. So there are large parts of the code that no one understands or dares to touch.
  • There is no testing regime for releases. Automated or documented user acceptance tests go a long way when trying to work out what a system should do.
  • There are no unit tests. Code with a thorough set of unit tests is more easily changeable as you have a baseline to test changes against.
  • Code isn't documented and has no source comments. Some companies ask their developers to create specifications of their work units, some don't, but all should be making sure comments are added to the code to aid readability and understanding. It is much easier to write new code than it is to read and understand it later.
  • Many developers get their hands on the system over the years, each with their own coding style, toolbox and ideas - the result being that the codebase loses its coherence and starts to look like a bowl of spaghetti.
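To illustrate the point above about unit tests giving you a baseline, here is a minimal sketch of a "characterization" test: it records what the code does today so that later changes can be checked against it. The function legacy_shipping_cost_pence and its pricing rules are hypothetical stand-ins for whatever undocumented routine you inherit, not code from a real system.

```python
import unittest


def legacy_shipping_cost_pence(weight_kg, express):
    """Hypothetical stand-in for an undocumented legacy function."""
    cost = 450 + weight_kg * 120
    if express:
        # An odd-looking rule nobody remembers the reason for.
        cost = cost * 2 - 100
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it 'should' do."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost_pence(10, express=False), 1650)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost_pence(10, express=True), 3200)
```

Saved as a test module, this can be run with python -m unittest. The expected values were obtained by running the existing code, so a failure after a change means behaviour has shifted, which may or may not be what you intended.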

Different clients obviously have their own setups and requirements, but in a perfect world I try to put all the things below in place when starting work on a project so that I can combat the above problems and turn a difficult-to-update system into something more manageable:

Source control - Before touching a line of code I want to make sure that the entire codebase is under version control. This could be in an external repository like GitHub or Bitbucket, or something in-house on one of your servers, but there absolutely must be a single central version control repository holding all the code. This means we can be sure that all code needed to build the software is in one place. It also means we can easily roll back changes (if a code change breaks something) and see a log of what was changed and when.

Understand the users - If possible I like to sit down with the users of the system to learn how they use it. Typically with any medium to large-sized software product there are a number of different people using it, for different reasons and in different ways - so I often see the situation where no one person knows all of the use cases. It's good to get at least a basic understanding of how people use the system before making changes - it is also good to know the users as they can do acceptance testing on future releases.

Understand the stack - I try to be as thorough as possible when determining the deployment setup of the system. Where is the code deployed to? By whom? What server(s) does it run on? And so on. Code and deployment are usually closely coupled in older systems. These days, with technology like Docker, there is a larger separation and code is designed and written to run in small containers so that it can be deployed anywhere. With legacy systems I find the code often relies on being run in certain environments, so it is good to have as much detail about that as possible.

Dev/Test/Live - In an ideal world any software development process will have three environments: a dev environment where developers can combine their work and test that it works together; a test environment where user acceptance testing can be performed; and a live environment, which is your live-to-the-world software and only receives versions that have been tested and promoted from the test environment. If you don't already have separated environments for testing code changes before they go live to the public then I will badger you relentlessly until you do. Writing code and putting it straight in front of your customers is a recipe for disaster.
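One common way to keep the three environments apart in code is to choose settings from an environment variable. The sketch below is only an illustration under assumptions: the variable name APP_ENV, the Settings fields and the database URLs are all hypothetical and would differ for every system.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    name: str
    database_url: str
    debug: bool


# Hypothetical values; real hosts would come from each environment.
ENVIRONMENTS = {
    "dev": Settings("dev", "postgresql://localhost/app_dev", debug=True),
    "test": Settings("test", "postgresql://test-db.internal/app_test", debug=False),
    "live": Settings("live", "postgresql://live-db.internal/app", debug=False),
}


def load_settings(environ=os.environ):
    """Pick settings from APP_ENV, defaulting to dev so a bare checkout never touches live."""
    env_name = environ.get("APP_ENV", "dev")
    if env_name not in ENVIRONMENTS:
        raise ValueError(
            f"Unknown APP_ENV {env_name!r}; expected one of {sorted(ENVIRONMENTS)}"
        )
    return ENVIRONMENTS[env_name]
```

Calling load_settings() with APP_ENV unset returns the dev settings, and an unrecognised value fails loudly rather than silently falling back to live.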

Developer setup and builds - A programmer working on a particular system should have a development environment set up that is as close to the production environment as is reasonably possible. With newer environments like Docker this is not so much of a consideration, but with older setups it can often be very important - so when I work on an old system I try to have my local setup using all of the same software and preferably the same OS as production. E.g. if your website runs on Linux and JBoss, I will try to build on Linux and JBoss too.
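A small script can make differences between a local setup and production visible. This is a rough sketch, assuming you have written down what production runs; the PRODUCTION manifest, its component names and the version numbers are made up for illustration.

```python
import platform

# Hypothetical manifest of what production runs; the version numbers are made up.
PRODUCTION = {"os": "Linux", "jboss": "7.1", "java": "1.8"}


def find_mismatches(local, production=PRODUCTION):
    """Return one message per component that differs from production."""
    problems = []
    for component, wanted in sorted(production.items()):
        found = local.get(component, "not installed")
        if found != wanted:
            problems.append(f"{component}: local is {found}, production is {wanted}")
    return problems


# platform.system() reports the local OS name, e.g. "Linux" or "Windows".
# The other values would be read from the local install; they are hard-coded here.
local_setup = {"os": platform.system(), "jboss": "7.1", "java": "1.8"}
for problem in find_mismatches(local_setup):
    print(problem)
```

An empty result means the local machine matches the manifest for every listed component. Anything printed is worth resolving, or at least noting, before trusting a local build.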

Ideally a development team would be using continuous integration to build their source changes often and run tests against them in your dev environment. When there is only one developer this is less of a concern.

Horses For Courses

Every company has its own in-house development setup and challenges, so the above is not always feasible. In an ideal situation, I would do all of the above to make sure that your older software becomes stable and easier to maintain, and that it becomes quicker to make and release bug fixes and new features for it. In practice, I work within the constraints of each environment to get trustworthy code changes into production as quickly as possible.

Need Help?

Do you have in-house software that is difficult to maintain? Does the thought of making changes to your codebase wake you up screaming at 3am in a cold sweaty panic?
Get in touch to see how I can help you