Software Testing & Confidence

When we build software, it can sometimes feel like a precarious balancing act. Will this next bit of code do what we expect? Will something we did not change begin behaving strangely? For many software developers, certainty and confidence are not feelings close at hand.

Being able to answer these concerns with greater confidence is why software tests are so important. For over a decade now I have been a consistent practitioner of Test-Driven Development. It has made me a better software craftsman and helped me move from programmer to developer, and now to the point where I consider myself a software engineer. And, as new paradigms of software architecture continue to emerge and evolve, the confidence I gain from good tests will remain important to my work.

Layers of Testing

Over the last several months I’ve been working on a new microservice at Nav. It is intended to supplant an existing service, and so it has multiple criteria it has to meet. It needs to provide a certain degree of parity with the existing service. It needs to add additional value based on its own functionality. And, it must support a migration path both for the data in the existing service and for the other services within our ecosystem that depend on it.

The first two of those criteria are easily handled by service-level unit tests. I have long made a habit of starting with 100% C0 code coverage and not allowing that to slip. But, that only gives me confidence that my code works as intended, and specifically as I have described that intent in my tests. And, were this a new service, not meant to replace anything else, then those tests would give me sufficient confidence to release the service within our ecosystem. But, the third criterion I described above is the concerning one. Because that criterion relates to interacting with other systems, another layer of testing is necessary.
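To make the service-level unit tests concrete, here is a minimal sketch in the style I start from. The AccountSummary class and its fields are invented for illustration; the real service's domain objects differ, but the shape of the tests is the same: small, pure behavior pinned down completely, which is what makes 100% C0 coverage attainable from the start.

```ruby
# Hypothetical example: a tiny service object and unit tests for it.
require "minitest/autorun"

class AccountSummary
  attr_reader :balances

  def initialize(balances)
    @balances = balances
  end

  # Total across all accounts: the kind of small, pure behavior
  # that service-level unit tests can describe exhaustively.
  def total
    balances.values.sum
  end
end

class AccountSummaryTest < Minitest::Test
  def test_total_sums_all_balances
    summary = AccountSummary.new(checking: 100, savings: 250)
    assert_equal 350, summary.total
  end

  def test_total_is_zero_with_no_accounts
    assert_equal 0, AccountSummary.new({}).total
  end
end
```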

Unfortunately, this layer of inter-service testing is more challenging, but it is necessary. I need to gain confidence in the communication between the new and old service for current activity. There will be an ETL stage to bring over old data, which will need to be manually confirmed. And, finally, there will be a need to confirm that services relying on the old service can interact with the new service instead.

Externalities Add Difficulty

For my work, we have a messaging system to enable asynchronous communications between the new and old services. I have tests that mock the messaging system and ensure we are attempting to do the right thing. But, for both services I need to increase my confidence that messages are actually being sent properly. This means that I need to test the messages that are sent and the action of the service receiving those messages.
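A sketch of what those mock-based tests look like: the service takes its publisher as a dependency, so the test can inject a fake that records calls and assert we attempted to send the right message. FakePublisher, RecordService, and the topic name are all invented for illustration; a real suite might use a mocking library instead of a hand-rolled fake.

```ruby
# Hypothetical sketch: verifying the service *attempts* to publish the
# right message, without touching the real messaging system.
require "minitest/autorun"
require "json"

class FakePublisher
  attr_reader :published

  def initialize
    @published = []
  end

  # Record every publish call instead of talking to a broker.
  def publish(topic, payload)
    @published << [topic, payload]
  end
end

class RecordService
  def initialize(publisher)
    @publisher = publisher
  end

  def create(id)
    # ...persist the record, then announce it...
    @publisher.publish("records.created", JSON.generate(id: id))
  end
end

class RecordServiceTest < Minitest::Test
  def test_create_publishes_a_created_event
    publisher = FakePublisher.new
    RecordService.new(publisher).create(42)

    topic, payload = publisher.published.first
    assert_equal "records.created", topic
    assert_equal 42, JSON.parse(payload)["id"]
  end
end
```

Injecting the publisher is the design choice that makes this test possible: the same service class runs against the real messaging client in production and the fake in tests.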

The first step was to add tests to confirm the right messages are passed to the messaging system. These tests took some careful work because adding a dependency on the messaging system meant latency was introduced. I had to build wait time into the tests which made them slower, but only marginally so. Thankfully we use Docker on our development machines so setting up the messaging system for local testing was incredibly easy.
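The waiting those tests required can be sketched like this. Here an in-memory Queue plus a thread stands in for the Dockerized messaging system, and the wait_for helper (invented for illustration) polls with a deadline, so a slow delivery fails the test with a clear error instead of hanging it, while a fast delivery keeps the added latency marginal.

```ruby
# A sketch of building bounded wait time into an asynchronous test.
require "minitest/autorun"

# Poll the given block until it returns a truthy value or the
# deadline passes; the timeout keeps a lost message from hanging CI.
def wait_for(timeout: 2, interval: 0.05)
  deadline = Time.now + timeout
  loop do
    result = yield
    return result if result
    raise "timed out waiting for condition" if Time.now > deadline
    sleep interval
  end
end

class MessageDeliveryTest < Minitest::Test
  def test_message_arrives_within_the_wait_window
    broker = Queue.new # stand-in for the real messaging system
    received = []

    # Consumer thread, standing in for the subscribing service.
    Thread.new { received << broker.pop }

    # Simulate broker latency before the message is delivered.
    Thread.new do
      sleep 0.1
      broker << "records.created:42"
    end

    wait_for { received.any? }
    assert_equal "records.created:42", received.first
  end
end
```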

After I established tests to confirm the right messages were getting to the messaging system, I moved on to ensuring that the right actions occurred on the services. This introduced another dependency that required more careful testing. But, I could assert that with both services running, when an event happened on one it triggered the appropriate action on the other. I wrote tests to confirm this in both directions.
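The bidirectional check can be sketched with two in-memory "services" wired through a shared bus standing in for the messaging system. Bus, OldService, NewService, and the topic names are invented for illustration; the point is asserting that an event on either side triggers the expected action on the other.

```ruby
# Hypothetical sketch: events on one service trigger actions on the
# other, asserted in both directions.
require "minitest/autorun"

class Bus
  def initialize
    @handlers = Hash.new { |h, k| h[k] = [] }
  end

  def subscribe(topic, &handler)
    @handlers[topic] << handler
  end

  def publish(topic, payload)
    @handlers[topic].each { |h| h.call(payload) }
  end
end

class OldService
  attr_reader :seen

  def initialize(bus)
    @seen = []
    @bus = bus
    bus.subscribe("new.updated") { |payload| @seen << payload }
  end

  def update(id)
    @bus.publish("old.updated", id)
  end
end

class NewService
  attr_reader :seen

  def initialize(bus)
    @seen = []
    @bus = bus
    bus.subscribe("old.updated") { |payload| @seen << payload }
  end

  def update(id)
    @bus.publish("new.updated", id)
  end
end

class CrossServiceTest < Minitest::Test
  def test_events_propagate_in_both_directions
    bus = Bus.new
    old_service = OldService.new(bus)
    new_service = NewService.new(bus)

    old_service.update(1) # old -> new
    new_service.update(2) # new -> old

    assert_equal [1], new_service.seen
    assert_equal [2], old_service.seen
  end
end
```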

With these tests I have confidence that these services interact correctly. And, that was the goal. There are other details that needed to be managed. Ensuring proper expectations in regard to the messaging system was important, as was knowing how the system would behave if one service was temporarily unable to process inbound messages. Each of those was about improving confidence. And, whenever possible, that confidence is backed up with automated tests.
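One way to back that last concern with a test: simulate a consumer that fails on its first attempts and gets the message redelivered, asserting nothing is lost. The retry loop here is invented for illustration; real brokers provide their own redelivery semantics, and the test's job is to pin down the expectation that a temporary outage does not drop messages.

```ruby
# Hypothetical sketch: a message survives a temporary consumer outage.
require "minitest/autorun"

class FlakyConsumer
  attr_reader :processed

  def initialize(fail_first: 0)
    @failures_left = fail_first
    @processed = []
  end

  # Raise for the first `fail_first` deliveries, simulating a
  # temporary inability to process inbound messages.
  def handle(message)
    if @failures_left > 0
      @failures_left -= 1
      raise "temporarily unavailable"
    end
    @processed << message
  end
end

# Deliver queued messages, retrying each up to max_attempts times.
def deliver(queue, consumer, max_attempts: 3)
  until queue.empty?
    message = queue.pop
    attempts = 0
    begin
      attempts += 1
      consumer.handle(message)
    rescue RuntimeError
      retry if attempts < max_attempts
      raise
    end
  end
end

class RedeliveryTest < Minitest::Test
  def test_message_survives_a_temporary_outage
    queue = Queue.new
    queue << "records.created:42"

    consumer = FlakyConsumer.new(fail_first: 2)
    deliver(queue, consumer)

    assert_equal ["records.created:42"], consumer.processed
  end
end
```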

The Unacceptable Alternative

Now, this may all sound like a lot of work, and it is. But it is absolutely essential. Code without tests is defective. And, in a microservice ecosystem this leads to the assertion that collaborating services without tests are defective. If two services have an explicit dependency on one another, but their actual interactions are never tested, then there is no reproducible proof that they work together.

Code without tests is defective, and so are collaborating services without tests.

The alternative is a lack of confidence that the system behaves as expected. The alternative is uncertainty. And, the reality is that hope is not a good plan.

Testing is an incredible practice for software development and will make the developer and their software better. Noel Rappin has a great talk available from RubyConf 2017 on High Cost Tests and High Value Tests. For those interested in more details on testing microservices, I recommend this talk from the 2017 O’Reilly Software Architecture Conference in London called Reality is Overrated: API simulation for microservice testing, part one and part two, which is available via Safari Books Online.
