The empirical value of unit tests
Although many experts go out of their way to emphasize the importance of unit tests, there are still plenty of projects around the world without a single one. In my professional career, I have repeatedly encountered developers who question the value of unit tests. Meanwhile, according to a 2014 computer science study, "a majority of the production failures (77%) can be reproduced by a unit test." 77%!
The same research finds that "74% of the failures are deterministic - they are guaranteed to manifest given the right input event sequences." No timing relationship is required. And "among the non-deterministic failures, 53% have timing constraints only on the input events." Taken together, this means most problems are relatively simple to reproduce by replaying the same sequence of operations, even if some cases also require control over timing.
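To see what such a reproduction can look like, here is a minimal sketch in Java with JUnit 5. The ToyStore class, its bug, and the test are invented for illustration and are not taken from the study; the point is only that replaying the reported operation sequence inside an ordinary unit test is enough to make a deterministic failure manifest.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;
import org.junit.jupiter.api.Test;

// A deliberately buggy in-memory store standing in for the real component.
class ToyStore {
    private final Map<String, String> data = new HashMap<>();

    void put(String key, String value) {
        data.put(key, value);
    }

    // Bug: deleting one key wipes out every entry.
    void delete(String key) {
        data.clear();
    }

    String get(String key) {
        return data.get(key);
    }
}

class ToyStoreTest {
    @Test
    void replayingTheReportedOperationSequenceReproducesTheFailure() {
        ToyStore store = new ToyStore();
        // The exact operation sequence from the failure report, replayed deterministically.
        store.put("a", "1");
        store.put("b", "2");
        store.delete("a");
        // Fails against the buggy implementation: "b" has been wiped out as well.
        assertEquals("2", store.get("b"));
    }
}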
What’s even more interesting, "almost all catastrophic failures (92%) are the result of incorrect handling of non-fatal errors explicitly signaled in software." Typically, the error handler catches an overly general exception and shuts the whole system down, silently ignores an explicitly signaled error, or merely contains the words "TODO" or "FIXME". These trivial faults alone account for 35% of catastrophic failures! What is a catastrophic failure, you may ask? It is a failure that prevents all or most users from accessing the system.
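To make these faulty patterns concrete, here is a minimal, self-contained Java sketch. The names (ErrorHandlingAntiPatterns, replicateBlock) are invented for illustration and do not come from the study; the snippet only mirrors the trivial handler patterns described above.

import java.io.IOException;

public class ErrorHandlingAntiPatterns {

    // Stand-in for a recoverable operation that signals its problems explicitly.
    static void replicateBlock(int blockId) throws IOException {
        throw new IOException("replica unreachable for block " + blockId);
    }

    public static void main(String[] args) {
        // The handler silently swallows the explicitly signaled error and still
        // carries an unfinished "TODO" marker (two of the patterns above).
        try {
            replicateBlock(42);
        } catch (IOException e) {
            // TODO: retry replication on another node
        }

        // An overly general catch that terminates the whole system in response
        // to a non-fatal, recoverable error (the remaining pattern).
        try {
            replicateBlock(43);
        } catch (Throwable t) {
            System.err.println("Unexpected error, aborting: " + t.getMessage());
            System.exit(1); // takes the entire node down
        }
    }
}

Each of these handlers compiles and can look plausible in a code review, which is precisely why a unit test that forces the error path is so valuable.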
The researchers examined 198 randomly sampled, user-reported failures in Cassandra, HBase, Hadoop Distributed File System (HDFS), Hadoop MapReduce, and Redis. They concentrated on widely used, production-grade, data-intensive distributed systems, since such systems form the foundation of many internet software applications. That said, the results might not extend to distributed systems that are not data-intensive, or to systems at an earlier stage of development. For 198 random samples, the Central Limit Theorem predicts a 6.9% margin of error at the 95% confidence level.
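For readers who want to check that figure: with the usual margin-of-error approximation for a sampled proportion, z * sqrt(p * (1 - p) / n), taking the worst case p = 0.5, n = 198, and z = 1.96 for the 95% confidence level gives 1.96 * sqrt(0.25 / 198) ≈ 0.0696, i.e. roughly the reported 6.9%.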