“Oh… a showstopper bug was uncovered. We can’t release it! Go home and come back tomorrow!” was a refrain I heard many times early on. I had test subjects ready to go for validation, and we were waiting on the software and acoustic verification. Testing would often run late into the night, only for us to be told yet again that we had a showstopper bug. These verification failures and validation delays were frustrating, not so much for their frequency as for our failure to learn from them.
What Master Yoda tells Luke Skywalker in “The Last Jedi” is true; as the old cliché goes, you learn more from a failure than you do from succeeding. Another line often heard in Silicon Valley is “fail and fail fast.” The reason for the latter is that once you have a failure, you know what doesn’t work, so you don’t spend any more time or resources on that path. The former is just as valuable, especially if you take the time to understand why something failed and learn from it. One example was the water bucket test I conducted to simulate people in a crowded environment for contact tracing. We learned that while we could determine who was in the area, the shadow effect of the body meant we could not reliably determine how close people were to each other unless the sensors had line of sight. While the test was technically a failure, we learned enough from it to know that what we wanted to do was not viable, before we spent more money or time on it.
Other failures can be self-inflicted, yet still valuable. An example is failing verification because of poorly written acceptance criteria. This came up in a software composition analysis case I ran into. The acceptance criteria said no vulnerabilities could be present. Vulnerabilities were present, but none were relevant, so the tester declared the test passed and the reviewer agreed. This was of course a failure in the system, because per the criteria as written, the findings should have been recorded as failures and the results reviewed, as the procedure required. The case borders on a systemic problem, since the acceptance criteria did not reflect the intended reality and were not updated ahead of the test.
Therefore, to keep it from happening again, I recommended updating the acceptance criteria to state that only relevant vulnerabilities counted, and documenting in the report why the irrelevant vulnerabilities were irrelevant. For example, if the scan flags a vulnerability that is only present on Windows, yet you only run in an Ubuntu Linux Docker container, that vulnerability is not relevant to you. Another example is the scan picking up a cross-site scripting vulnerability related to JavaScript/HTML (used for rendering web pages) when you only receive and output JSON (often used for communication between apps or devices in an ecosystem). In that case the vulnerability is not relevant, as you are not generating a web page that could be vulnerable to cross-site scripting.
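To make the idea concrete, here is a minimal sketch of that kind of relevance triage, assuming a hypothetical JSON scan report with fields like id, affected_platforms, and category. The field names, file name, and categories are illustrative, not taken from any particular scanner:

```python
# Hypothetical triage sketch: filter SCA scan findings down to the
# vulnerabilities that actually apply to how the product is deployed.
import json

DEPLOYED_PLATFORM = "linux"   # we ship an Ubuntu-based Docker image
PRODUCES_HTML = False         # the service emits JSON only, never web pages

def is_relevant(finding: dict) -> bool:
    """Return True if a finding applies to this deployment."""
    platforms = finding.get("affected_platforms", [])
    if platforms and DEPLOYED_PLATFORM not in platforms:
        return False  # e.g. a Windows-only vulnerability; we run on Linux
    if finding.get("category") == "cross-site-scripting" and not PRODUCES_HTML:
        return False  # no HTML output means no page to inject into
    return True

with open("scan_report.json") as f:
    findings = json.load(f)["findings"]

relevant = [v for v in findings if is_relevant(v)]
dismissed = [v for v in findings if not is_relevant(v)]

# Acceptance criterion: no *relevant* vulnerabilities, and every
# dismissed finding gets a documented rationale in the report.
for v in dismissed:
    print(f"{v['id']}: dismissed - not applicable to this deployment")

print("PASS" if not relevant else f"FAIL: {len(relevant)} relevant findings")
```

The point of structuring it this way is that the pass/fail decision and the rationale for each dismissal land in the record together, so a reviewer can see exactly why the criteria were met instead of having to guess.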
In this particular case, they learned from the failure of their procedure and tests, and they are now looking to apply that lesson.
So is failure an option for medical devices? Under limited circumstances, yes. The value of a failure on the bench or in trials is that you learn what doesn’t work, so you can course correct and try something else. A failure post market can also be acceptable, provided nobody gets seriously injured and you take the lessons learned from the failure and apply them. That’s part of the reason auditors and inspectors hit your Corrective and Preventive Action procedure as hard as they do. They want to know how you handle failures when they happen: do you learn from them? Those are critical questions for assuring the safety and efficacy of your product. So yes, failure is an option. Sometimes you can proceed after a review; sometimes you can’t. Failure is definitely an option, under specific circumstances, provided you learn from it!