The Biggest Embedded Software Issue Is …
Embedded software developers face many problems and challenges today. One of the biggest and least discussed issues that I have encountered is that developers are writing their software for success. Writing for success sounds great, except that what I mean is that developers are writing their software assuming that nothing will ever go wrong! What they are writing is functional prototype code that executes without issues in a controlled lab environment. Don’t believe me? Let’s look at a publicly available example that I recently encountered before discussing the failure mindset developers should be adopting.
Writing Software for Success
The example I have in mind, one that I see time and time again, is some code that I came across in an I2C driver generated by a toolchain that is supposed to be designed for safety-critical applications. The I2C driver is generated by the “safety” tool and is supposed to be production ready. The developer only needs to call the driver with the necessary address, read/write bit, and data, and everything is supposed to be okay. There is, of course, a problem. If a device being written to fails to ACK and instead NAKs, the code gets stuck in an infinite loop as shown below:
[snippet slug=safetyfirmware lang=c_cpp]
A device could NAK for several reasons such as:
- Invalid command received
- The device wasn’t ready
- Device error
- Improper address
- etc
But this supposedly safe driver doesn’t allow for a device to NAK unexpectedly! Instead, the driver spins in this while loop forever. If a slave device were to fail or the bus were to go down, the entire microcontroller application would hang because the driver would be waiting for a response that would never come!
(On a side note, what I think is even worse about this code is that it was identified as a potential issue, and someone approved that it was okay to ship like this!)
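To make the alternative concrete, here is a minimal sketch of a wait loop that cannot spin forever. It is not the generated driver's code or any vendor's API: the status codes, the flag values, the `i2c_poll_fn` hook, and the `fake_device_t` test double are all hypothetical names I made up for illustration. The idea is simply that the loop exits on an ACK, on a NAK, or after a bounded number of polls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; not taken from any vendor driver. */
typedef enum {
    I2C_OK = 0,
    I2C_ERR_PARAM,
    I2C_ERR_NAK,
    I2C_ERR_TIMEOUT
} i2c_status_t;

/* Hypothetical bus flags, standing in for a real status register. */
typedef enum {
    I2C_FLAG_BUSY = 0,
    I2C_FLAG_ACK,
    I2C_FLAG_NAK
} i2c_flag_t;

/* Hypothetical hardware hook: returns the current bus flag. */
typedef i2c_flag_t (*i2c_poll_fn)(void *ctx);

/* Poll for an ACK, but give up on a NAK or after max_polls attempts. */
i2c_status_t i2c_wait_for_ack(i2c_poll_fn poll, void *ctx, uint32_t max_polls)
{
    if (poll == NULL) {
        return I2C_ERR_PARAM;
    }

    for (uint32_t i = 0u; i < max_polls; i++) {
        switch (poll(ctx)) {
        case I2C_FLAG_ACK:
            return I2C_OK;
        case I2C_FLAG_NAK:
            return I2C_ERR_NAK;   /* the device answered, just not with an ACK */
        default:
            break;                /* still busy: keep polling */
        }
    }

    return I2C_ERR_TIMEOUT;       /* the response never came */
}

/* Test double (assumption): reports BUSY busy_polls times, then final_flag. */
typedef struct {
    uint32_t   busy_polls;
    i2c_flag_t final_flag;
} fake_device_t;

static i2c_flag_t fake_poll(void *ctx)
{
    fake_device_t *dev = (fake_device_t *)ctx;
    assert(dev != NULL);

    if (dev->busy_polls > 0u) {
        dev->busy_polls--;
        return I2C_FLAG_BUSY;
    }
    return dev->final_flag;
}
```

On real hardware the poll hook would read the peripheral's status register, and the poll count would more likely be a timer-based timeout. Either way, the caller gets a status back and can decide whether to retry, reset the bus, or report a fault instead of hanging.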
Writing Software for Failures
Writing software to handle failures requires developers to change the way that they think about software. Instead of being focused on “making it work”, developers need to adopt a “make it fail” or “what can fail” approach. In this mindset, a developer asks with every line of code, “What can go wrong?”. This can result in identifying issues such as:
- Potential infinite loops
- Hardware errors
- Communication protocol response issues
- Taking a wrong code branch
- etc
With a potential issue identified, the developer can then take action in the software to detect the issue and then handle it effectively.
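As a rough illustration of the “what can go wrong?” habit, the sketch below asks that question at each step of a read operation: the arguments might be bad, the bus might fail, and the low-level call might return something unexpected. Every name here (`xfer_status_t`, `xfer_fn`, `XFER_MAX_LEN`, `read_with_retry`, and the `flaky_bus_t` test double) is a hypothetical one chosen for this example, not part of any real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
typedef enum {
    XFER_OK = 0,
    XFER_ERR_PARAM,
    XFER_ERR_BUS,
    XFER_ERR_GAVE_UP
} xfer_status_t;

/* Hypothetical low-level transfer hook. */
typedef xfer_status_t (*xfer_fn)(uint8_t *buf, size_t len, void *ctx);

#define XFER_MAX_LEN 32u  /* assumed upper bound on a single transfer */

xfer_status_t read_with_retry(xfer_fn xfer, void *ctx,
                              uint8_t *buf, size_t len,
                              unsigned max_attempts)
{
    /* What can go wrong? Null pointers and out-of-range lengths. */
    if (xfer == NULL || buf == NULL || len == 0u || len > XFER_MAX_LEN) {
        return XFER_ERR_PARAM;
    }

    /* What can go wrong? The bus may fail, so the retries are bounded. */
    for (unsigned attempt = 0u; attempt < max_attempts; attempt++) {
        xfer_status_t status = xfer(buf, len, ctx);

        if (status == XFER_OK) {
            return XFER_OK;
        }
        if (status != XFER_ERR_BUS) {
            /* Unexpected result: report it rather than retry blindly. */
            return status;
        }
    }

    return XFER_ERR_GAVE_UP;
}

/* Test double (assumption): fails failures_left times, then succeeds. */
typedef struct {
    unsigned failures_left;
} flaky_bus_t;

static xfer_status_t flaky_xfer(uint8_t *buf, size_t len, void *ctx)
{
    flaky_bus_t *bus = (flaky_bus_t *)ctx;
    assert(bus != NULL);

    if (bus->failures_left > 0u) {
        bus->failures_left--;
        return XFER_ERR_BUS;
    }
    for (size_t i = 0u; i < len; i++) {
        buf[i] = 0xA5u;  /* placeholder payload */
    }
    return XFER_OK;
}
```

The point is not the specific checks but that every failure path ends in a status the caller can act on. How the application handles `XFER_ERR_GAVE_UP` (retry later, reset the peripheral, enter a safe state) depends on the product.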
Conclusions
Too many developers are writing software for success without any consideration given to what can go wrong. They are assuming that everything will work fine in the field just like it did on the lab bench. The result isn’t just lower-quality software, but software that could be more expensive to develop and late to market once you account for the rework needed to handle failures that are later discovered in the field.
It’s hard and tedious to check every function return value, and check every pointer passed into a function for null, and check every parameter for range/legality, etc, but we code at our own peril if we are not rigorous in doing these (and many others).
Coding standards and code reviews are a good way to enforce the issue…
I have to agree that this is a major problem. I ran into a similar problem recently, but slightly worse. Not only did the code get stuck in an infinite loop because of a similar ACK/NAK problem, but it compounded the problem by blindly writing the status into a six-byte buffer and incrementing the pointer to the buffer on each pass through the loop. The result was that it clobbered the RAM in addition to getting stuck.