Power Cycle Your Boeing 787 To Keep It Flying
Written by Mike James
Sunday, 05 April 2020
Boeing has so many bigger problems that this one could go unnoticed, but it is of special interest to us programmers. The FAA has issued an order that 787s have to be switched off and on every 51 days.
The directive doesn't give any real clue to what might be wrong, but 51 days is a little strange as numbers go. A quick sum reveals that there are 73,440 minutes in 51 complete days, which is suspiciously close to 65,536, the number of distinct values a 16-bit unsigned int can hold. Unfortunately my guess didn't work out, as a 16-bit minute counter rolls over in about 45.5 days, so a recommendation to reboot every 51 days wouldn't really help.
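The sum behind that first guess is easy to reproduce. The following is only a sketch of the arithmetic; the variable names are mine and nothing here comes from the directive:

```python
MINUTES_PER_DAY = 24 * 60

# Minutes in 51 complete days.
minutes_in_51_days = 51 * MINUTES_PER_DAY

# A 16-bit unsigned counter has 2**16 distinct values before it wraps to zero.
counter_states = 2 ** 16

# Days until a 16-bit counter that ticks once per minute rolls over.
rollover_days = counter_states / MINUTES_PER_DAY

print(minutes_in_51_days, counter_states, round(rollover_days, 1))
```

The rollover arrives several days before the 51-day limit, which is why this guess has to be discarded.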
After this lesson in how to work out the possible rollover, I resorted to the most sophisticated programming tool on the planet - a spreadsheet! Calculating the rollover for different counter sizes and units of time quickly revealed that a 42-bit counter running at 1MHz rolls over at 50.9 days. The 42-bit part is a little unusual, but there are 42-bit hardware counters in a number of chips, and it could also result from using part of a larger register.
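If you would rather not open a spreadsheet, the same search can be sketched in a few lines of Python. The list of tick rates and the half-day tolerance are my own assumptions, chosen for illustration; the real clock source in the 787 is not public as far as I know:

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical tick rates to try, in ticks per second.
TICK_RATES = {
    "1 per minute": 1 / 60,
    "1 Hz": 1,
    "10 Hz": 10,
    "100 Hz": 100,
    "1 kHz": 1_000,
    "1 MHz": 1_000_000,
}

def rollover_days(bits, ticks_per_second):
    """Days until an unsigned counter of the given width wraps to zero."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY

def find_candidates(target_days, tolerance_days=0.5):
    """Return (bits, rate label, days) for counters that wrap near target_days."""
    hits = []
    for bits in range(8, 65):
        for label, rate in TICK_RATES.items():
            days = rollover_days(bits, rate)
            if abs(days - target_days) <= tolerance_days:
                hits.append((bits, label, days))
    return hits

for bits, label, days in find_candidates(51):
    print(f"{bits}-bit counter at {label}: rolls over after {days:.1f} days")
```

With these particular rates and this tolerance, the 42-bit counter at 1MHz is the only combination that lands near 51 days. A different choice of rates could of course turn up other candidates.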
The directive doesn't go into much detail but does say:
"The FAA has received a report indicating that the stale-data monitoring function of CCS may be lost when continuously powered on for 51 days. This could lead to undetected or unannunciated loss of CDN message age validation, combined with a CDN switch failure. The CDN handles all the flight-critical data (including airspeed, altitude, attitude, and engine operation), and several potentially catastrophic failure scenarios can result from this situation. Potential consequences include:
It sounds as if the time stamp on the data rolls over and old data is displayed instead of new data.
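To see how a rollover could defeat a stale-data check, here is a purely hypothetical sketch. The 42-bit width, the 1MHz tick, the one-second age limit and both function names are assumptions made for illustration and say nothing about how the CCS is actually written:

```python
COUNTER_BITS = 42                      # assumed counter width
MASK = (1 << COUNTER_BITS) - 1         # largest counter value
MAX_AGE_TICKS = 1_000_000              # assumed limit: 1 second at 1MHz

def is_stale_naive(now, stamp):
    """Plain subtraction: goes wrong once `now` has wrapped past zero."""
    return now - stamp > MAX_AGE_TICKS

def is_stale_wrap_safe(now, stamp):
    """Subtract modulo 2**COUNTER_BITS so the age survives one rollover."""
    return ((now - stamp) & MASK) > MAX_AGE_TICKS

# A message is stamped just before the counter rolls over, then stops updating.
stamp = MASK - 10
# Five seconds later the counter has wrapped and holds a small value.
now = (stamp + 5_000_000) & MASK

print(is_stale_naive(now, stamp))      # age looks negative, data passes as fresh
print(is_stale_wrap_safe(now, stamp))  # age is 5,000,000 ticks, data is stale
```

In the naive version the computed age is negative after the wrap, so five-second-old data never trips the limit. The modular version is one common way to avoid this, as long as real ages stay well below the counter's full range.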
Of course, this is just a guess, but I wouldn't be surprised, as rollover is still the biggest cause of this sort of error and it is typical that a reboot solves the problem. Also, this isn't the first time this has happened in flight software. For a report of an earlier incident, see Reboot Your Dreamliner Every 248 Days To Avoid Integer Overflow.
What is more worrying is that this is surely among the most safety-critical software we create and it seems that we still can't avoid such mistakes.
Last Updated (Sunday, 05 April 2020)