Spooky Software Failures
Software is pretty amazing. Without software, a lot of problems could never be solved. Without software, we couldn’t do a lot of things on massive scales. But software has its problems. As an industry, we don’t have a great track record of preventing these problems.
One type of software we would all assume to be rock solid is banking software, but banking software has had some classic problems. The first I want to include is TSB Bank in the UK. After a major update, customers were able to see other customers’ accounts for about a week. Second is Wells Fargo. Many of Wells’ problems seem self-inflicted, but this one is a classic software bug. About 625 Wells Fargo customers were incorrectly blocked from making changes to their loans.
Spooky software problems aren’t limited to banking software. The really big names in software play too. One of the spookiest, and one that affected me personally, is the Azure Leap Day outage. Microsoft’s Azure Cloud went down on leap day, and I am sure that every programmer thinks the same thing: dates are hard. Amazon wouldn’t let Microsoft play alone with bad software problems. They had a nice outage caused by a maintenance boo boo.
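To make "dates are hard" a little more concrete, here is a minimal Python sketch of one common leap-day bug: computing "one year from today" by bumping the year and keeping the month and day. This is a generic illustration, not a reconstruction of the Azure incident. The function names are made up for this example, and falling back to February 28 is just one possible policy.

```python
from datetime import date


def naive_one_year_later(d: date) -> date:
    # Naive approach: bump the year and keep the month and day unchanged.
    # Raises ValueError when d is Feb 29, because next year has no Feb 29.
    return d.replace(year=d.year + 1)


def safe_one_year_later(d: date) -> date:
    # Hypothetical fix: try the naive approach first, then handle leap day.
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Assumption: falling back to Feb 28 is acceptable for the use case.
        return d.replace(year=d.year + 1, day=28)


leap_day = date(2012, 2, 29)
print(safe_one_year_later(leap_day))  # 2013-02-28
```

The naive version works 1,460 days out of every 1,461, which is exactly why this kind of bug tends to slip past testing.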
Companies aren’t the only ones that like to play with bad software. One of the most famous examples from the US government is the Air Force’s learning experience trying to build an ERP system. After spending $1 billion in learning, they decided that was enough learning and gave up. I guess $1 billion is better than $2 billion.
My favorite spooky software moment occurred in 2015 when many systems failed all at once. United Airlines grounded its fleet. Seattle’s 911 system went down. Trading was suspended on the New York Stock Exchange. The home page of the Wall Street Journal was down. All of this sounded like a coordinated attack, but it wasn’t. It was just bad software.
No software is perfect, and as we rely more and more on software, we will experience more scary software moments. We all need to take it upon ourselves to try to improve the software we are part of.