Cybersecurity vendor CrowdStrike caused a global IT outage a week ago that wasn’t related to hackers or malware; in other words, security. And it wasn’t a “cyber” anything—it was real-world chaos, choking the operations of hospitals, banks, airlines and even broadcasters. Tallying all the pain it caused—in some cases, we expect, closed businesses and lost jobs, and maybe worse—will take months, perhaps years.
Unfortunately, CFOs can’t take that long to lower the probability of this happening again, whether at organizations affected by the outage or those that were just rubberneckers to the massive IT break.
“Because the CrowdStrike failure shut down business operations, the results will land right on the CFO’s desk,” says Jon Winsett, CEO of NPI, an IT procurement solutions provider that had large enterprise clients affected.
For the businesses whose Windows machines went blue screen, it could mean revenue and brand integrity loss. Winsett says the incident was a stark reminder that even well-resourced, trusted software vendors can inject vulnerabilities into company tech stacks.
“Buckets” of Lost Trust
But this operational risk will be “tricky to mitigate going forward,” says Winsett: “Seemingly innocent automatic updates are happening on our devices throughout the stack from the operating system to the keyboard every day—halting those updates introduces other, equally perilous consequences.”
While CrowdStrike promised to improve the testing of software updates and stagger future update sends (“canary deployments,” they’re called), customers would be unwise to trust that will happen. “The confidence we built in drips over the years was lost in buckets within hours,” wrote Shawn Henry, CrowdStrike’s chief security officer, on LinkedIn.
In general, companies must reassess their IT systems’ potential points of failure and consider the need for greater redundancies and quality control across their networks. The particular measures companies can take to safeguard against another CrowdStrike-like episode include bolstering response and recovery plans, analyzing IT vendor concentration risk and revising agreements with software vendors.
Handle software updates with care. To err on the side of caution, NPI’s clients are focusing on moving to staged software updates. That requires “determining which technologies should be subject to a ‘walk, then run’ methodology where updates are not automatic but are rolled out to a specific cohort or two first as a precautionary test before full-scale distribution,” Winsett says. Data security, GRC and vendor risk consultant Craig Callé, CEO of Source Callé, says, “This is a bit tongue in cheek, but service-level agreements (SLAs) need to indicate that untested, kernel-level software updates should not be pushed globally, at one time, especially on a Friday.”
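For finance chiefs who want to see what a “walk, then run” rollout looks like in practice, the following is a minimal sketch, not any vendor’s actual mechanism. It assumes machines are grouped into cohorts and that the tech team supplies its own routines for applying an update, checking machine health and backing the update out; the names `staged_rollout`, `apply_update`, `is_healthy` and `roll_back` are hypothetical.

```python
from typing import Callable, Dict, List


def staged_rollout(
    cohorts: List[List[str]],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    roll_back: Callable[[str], None],
) -> Dict[str, object]:
    """Apply an update one cohort at a time; halt and back out on any failure.

    All four arguments are illustrative assumptions: real tooling would plug in
    its own deployment, health-check and rollback steps.
    """
    updated: List[str] = []
    for index, cohort in enumerate(cohorts):
        # "Walk": update only the machines in the current cohort.
        for machine in cohort:
            apply_update(machine)
            updated.append(machine)

        # Check the cohort before letting the update go any further.
        failed = [m for m in cohort if not is_healthy(m)]
        if failed:
            # Back out every machine touched so far, most recent first,
            # so the whole fleet returns to the last known good state.
            rolled_back = list(reversed(updated))
            for machine in rolled_back:
                roll_back(machine)
            return {
                "completed": False,
                "halted_at_cohort": index,
                "failed": failed,
                "rolled_back": rolled_back,
            }

    # "Run": every cohort passed its health check.
    return {
        "completed": True,
        "halted_at_cohort": None,
        "failed": [],
        "rolled_back": [],
    }
```

In this sketch a failure in the second cohort means later cohorts, such as headquarters machines, never receive the update at all. Rolling back the earlier, healthy cohorts as well is a conservative design choice; some teams may prefer to leave them updated.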
Refresh recovery plans. Refresh and rehearse response and recovery procedures for non-breach scenarios, Winsett says. Plans for restoring systems after an IT outage can include “back-out” procedures specifically for software updates that don’t go as planned, according to a July 19 Forrester Research advisory. The procedures return the system to a known, good state. “CFOs should demand particular focus on revenue-centric systems,” Winsett says.
Hold vendors’ feet to the fire. Software contracts can be used as a risk mitigation tool. CrowdStrike offers a warranty if a customer suffers a security breach. CFOs should consider asking for “business interruption indemnification clauses from any vendor in the event of a software update gone awry,” according to Forrester. “Maybe this teaches us that we need to have a greater focus on damages in software contracts,” Callé says, to reimburse the customer for lost business. “Money talks. When you hold the vendor accountable for real damages, they will be more cautious,” he says.
Re-examine third-party risks. Tech teams need to map a company’s third-party ecosystem to identify significant concentration risk among vendors, says Forrester, especially those vendors that support critical systems or processes. In addition, “incident response and business continuity are important parts of a third-party risk [management] program,” Callé says. “TPRM should be about more than just sending and receiving questionnaires.”
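The concentration mapping Forrester describes can start as a simple tally. The sketch below is one possible approach under stated assumptions: the company already has an inventory linking each system to the vendors it depends on, and the 50% threshold is an arbitrary illustration, not a recommended benchmark. The function name `vendor_concentration` and the sample system and vendor names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def vendor_concentration(
    system_vendors: Dict[str, Set[str]],
    critical_systems: Set[str],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Return vendors supporting more than `threshold` of the critical systems.

    The result maps each flagged vendor to the share of critical systems
    that depend on it. The threshold is an illustrative assumption.
    """
    if not critical_systems:
        return {}

    # Count how many critical systems rely on each vendor.
    counts: Dict[str, int] = defaultdict(int)
    for system in critical_systems:
        for vendor in system_vendors.get(system, set()):
            counts[vendor] += 1

    total = len(critical_systems)
    return {
        vendor: count / total
        for vendor, count in counts.items()
        if count / total > threshold
    }


# Hypothetical inventory: which vendors each system depends on.
inventory = {
    "payments": {"VendorA", "VendorB"},
    "booking": {"VendorA"},
    "payroll": {"VendorC"},
    "intranet": {"VendorB"},
}
flagged = vendor_concentration(inventory, {"payments", "booking", "payroll"})
```

Here `flagged` contains only VendorA, which supports two of the three critical systems. A real third-party risk program would weigh far more than a headcount of systems, including the revenue each system carries.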
Overspending Alert
Proceed with care on the above items, however: they have the potential to drive significant, unplanned IT spending, says Winsett. “Some vendors will smell blood in the water after this fiasco. CFOs would be wise to have a strategy to protect against overspending as they wade through the implications.”
A version of this story appeared in StrategicCFO360’s Finance & Accounting Technology Briefing.