Understanding black swan events: Why an anticipatory risk management approach is the antidote

On Friday 19 July 2024, the world encountered the biggest IT outage in history when major cybersecurity company CrowdStrike released a faulty update that caused an estimated 8 million computers worldwide to crash. The event had a wide-scale impact on airline ticketing systems, point-of-sale systems and any organisation that had the CrowdStrike Falcon solution installed across its IT environment. Reports by major insurance companies estimated that the crash caused more than $5 billion in direct damages to businesses, with the healthcare and banking sectors being the hardest hit, suffering over $1 billion in damages. 

13 Aug 2024 | Dispute Resolution Alert

At a glance

  • The recent CrowdStrike event, which affected many businesses, exposed a vulnerability in how software updates are deployed and underscored the need for effective risk management.
  • The key principle businesses have to incorporate into their risk strategies is that they must not merely react to disruptions on an ad hoc basis; they also need to anticipate them.
  • Practically, this anticipatory approach consists of four major processes: bespoke risk management and contingency plans, risk containment, having a disaster management team in place, and an effective aftermath management plan.

An internal investigation report released by CrowdStrike revealed that the crash was caused by an update rolled out to Falcon, its signature cybersecurity software. The update triggered a fatal error at kernel level, causing computers running Windows to display what is commonly known as the ‘blue screen of death’.

The CrowdStrike event is characterised as a “black swan event”. The term stems from a Latin expression which presumed black swans did not exist, until Dutch mariners discovered them in Australia. The term has since been used as a metaphor to describe unforeseen and highly consequential events. While the CrowdStrike event happened at an IT level, it raises as many risk management questions as it does IT questions.

Reimagining the way risk is perceived

Risk is inevitable in business operations. It arises both from external factors in a company’s environment (such as the CrowdStrike incident or the COVID-19 pandemic) and from internal factors (such as internal leaks or flawed executive decision-making).

The CrowdStrike event exposed a vulnerability in companies’ IT update procedures: IT systems were too dependent on automated updates that bypassed standard testing procedures. In simple terms, had the CrowdStrike software update been tested in isolation on a single machine before being rolled out automatically across all systems, the software bug would likely have been detected before it was applied across a particular enterprise’s IT environment. A potential reason that many companies elect not to employ this practice (apart from the obvious cost factor) is that they lose the benefit of having immediately up-to-date software installed across their IT environment. Fast-paced businesses that make high-level operational decisions constantly have to weigh the benefits of employing a certain practice against the risk that the decision may carry.

A common phrase that comes to mind is ‘high risk, high reward’, and while risk is inevitable at every level of business operations, the approach we advocate for is to reimagine the way risk is perceived by anticipating risk and not merely responding to it once disaster strikes.

By adopting this anticipatory approach, businesses are well placed to handle any black swans that fly their way. Practically, the anticipatory approach consists of four major processes that we break down below.

1. Bespoke risk management and contingency plans

Risk comes in the form of industry-specific risk and operation-specific risk, the latter being exposure that is particular to the business. A broad risk management strategy that does not cater for the intricacies of a company’s operations will miss material elements of exposure, and the business will not be able to adequately mitigate the full extent of a crisis.

2. Risk containment

Effective risk containment measures are key for helping to mitigate damages caused by an adverse event or disruption. Effective risk containment plans would have to be tailored per level and sector of operation within and across a business. We referenced the updating procedure of organisations in the CrowdStrike incident above. An effective risk containment plan in this respect would have been able to stop the update rolling out to other machines in an organisation as soon as it was detected as faulty.
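By way of illustration only, the containment idea described above can be sketched as a staged rollout that stops at the first sign of a fault. This is a minimal sketch under stated assumptions, not a description of how CrowdStrike or any particular vendor deploys updates: the function `staged_rollout`, the `apply_update` and `is_healthy` callbacks, and the machine names are all hypothetical placeholders that an organisation's IT team would replace with its own tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RolloutResult:
    updated: List[str] = field(default_factory=list)  # machines updated and still healthy
    failed: List[str] = field(default_factory=list)   # machines that failed the health check
    halted: bool = False                               # True if the rollout was stopped early


def staged_rollout(
    rings: List[List[str]],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> RolloutResult:
    """Apply an update ring by ring and stop at the first unhealthy machine.

    `rings` is an ordered list of machine groups (hypothetical names), with
    the first ring typically being a single isolated test machine.
    """
    result = RolloutResult()
    for ring in rings:
        for machine in ring:
            apply_update(machine)
            if is_healthy(machine):
                result.updated.append(machine)
            else:
                # Containment: record the failure and stop immediately,
                # so machines in later rings never receive the update.
                result.failed.append(machine)
                result.halted = True
                return result
    return result


if __name__ == "__main__":
    # Hypothetical environment: one test machine, then two wider rings.
    rings = [["test-01"], ["finance-01", "finance-02"], ["pos-01", "pos-02"]]
    outcome = staged_rollout(
        rings,
        apply_update=lambda machine: None,             # stand-in for the real installer
        is_healthy=lambda machine: machine != "test-01",  # simulate a faulty update
    )
    print(outcome)
```

In this simulated run the fault surfaces on the isolated test machine, so the finance and point-of-sale machines are never touched. The trade-off discussed earlier still applies: each additional ring delays the point at which the whole environment is up to date.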

3. Assembly of decision makers/disaster management team

Time is of the essence in the face of a black swan event. Since these events typically affect material operations of a business, it is necessary for senior and executive level decision makers to be assembled in order to act fast and make swift decisions to enable the business to navigate through the event. It is thus important, firstly, to identify who these decision makers would be and group them into a ‘disaster operations’ team and, secondly, to ensure that the disaster operations team would be the first to know about the adverse event. In this respect, we recommend setting up dedicated communication channels that would be used exclusively for purposes of assembling the team. We emphasise the importance of having a communication plan that would be able to notify the relevant decision makers, even in the event of total system shutdown. It is very important for a business to be risk ready, which means that a simulated exercise of, for example, a ransomware attack or other type of disaster, should be part of preparing the disaster operations team’s response to such an event, with each team member’s role and response being tested for efficacy and robustness.
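A communication plan that still works during a total system shutdown is, in essence, an ordered list of fallback channels. The sketch below is illustrative only and rests on assumptions: the function `notify_team`, the channel names and the `send` callbacks are hypothetical, and a real plan would plug in whatever out-of-band channels the business actually maintains.

```python
from typing import Callable, Dict, List, Tuple

# A channel is a (name, send) pair; send(member, message) returns True on delivery.
Channel = Tuple[str, Callable[[str, str], bool]]


def notify_team(members: List[str], channels: List[Channel], message: str) -> Dict[str, str]:
    """Try each channel in priority order for every team member.

    Returns a mapping of member -> name of the channel that reached them,
    or "UNREACHED" if every channel failed for that member.
    """
    reached: Dict[str, str] = {}
    for member in members:
        reached[member] = "UNREACHED"
        for name, send in channels:
            try:
                delivered = send(member, message)
            except Exception:
                delivered = False  # a channel that raises is treated as down
            if delivered:
                reached[member] = name
                break  # stop at the first channel that works
    return reached


if __name__ == "__main__":
    def corporate_email(member: str, message: str) -> bool:
        # Simulates the primary channel being down with the rest of the IT estate.
        raise ConnectionError("mail server unavailable")

    def personal_sms(member: str, message: str) -> bool:
        # Simulates an out-of-band channel that does not depend on company systems.
        return True

    status = notify_team(
        ["ceo", "cio", "general-counsel"],
        [("corporate-email", corporate_email), ("personal-sms", personal_sms)],
        "Disaster operations team: assemble on the emergency bridge.",
    )
    print(status)
```

The returned mapping doubles as a record of who could not be reached, which is the kind of gap a simulated exercise is meant to expose before a real event does.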

4. Effective aftermath management plan

The aftermath of a disruption is important to consider as the fallout can often be more severe than the actual disruption. In this regard, we recommend having an adequate PR response plan, which has policies on how relevant stakeholders and affected parties should be addressed. We also recommend having a list of regulatory authorities that would have to be addressed, based on the type of incident encountered. Having templates for correspondence would be particularly helpful and would lessen the workload by having the necessary notices already drafted. Understanding a business’ insurance arrangements and the role insurers would play is very important, particularly in a malware attack where ransom is demanded.

In summary, the key principle businesses have to incorporate into their risk strategies is that they must not merely react to disruptions on an ad hoc basis; they also need to anticipate them.

Copyright © 2024 Cliffe Dekker Hofmeyr. All rights reserved. For permission to reproduce an article or publication, please contact us at cliffedekkerhofmeyr@cdhlegal.com.