CrowdStrike crisis shows need for transformation of tech management
Major tech outages have recently impacted customers and operations around the world – from banks to airports. Stephen Johnson, business IT expert and Founder and CEO of quality engineering consultancy, Roq, explains what companies can do to prepare for future outages.
The global IT outage affecting 8.5 million Windows PCs highlighted the vulnerabilities in the computer systems of some of the world’s largest companies and organisations. The failure, triggered by an update of CrowdStrike cybersecurity software, disrupted services across sectors, including airlines, online transactions, cash machines, many high street banks and card payments for retailers like Morrisons. Even NHS GP and cancer-treatment appointments were cancelled.
The CrowdStrike crisis is not an isolated incident. Stephen Johnson, CEO and founder of quality engineering consultancy Roq, says it is now imperative for companies and organisations to invest significantly more resources and effort into ensuring robust, future-proof systems underpin everything they do.
“The issues with CrowdStrike have been heightened due to it being a cybersecurity product. However, it didn’t cause a security breach. The real challenge is that operating systems like Windows, security platforms, and sensitive data are akin to the skeletal frame of a human body or the core infrastructure of society. When compromised, it has a devastating effect.”
Founded in 2009, Roq is an outcomes-focused quality engineering consultancy. The firm works with some of the world’s largest organisations on their most important technology initiatives, and aims to help organisations to realise the benefits of high-functioning, high-quality software applications delivered at a pace that aligns with their business imperatives.
According to Johnson, Roq has seen over the years how “a lack of investment in core infrastructure” – including schools, roads, trains, and the NHS – has long-term impacts, with taxes increasing to address these issues. Prevention is always better than a cure, and certainly cheaper. Looking ahead, technology infrastructure has to be considered in the same way.
“Despite being a substantial part of organisational budgets, technology is similarly feeling the impact of long-term underinvestment,” Johnson notes. “Unfortunately, in a competitive commercial landscape, speed often takes precedence over quality. However, in critical infrastructure like CrowdStrike, neglecting quality can lead to significant repercussions. As AI becomes increasingly prevalent, the associated risks will only grow. You can tell when a piece of code doesn’t work, but how will you know when an algorithm is truly working as it should?
“We’ll probably never know what validation took place at CrowdStrike when someone decided to go live with this code. But the assumption is that something was missed, unplanned for, or a defect was ignored. Each of these possibilities indicates a significant oversight in Quality Assurance, which led to severe consequences for millions of people across the globe on Friday. Until the quality of technology is seen as a serious risk factor at board level, we’ll continue to encounter these issues.”
Like most significant issues, then, this could be an early symptom of what is to come. But what can organisations do to address those issues as quickly as possible? According to Johnson, there are “plenty of steps to take”.
Quick steps to improve quality engineering and system robustness include automated systems testing, which identifies weaknesses throughout the development cycle. Organisations can also stress test their systems to expose potential failure points that can be fixed quickly.
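For readers who want a concrete picture, the two techniques can be sketched in a few lines of Python. This is an illustrative sketch only, not a description of how CrowdStrike or Roq test software: the `parse_update` function, the `InvalidUpdate` error and the three-field payload format are all hypothetical, invented for the example. The automated checks confirm that a known-good payload is accepted and known-bad payloads are rejected cleanly; the stress test then feeds the same function thousands of random payloads to confirm it never fails in an unplanned way.

```python
import random


class InvalidUpdate(ValueError):
    """Raised when an update payload fails validation (hypothetical error type)."""


def parse_update(payload: bytes, expected_fields: int = 3) -> list[str]:
    """Hypothetical parser: decode a comma-separated payload, rejecting bad input."""
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise InvalidUpdate("payload is not valid UTF-8") from exc
    fields = [field.strip() for field in text.split(",")]
    if len(fields) != expected_fields or any(not field for field in fields):
        raise InvalidUpdate(f"expected {expected_fields} non-empty fields")
    return fields


def run_automated_checks() -> int:
    """Automated tests: good input is parsed, bad input is rejected cleanly."""
    assert parse_update(b"rule-1, block, high") == ["rule-1", "block", "high"]
    bad_cases = [b"", b"only,two", b"a,,c", b"\xff\xfe\x00"]
    for bad in bad_cases:
        try:
            parse_update(bad)
        except InvalidUpdate:
            continue  # rejected in the planned way
        raise AssertionError(f"bad payload was accepted: {bad!r}")
    return 1 + len(bad_cases)  # number of cases exercised


def stress_test(rounds: int = 10_000, seed: int = 0) -> dict[str, int]:
    """Stress test: random payloads must be accepted or rejected, never crash."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    counts = {"accepted": 0, "rejected": 0}
    for _ in range(rounds):
        size = rng.randrange(0, 32)
        payload = bytes(rng.randrange(256) for _ in range(size))
        try:
            parse_update(payload)
            counts["accepted"] += 1
        except InvalidUpdate:
            counts["rejected"] += 1
        # Any other exception propagates and fails the run: that is the
        # unplanned failure point the stress test exists to expose.
    return counts


if __name__ == "__main__":
    print("automated cases run:", run_automated_checks())
    print("stress test outcome:", stress_test())
```

Checks like these are cheap enough to run on every change, which is what allows weaknesses to surface throughout the development cycle rather than after release.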
But some things will take time and commitment. Johnson recommends not only prioritising long-term investment in technology, so that upgrades and maintenance are kept up continuously, but also overseeing cultural shifts. Valuing quality over speed will “reduce the likelihood of costly errors and outages”, while encouraging collaboration across an organisation can help future-proof a company’s technology systems holistically.
He concludes, “By addressing these areas, organisations can better protect themselves and their customers from the significant risks posed by technology failures. The CrowdStrike incident is a serious reminder of the importance of robust Quality Engineering in our increasingly technology-dependent world.”