Some lessons to learn from the CrowdStrike debacle
Posted by jpluimers on 2024/08/20
About a month after International CrowdStruck Day, here are just a few thoughts; more are likely to follow:
- How well does your infrastructure behave when none of your Windows machines can boot?
- How good is your out-of-band management?
- How well is your CMDB doing key management, for instance for BitLocker encryption?
- Is checkbox compliance more important than a single point of failure?
- Can you ensure all updates from your supply chain are staggered/staged/phased with a kill switch when things get out of hand?
- Are the worst case scenarios in your disaster recovery plans really the worst?
- Do you understand the human factor of large scale outages (both of the people that – often indirectly – triggered them – hello #HugOps – and the ones that cannot work because of them)?
- Do you value your people – especially the ones that pulled you out of this situation – enough, and did you rename your Human Resources department to something that is more friendly to your people?
- Do you realise this could have happened on any of the platforms you use, including Linux and macOS?
- If you were mentioned in the media for not recovering well, do you have any idea how much of a target you will be for adversaries?
- Did CrowdStrike finally publish a real postmortem instead of the half-hearted communications they issued, mostly after the weekend following the debacle?
- How does your organisation perform updates of critical files?
- Would other platforms be less or more risky? If so: why?
- Will eBPF solve most of this, or at least centralise the issues, and what consequences would that have?
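To make the staggered/staged/phased question above a bit more concrete, here is a minimal Python sketch of a ring-based rollout with a kill switch. It is an illustration under my own assumptions, not a description of how CrowdStrike or any other vendor ships updates: `Ring`, `staged_rollout`, the `deploy`, `checked_in` and `kill_switch` callbacks, and the 95% threshold are all hypothetical names and values.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Ring:
    """One rollout stage (hypothetical): a name plus the hosts in it."""
    name: str
    hosts: Sequence[str]


def staged_rollout(
    rings: Sequence[Ring],
    deploy: Callable[[str], None],
    checked_in: Callable[[str], bool],
    kill_switch: Callable[[], bool],
    min_healthy_ratio: float = 0.95,  # assumed threshold, tune to taste
) -> List[str]:
    """Deploy ring by ring; return the names of the rings that completed.

    The rollout stops before a ring when the kill switch is set, and stops
    after a ring when too few of its hosts check in again after the update.
    """
    completed: List[str] = []
    for ring in rings:
        if kill_switch():
            break  # an operator (or automation) pulled the plug
        for host in ring.hosts:
            deploy(host)
        healthy = sum(1 for host in ring.hosts if checked_in(host))
        ratio = healthy / len(ring.hosts) if ring.hosts else 1.0
        if ratio < min_healthy_ratio:
            break  # hosts went silent: do not touch the next ring
        completed.append(ring.name)
    return completed


# Hypothetical demo: two "early" hosts stop checking in after the update.
rings = [
    Ring("canary", ["c1", "c2"]),
    Ring("early", ["e1", "e2", "e3"]),
    Ring("broad", ["b1", "b2"]),
]
deployed: List[str] = []
silent = {"e1", "e2"}
done = staged_rollout(rings, deployed.append, lambda h: h not in silent, lambda: False)
print(done)      # ['canary']
print(deployed)  # ['c1', 'c2', 'e1', 'e2', 'e3']
```

In the demo the "broad" ring is never touched: the damage stays limited to the ring in which hosts stopped checking in. The same questions apply to every supplier in your chain that pushes updates on its own schedule.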
- [Wayback/Archive] Tony Arcieri 🌹🦀: “To fix the bungled CrowdStrike…” – mas.to
To fix the bungled CrowdStrike update, apparently you need to boot the system into safe mode and remove a file.
If the system was encrypted with Bitlocker, you need to enter the system’s Bitlocker recovery key.
Apparently, many people are discovering they didn’t have key management in place to store Bitlocker recovery keys, making it akin to a self-inflicted ransomware attack.
- [Wayback/Archive] So CrowdStrike is deployed as third party software into the critical path of mis… | Hacker News
So CrowdStrike is deployed as third party software into the critical path of mission critical systems and then left to update itself. It’s easy to blame CrowdStrike but that seems too easy on both the orgs that do this but also the upstream forces that compel them to do it.
My org which does mission critical healthcare just deployed ZScaler on every computer which is now in the critical path of every computer starting up and then in the critical path of every network connection the computer makes. The risk of ZScaler being a central point of failure is not considered. But – the risk of failing the compliance checkbox it satisfies is paramount.
All over the place I’m seeing checkbox compliance being prioritised above actual real risks from how the compliance is implemented. Orgs are doing this because they are more scared of failing an audit than they are of the consequences failure of the underlying systems the audits are supposed to be protecting. So we need to hold regulatory bodies accountable as well – when they frame regulation such that organisations are cornered into this they get to be part of the culpability here too.
- [Wayback/Archive] How One Bad CrowdStrike Update Crashed the World’s Computers | WIRED
- [Wayback/Archive] e. hashman 🇵🇸: “I would have thought at Crowds…” – Cloud Island
I would have thought at Crowdstrike’s scale that SURELY they use a slow-rollout/canary model for global updates. But the scale of this outage suggests otherwise. There’s no way the rollout should have continued with 100% of clients not checking in.
- [Wayback/Archive] Dave Anderson: “Okay so you know, bunch of shi…” – Hachyderm.io
Okay so you know, bunch of shitposting and all that, but a serious interlude:
Someone pushed the button to start this rollout. They are probably having a _really_ bad time right now.
If someone at Crowdstrike knows who that is, please go and check on them, give them a hug, tell them it’s not their fault, that it’s going to be okay. No matter what the company line is on blameless culture whatever, the lizard brain is in charge right now and needs reassurance.
- [Wayback/Archive] Gene Kim on X: “Love it! “LinkedIn: Sascha Bates added a skill: #HugOps” @sascha_d” (April 2013) was the oldest reference I could find about #HugOps, which [Wayback/Archive] Jordan Sissel popularised at PuppetConf 2013 ([Wayback/Archive] GitHub – chriseckhardt/puppetconf_2013: My Personal Notes from Puppet Conf 2013 Sessions; more on the term: [Wayback/Archive] #HugOps in Practice: Empathy Skills for DevOps | PagerDuty)
- [Wayback/Archive] John Regehr: “I often tell people that secur…” – Mastodon
I often tell people that security software is sort of uniquely horrible since it runs at maximum privilege level while being developed by B- or C- software shops. today isn’t a counterexample.
- [Wayback/Archive] JdeBP: “All of the Linux people being …” – Mastodon App UK
…
In another universe where Linux systems were instead deployed in the many businesses/public services/governments with CrowdStrike as the common anti-malware choice, today would have been Linux panic day.
…
- [Wayback/Archive] cynicalsecurity :cm_2:: “A #CrowdSstrike offensive summary (update): * we…” – BSD Network
A #CrowdSstrike offensive summary (update):
- we know Falcon updates are not verified prior to being enabled
- we know that they don’t do staged updates
- we know a lot of large customer names
- we know the DR plans (or lack thereof) of said large customers
- we know the systemic reactivity
Learned opinion: it does not look good.
For those involved with the darker side of cybersecurity this is a monstrously useful set of data points.
- [Wayback/Archive] Kris: “Guten Morgen. Es ist Montag, …” – chaos.social
Good morning.
It is Monday, 22 July 2024, day 4 after the Crowdstroke.
The company has published a statement with technical details, www.crowdstrike.com/blog/falcon-up…date-for-windows-hosts-technical-details/
That is mostly a nothingburger. It says what we already knew. Nothing of substance is said about the processes and the root causes.
As @masek says, infosec.exchange/@masek/112817…758224618946
Compare that with XZ. Or the damage with NotPetya.
The kill count is here, cyberplace.social/@GossiTheDog/112819549722486621
- [Wayback/Archive] Stefan Eissing: “How Apache ACME (mod_md) gets …” – chaos.social
How Apache ACME (mod_md) gets you a new certificate:
- all ACME communication is done as unprivileged user
- all certificates from the CA are parsed as unprivileged user before storing them
- activation, as privileged user, parses again before replacing production as a last fail safe.
Since 2017.
Typical over-engineering. What are the chances a CA sends you a borked file?
- [Wayback/Archive] BBC Micro Bot :mastodon:: “I ran @RLabibov’s program and got this …” – mastodon.me.uk
r_labibov @RLabibov
@bbcmicrobot
#bbcmicrobot
MO.2:F=1:Y=512:REP.:N=(N MOD4)+1:GC.0,N:MOVE0,Y:DR.1280,Y:Y=Y-F:F=F*1.2:U.Y<0
GC.0,5:F.I=-8.5TO8.5:MOVE640+I*75,512:DR.640+I*220,0:N.:MOVE0,512:DR.1280,512
C.6:P. TAB(3,8);"NO CROWDSTRIKE":P. TAB(2,10);"BSOD IF YOU STILL":P. TAB(5,12);"USE 6502";
V.19,N,0,0,0,0:N=(N MOD4)+1:V.19,N,5,0,0,0:I=INKEY(12):G.30
I ran @RLabibov’s program and got this.
Source: bbcmic.ro/?t=aidur #bbcbasic
[Wayback/Archive] mastodon.me.uk/system/media_attachments/files/112/820/219/138/047/256/original/737165f916efb266.mp4
[Wayback/Archive] BBC Microbot – Owlet Editor

[Wayback] archive.is/zLsz1/10a08a86d483c818d63392550b3c2170a242f533/scr.png
- eBPF
- [Wayback/Archive] Brendan Gregg: “New post: No More Blue Fridays…” – Aus.Social
New post: No More Blue Fridays: How eBPF is already being adopted to prevent kernel crashes www.brendangregg.com/blog/2024-07-22/no-more-blue-fridays.html
- [Wayback/Archive] What is eBPF? An Introduction and Deep Dive into the eBPF Technology
eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules.
--jeroen





