Could be worse. I was the only member of my entire team who didn't get stuck in a boot loop, meaning I had to do their work as well as my own... Can't even blame being on Linux as my work computer is Windows 11, I got 'lucky'; I just got a couple of BSODs and the system restarted just fine.
Critical surgery computers may also be running under Windows LTSC, so they might not get the CrowdStrike patch. Maybe...
Edit: So the issue is apparently caused by CrowdStrike. So unless the surgery computers also use CrowdStrike, they'd be fine. Unless, of course, they do use CrowdStrike on surgery computers...
I'd heard some hospitals were affected. They cancelled appointments and non-critical surgeries.
I'm guessing it was mostly their "behind the desk" computers that got affected, not the computers used to control the important stuff. The computers in patients' rooms may have been affected as well, but (at least in the US) those are usually just used to record information about medicine given and other details about the patient, nothing critical that can't be done manually.
Anecdotal, but my spouse was in surgery during the outage and it went fine, so I imagine they take precautions (like probably having a test machine for updates before they install anything on the real one, maybe)
There were no test rings for this one, and it wasn't a user-controlled update. It was pushed by CS in a way that couldn't be intercepted/tested/vetted by the customer unless your device either doesn't have CS installed or isn't on an external network... or I suppose you could block CS connections at the firewall. 🤷‍♂️
Depending on the machine, I'd guess those aren't running Windoofs at all. I would be surprised if there were devices in use during surgery that run on it.
Good News! Unless something has changed since I worked in healthcare IT, those systems are far too old to be impacted!
I'm half-joking. I don't know what that kind of equipment runs, but I would guess something embedded. The nuke-med stuff was mostly Linux, and various lab analyzers were also something embedded, though they interface with all sorts of things (which can very well be Windows). Pharmaceutical dispensers ran various Linux-like OSes (though I couldn't even tell you the names anymore). Some medical records stuff was also proprietary, but Windows was replacing most of it near the end of my time.
One place we serviced still ran their whole keycard system on a Windows 3.1 box. I don't doubt some modern systems are also running on Windows, which has interesting implications for getting into/out of places.
That said, a lot of that stuff doesn't touch the outside internet at all unless someone has done something horribly wrong. Medical records systems often do, though (including for billing and insurance stuff).
I was just watching this show called Connections, and the first episode was about a power blackout; it showed the lights going out during a birth.
Great show. It went on about what you do if the power stays off permanently, and how we aren't well prepared for that: you could restart civilization by killing some farmers and stealing their land, but none of their tools work without power either, and do you even know how to hitch an old-school plow to oxen?
Is there a good ELI5 on what CrowdStrike is, why it's so massively used, why it seems to be so heavily associated with Microsoft, and what the hell happened?
CrowdStrike makes an anti-virus program that everyone in the corporate world uses on their Windows machines. They released an update that made the program fail badly enough that Windows crashes. When it crashes like this, it restarts in case that fixes the issue, but here it doesn't, and computers get stuck in a loop of restarting.
Because anti-virus programs are there to prevent bad things from happening, you can't just automatically disable the program when it crashes. This means a lot of computers cannot start properly, which means you also cannot tell the computers to fix the problem remotely like you usually would.
The end result is a bunch of low-level techs spending their weekends going to each computer individually and swapping out the bad update file so the computer can boot. It's a massive failure on CrowdStrike's part, and a good reason you shouldn't outsource all your IT like people have been doing.
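For anyone curious what "swapping out the bad update file" actually meant: the widely reported fix was to boot into safe mode or the recovery environment and delete CrowdStrike's channel file 291 (the path and filename pattern below are the commonly reported ones, not official guidance). A minimal sketch of that step in Python, purely as illustration:

```python
from pathlib import Path

# Widely reported location of CrowdStrike's channel files on Windows.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")

def remove_bad_channel_file(dry_run: bool = True) -> None:
    """Find (and optionally delete) the impacted channel file 291."""
    if not DRIVER_DIR.exists():
        print("no CrowdStrike driver directory on this machine")
        return
    for f in DRIVER_DIR.glob("C-00000291*.sys"):
        print(f"impacted channel file: {f}")
        if not dry_run:
            f.unlink()  # removing it lets the machine boot normally

remove_bad_channel_file()  # dry run by default; pass dry_run=False to delete
```

Of course, the whole problem was that affected machines couldn't boot far enough to run anything, so in practice this had to be done by hand at the console, which is exactly why it ate everyone's weekend.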
It's also a strong indicator that companies are not doing enough to protect their own infrastructure. Production servers shouldn't run third-party software that auto-updates without going through a test environment. It's one thing to push emergency updates when there's a time-sensitive concern or vulnerability, but routine maintenance should go through testing before being promoted to prod.
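To make "going through a test environment" concrete, here's a toy staged rollout sketch in Python; the ring names and health check are invented for illustration, not how any real vendor does it:

```python
import time

# Hypothetical deployment rings, smallest blast radius first.
RINGS = ["internal-canary", "early-adopters", "whole-fleet"]

def deploy(update_id: str, ring: str) -> None:
    print(f"deploying {update_id} to {ring}")  # stand-in for a real push

def healthy(ring: str) -> bool:
    # Stand-in for real telemetry: are machines in this ring still booting?
    return True

def rollback(update_id: str, ring: str) -> None:
    print(f"rolling back {update_id} in {ring}")

def staged_rollout(update_id: str, soak_seconds: float = 1.0) -> None:
    """Promote an update ring by ring, halting at the first sign of trouble."""
    for ring in RINGS:
        deploy(update_id, ring)
        time.sleep(soak_seconds)  # real soak times are hours or days
        if not healthy(ring):
            rollback(update_id, ring)
            raise RuntimeError(f"{update_id} failed health checks in {ring}")

staged_rollout("channel-file-291")
```

The point isn't the code; it's that a canary ring of even a few hundred machines would have started boot-looping long before the update reached the whole fleet.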
Really, there's a sub-joke here: because no one ever bothers scanning their Mac for viruses (they think they're virus-proof), all the Macs have been quietly functioning as virus farms for quite some time.
CrowdStrike is a cybersecurity company that makes security software for Windows. It apparently operates at the kernel level, in the critical path of the OS, so if their software crashes, it takes Windows down with it.
This is very popular software. Many large entities use it, including Fortune 500 companies, transport authorities, hospitals, etc.
They pushed a bad update which caused their software to crash, which took Windows down with it on an extremely large number of machines worldwide.
They make security software for every major OS. My company has it running on our Macs and Linux servers as well. It just happened to only break Windows because that's what they released the update for.
if that's a "good" argument for you, then i've already heard that one, and it nearly never really fits. here is another argument for you that is just as generic as yours:
"maybe try eating poo, trillions of flies cannot be wrong, poo is VERY popular food, much more popular than any human food !!! (as in mass per day as well as in its number of consumers)"
well, maybe making them pay compensation to all(!) victims (not just their customers) for all losses, including lost time, would already solve that problem.
that would still leave open the decades-long unsolved problem of microsoft not being held liable for its buggy products (which is the whole reason these security-products-as-a-workaround-for-a-crappy-os companies exist).
why not hold companies liable in general for the damage they cause, so they CAN learn to be more cautious with what they do?
i mean not ONLY cs should be sued to hell; ALL of them should be sued until they are reasonably cautious about all the damage they can cause (and have already caused in the past).
It's not Microsoft's fault a third party company wrote a kernel module that crashes the OS.
Unlike the mobile world where apps are severely limited and sandboxed, the desktop is completely the opposite. Microsoft has tried many times to limit what programs can do, but encountered a lot of resistance and ultimately had to let it go.
From what I've read, it sounds like the update file that was causing the problems was entirely filled with zeros; the patched file was the same size but had data in it.
My entirely speculative theory is that the update file that they intended to deploy was okay (and possibly passed internal testing), but when it was being deployed to customers there was some error which caused the file to be written incorrectly (or somehow a blank dummy file was used). Meaning the original update could have been through testing but wasn't what actually ended up being deployed to customers.
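If that theory is right, even a dumb integrity check between "what passed testing" and "what we actually shipped" would have caught it. A minimal sketch, assuming the pipeline records the hash of the artifact that passed QA (the names and the usage below are hypothetical):

```python
import hashlib
from pathlib import Path

def artifact_is_sane(path: Path, expected_sha256: str) -> bool:
    """Reject artifacts that are all zeros or differ from what passed testing."""
    data = path.read_bytes()
    if not data or data == b"\x00" * len(data):
        return False  # an all-zeros file is never a valid update
    return hashlib.sha256(data).hexdigest() == expected_sha256

# Hypothetical last gate in a deploy pipeline:
# assert artifact_is_sane(staged_file, hash_recorded_at_qa_signoff)
```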
I also assume that it's very difficult for them to conduct UAT, given that a core part of their protection comes from being able to fix possible security issues before they are exploited. Extensive UAT prior to deploying updates would both slow down the speed with which they can fix possible issues (and therefore allow more time for malicious actors to exploit them) and provide time for malicious parties to update their attacks in response to the upcoming changes, which may become public knowledge when they are released for UAT.
There's also just an issue of scale; they apparently release several updates like this per day, so I'm not sure how UAT could even be conducted at that pace. Granted, I've only ever personally been involved with UAT for applications that had quarterly (major) updates, so there might be ways to get it done several times a day that I'm not aware of.
None of that is to take away from the fact that this was an enormous cock up, and that whatever processes they have in place are clearly not sufficient. I completely agree that whatever they do for testing these updates has failed in a monumental way. My work was relatively unaffected by this, but I imagine there are lots of angry customers who are rightly demanding answers for how exactly this happened, and how they intend to avoid something like this happening again.
Not saying Windows isn't trash, but considering what CrowdStrike's software is, they could have bricked Mac or Linux just as hard. The CrowdStrike agent has pretty broad access to modify and block execution of system files. Nuke a few of the wrong files, and any OS is going to grind to a halt.
That's... Not great. I didn't actually think about what all these wild AV systems could do, but that's incredibly broad access.
Maybe I'm just old, but it always strikes me as odd that you'd spend so much money on something with that much intrusive power, which on a good day slows your machines down and on a bad day does this.
I get that users are stupid. But maybe you shouldn't let users install anything. And maybe your machines shouldn't have access to things that can give them malware. Sometimes, you don't need everything connected to a network.
The impacted Channel File in this event is 291 and will have a filename that starts with “C-00000291-” and ends with a .sys extension. Although Channel Files end with the SYS extension, they are not kernel drivers.
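So identifying an affected host boils down to matching that naming convention. A quick sketch of the check (the sample filenames are invented just to show the pattern):

```python
import re

# Naming convention for channel file 291 described above.
CHANNEL_291 = re.compile(r"^C-00000291-.*\.sys$")

for name in ["C-00000291-00000000-00000032.sys",   # matches the bad channel file
             "C-00000290-00000000-00000031.sys"]:  # some other channel file
    verdict = "impacted" if CHANNEL_291.match(name) else "not this one"
    print(f"{name}: {verdict}")
```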
Except not make an OS so shitty and vulnerable that it needs millions of hours and billions of dollars pumped into keeping it from being hacked in a split second. But yes, nothing besides that one minor thing.
This is specifically caused by an update for CrowdStrike's Falcon antivirus software, which is designed for large organizations. This won't affect personal computers unless they've specifically chosen to install Falcon.
A brick (or bricked device) is a mobile device, game console, router, computer or other electronic device that is no longer functional due to corrupted firmware, a hardware problem, or other damage.[1] The term analogizes the device to a brick's modern technological usefulness.[2]
Edit: you may click the tiny down arrow if you think it can't. ;)