At first I was somewhat sceptical about these reports of critical instability with 13th/14th gen Intel desktop CPUs. Not saying there wasn't an issue, but the "tech news cycle" does have a tendency toward sensationalism and drama.
But I am starting to wonder if there is some sort of fundamental problem with the 13/14 gen at least for consumer CPUs (I can't imagine something like this would be tolerated in their Xeon server line).
I can tell you my 13700KF would crash 100% of the time on stock motherboard settings under a multicore stress test, at least pre-patch. I manually downclocked it and got it to behave. Post-patch it could meet the spec while staying stable, but the thermals were out of control, so I went back to my manually downclocked setup with stricter power limits.
The absolute kicker is that on synthetic benchmarks I lose barely 10% performance and shave off 20-30 degrees C and 80W of power consumption. In real usage the only reason it's noticeable is that my PC no longer sounds like a jet engine. The absurdity of it all.
It's crazy to me that this kind of behaviour wasn't found during testing. A big part of your market is gaming, you know well that famous YouTubers will stress test the shit out of your chips, and still these issues show up on two generations of chips.
At this point, this has to be a known issue at Intel, a critical one, but they still ship the chip.
I'm worried it's a lithography issue. The recent GamersNexus and Level1Techs coverage seems to point that way, at least to me. For example, they mention CPUs working fine for a while, then suddenly becoming increasingly unstable. There's so much that can go wrong with lithography that could cause that kind of behavior, and we know they've been having issues given the whole 14nm debacle.
It doesn't sound like the issue extends beyond a small number of part SKUs. It's probably more of a design flaw based on the data from Steve and Wendell. I'm waiting to see if Intel tries to gaslight everyone or actually does something honorable to resolve the issue.
It sounds like 12th Gen parts are fine, and nobody is talking about SKUs other than the top-end enthusiast parts. If someone finds a way to write a program that tests for the issue, it would be interesting to see how broad the problem is.
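For what it's worth, here is a rough sketch of what such a test program might look like. To be clear, this is not a known reproducer for the Intel problem: it assumes the instability would show up as either a crash or a wrong result under all-core load, which may not match the real failure mode. The names (`workload`, `find_mismatches`, `check_consistency`) are made up for illustration. It runs the same deterministic hashing job on every core several times and compares each result against a single-process reference.

```python
import functools
import hashlib
import multiprocessing as mp
import os


def workload(seed, rounds=20000):
    """Deterministic CPU-bound work: hash a seed-derived buffer repeatedly."""
    data = seed.to_bytes(8, "little")
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()


def find_mismatches(seeds, results, reference):
    """Return the sorted seeds whose result differs from the reference value."""
    return sorted({s for s, r in zip(seeds, results) if r != reference[s]})


def check_consistency(workers=None, repeats=4, rounds=20000):
    """Load every core with identical jobs and report any inconsistent seeds.

    Assumption: a marginal CPU would either crash here or return a digest
    that disagrees with the single-process reference computed first.
    """
    workers = workers or os.cpu_count() or 1
    base_seeds = list(range(workers))
    reference = {s: workload(s, rounds) for s in base_seeds}
    seeds = base_seeds * repeats  # each seed is recomputed `repeats` times
    with mp.Pool(workers) as pool:
        results = pool.map(functools.partial(workload, rounds=rounds), seeds)
    return find_mismatches(seeds, results, reference)


if __name__ == "__main__":
    bad = check_consistency()
    print("mismatched seeds:", bad if bad else "none")
```

A clean run proves very little, since real stress tests push the chip far harder and for far longer, but a mismatch or a crash on stock settings would at least be a data point.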
When you say a lithography issue, are you concerned that everything coming from that fab line is affected? Or did you just mean the physical design has a flaw that can't be fixed in microcode / BIOS?
It's not the cooling. According to GN, server SKUs running at stock also have massive failure rates. Check out their recent video. Here is a link :)
https://www.youtube.com/watch?v=oAE4NWoyMZk
Intel's tests are all the same. They require every power supply maker to follow exacting tests with their hardware. They write the tests, they determine the answers, and they make a test jig that scores how well you did. You are absolutely not allowed to do anything else. I know this for a fact.
This is the type of scenario that can be created. The test cases and all the board makers rely on an average spec. Intel tests to one spec. There are huge gaps in testing that Intel assumes are OK.
AMD does the same thing but lets you screw around more right now.