Mystery Intel bug halts shipments of some Sapphire Rapids Xeons
Chipzilla's not saying much other than 'commercial software' not affected
Intel's 4th generation Xeon Scalable processors arrived behind schedule when the silicon, codenamed Sapphire Rapids, debuted in January 2023. Now the x86 giant has paused shipments of some chips in that family due to a fault with the components.
In a statement to The Register, Intel said the issue specifically affects medium-core-count (MCC) SKUs, which range from eight to 32 cores, and could "interrupt system operation under certain conditions."
The sparse statement adds: "Out of an abundance of caution, we did temporarily pause some SPR MCC shipments while we gained confidence in the expected firmware mitigation and expect to release remaining shipments shortly."
The blunder was first reported by Tom's Hardware, which quoted Intel as saying the issue did not appear when the processors were "running commercially available software."
That suggests the gremlin could be non-disruptive for most customers, though Chipzilla's near-silence means the gravity of the situation is unclear. The mere fact that Intel has paused shipments of an already delayed chip family – a step not taken lightly – suggests the problem is non-trivial.
That said, Intel (and other chip designers) often issue errata notices for their processors, pointing out weird, non-standard, or very rare circumstances when code can trigger a silicon-level bug. These can typically be fixed at the firmware, operating system, or microcode level, or by using a later revision of the silicon. These notices don't usually involve a stop on shipments.
Whatever the nature of the issue, it doesn't appear to impact Intel's higher-core-count (XCC) parts or its high-bandwidth-memory-equipped (HBM) Xeon CPU Max family. That's good news for Argonne National Lab's Aurora supercomputer, which uses both Intel's 4th-gen Xeon Max processors and Ponte Vecchio GPUs, and recently completed installation of those parts.
Overall, Intel's 4th-gen Xeon Scalable processors appear not to be as "healthy" as Lisa Spelman, VP of Intel's Xeon business, claimed back in March.
News of a bug with the fourth-gen Xeon Scalable isn't surprising, considering just how many headaches Sapphire Rapids has caused Intel over the past few years.
In Intel's defense, the chip is packed full of new technologies, including support for DDR5, PCIe 5.0, CXL 1.1, a chiplet architecture that boosted core counts to 60, and a slew of dedicated accelerators for everything from data analytics and cryptography to machine learning.
The chips were slated for release in 2021 but faced repeated setbacks as Intel worked out the kinks in the design and had to overcome poor yields on its Intel 7 (10nm) process node. The chip eventually launched into a market in which its core count placed Intel behind AMD, Amazon, and Ampere.
But if Intel's recent benchmark release is to be believed, the chip can go toe to toe with AMD's latest Epycs in a straight core-to-core battle.
The delays to Sapphire Rapids mean that we won't have to wait much longer for its successor: Emerald Rapids is due out in Q4 2023.
The forthcoming chip is essentially a refresh of its predecessor and will be drop-in compatible with 4th-Gen Xeon boards. The processor is also expected to deliver higher core counts, better performance per watt, and presumably fewer bugs. ®