Bits and Brews - Article 1 draft
Bits and Brews

Hi, I'm xyz, a 4th-year Electrical Engineering student at xyz. I've always been passionate about computer architecture, especially the cycle of discovering a new design choice: first being astonished by its logical simplicity, then analyzing the trade-offs to see whether it truly fits an application. That cycle is what excites me, and it's the core of what I want to explore here at Bits and Brews. That's why, for this inaugural post, we'll follow the fascinating journey of Qualcomm's Oryon CPU: a processor that started with server ambitions and found immense success in the consumer world.

Computer architecture is all about applying common sense and analyzing trade-offs. Qualcomm recently launched the fastest mobile processor, the Snapdragon 8 Elite Gen 5, but did you know it consumes about 60% more power than Apple's processors? Every design choice involves trade-offs: gaining something often means sacrificing something else. With that in mind, I decided to start this blog to share the fascinating journey of the Oryon CPUs along with in-depth analysis of some of their architectural decisions. So grab a coffee, relax, and enjoy the read.

Oryon is Qualcomm's new custom CPU core. It didn't start at Qualcomm: it was created by NuVia, a startup founded by ex-Apple and other CPU veterans. Their original project ("Phoenix") was a server CPU built to compete with Intel Xeon, AMD EPYC, and Arm's Neoverse designs. Qualcomm bought NuVia in 2021, repurposed Phoenix for consumer hardware, and renamed it Oryon. Current Oryon cores implement the Arm v8.7-A ISA (Instruction Set Architecture). Since Phoenix was meant for servers, the first Oryon cores still carry many of those server-style design choices; Qualcomm will tweak them for client devices (phones, laptops, tablets) in future generations. Because Phoenix was built to go toe-to-toe with high-performance CPUs (Xeon/EPYC), Oryon cores are unusually wide, powerful, and scalable compared to typical mobile-first designs.
Mobile/consumer chips care about burst performance, thermal limits, and idle efficiency even more than servers do. First-gen Oryon (Snapdragon X) inherits a lot from Phoenix but lacks some consumer-specific refinements, so hopefully most of the required changes landed in the third-gen Oryon cores.

1st generation

Development of the first generation of Oryon started in 2021 under NuVia. This generation consists of the Snapdragon X-series chips, which are targeted at laptops.
Figure 2 - Oryon M & L clusters of Snapdragon 8 Elite Gen 5

Both the L and M core clusters benefit from L3 cache optimizations. While the L3 capacity remains at 12MB, its area in the L core cluster has actually been reduced by 0.7 sq mm, indicating a significant 13% increase in cache density. The same data now fits in a smaller physical space, which helps access speed and efficiency.

Deep dive into the microarchitectural insights:

Since Qualcomm has yet to release a full architectural overview of the Snapdragon 8 Elite Gen 5, our best reference point is the X series architecture overview presented by Williams. Although the 3rd-gen Oryon microarchitecture has likely advanced since then, the X series gives us enough detail to understand the kinds of design choices modern processors make. Using it as our guide, we can start breaking down design decisions in modern processors that most textbooks fail to cover.

The X series design is closer to AMD's Zen or Intel's older Core designs: homogeneous cores. It does not follow the typical "big vs little" split (like Arm big.LITTLE or Intel's P-cores + E-cores); all clusters use the same Oryon cores. The X series is designed so that only 2 cores (in different clusters) can hit the very top single-core turbo frequency, while the others are capped at a lower "all-core turbo" speed. Each cluster has its own PLL (phase-locked loop), which controls its clock speed, so clusters can run at different frequencies or even be completely powered off. In practice, when workloads are
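As an aside, the L3 density claim above can be sanity-checked with quick arithmetic. The 12MB capacity, 0.7 sq mm area saving, and 13% density gain come from the figures above; the implied original cluster L3 area is my own derivation, not an official Qualcomm number:

```python
# Back-of-the-envelope check of the L3 density claim (numbers from the text):
# capacity stays at 12 MB, area drops by 0.7 mm^2, density rises ~13%.
capacity_mb = 12.0
density_gain = 0.13        # +13% capacity per mm^2
area_saved_mm2 = 0.7

# At constant capacity, new_area = old_area / (1 + density_gain),
# so the fraction of area saved is 1 - 1/(1 + density_gain).
frac_saved = 1 - 1 / (1 + density_gain)
old_area = area_saved_mm2 / frac_saved   # implied original L3 area (derived)
new_area = old_area - area_saved_mm2

print(f"implied L3 area: {old_area:.2f} -> {new_area:.2f} mm^2")
print(f"density: {capacity_mb / old_area:.2f} -> {capacity_mb / new_area:.2f} MB/mm^2")
```

The numbers are self-consistent: a 13% density gain at fixed capacity implies the L3 shrank from roughly 6.1 to 5.4 sq mm, which matches the quoted 0.7 sq mm saving.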