Right, so as the debate over PlayStation 5 and Xbox Series X rages, the talk around teraflops seems to, for the most part, lack specificity.
What do these numbers mean, Mason?
Before we begin, I just want to remind you all that partial credit is available to students who can at least show me how they reached their conclusion. If you put the answer down without explanation and it's wrong ... I will award you no points and may God have mercy upon your soul.
TERA-FLOP/S:
TERA - Metric system unit prefix that indicates multiplication by 10^12 or 1,000,000,000,000 (1 Trillion with a capital T); Derived from Greek TERAS (τέρας), which can be taken to mean "monster"
I can't quite put my finger on where I've heard that before ... anyways, what'd you have for breakfast?
Cool, so Xbox Series X can do 12.1 trillion floating-point operations per second while the PlayStation 5 can do 10.3 trillion floating-point operations per second. So that's just a *checks notes* 16% difference.
Did I not ask you to show your work?
A teraflop is a function. Do you remember what a function is? Mathematically speaking, a function is a relationship or expression involving one or more variables. Much of our TF/s discussion skips any study of these variables. So let us look at the inputs that go into the output of the TF/s we love so dearly.
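TF/s = # of compute units * 64 shaders * 2 instructions per cycle (or clock) * clock speed/frequency in GHz / 1,000
(The division by 1,000 is only there because a clock in GHz gives you gigaflops, and we want teraflops.)
Let me write this out on the whiteboard:
XSX: 52 compute units * 64 shaders * 2 IPC * 1.825 GHz / 1,000 = ~12.1 TF/s
PS5: 36 compute units * 64 shaders * 2 IPC * 2.230 GHz / 1,000 = ~10.3 TF/s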
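For anyone who would rather check the whiteboard with a keyboard, here is a minimal Python sketch of the same formula. The function and constant names are my own labels, not anything from AMD, Microsoft, or Sony, and the inputs are just the figures quoted above.

```python
# Minimal sketch of the TF/s formula above. Names are illustrative only.
SHADERS_PER_CU = 64  # shaders per compute unit
OPS_PER_CLOCK = 2    # the "2 IPC" term from the formula


def teraflops(compute_units, clock_ghz):
    """Return theoretical peak TF/s for a GPU with the given CU count and clock."""
    # A clock in GHz yields gigaflops, so divide by 1,000 to get teraflops.
    gigaflops = compute_units * SHADERS_PER_CU * OPS_PER_CLOCK * clock_ghz
    return gigaflops / 1000


xsx = teraflops(52, 1.825)
ps5 = teraflops(36, 2.230)

print(f"XSX: {xsx:.2f} TF/s")
print(f"PS5: {ps5:.2f} TF/s")
```

Both results land on the ~12.1 and ~10.3 figures once you round to one decimal place. Keep in mind this is a theoretical peak, not a measurement of what either machine does in a real game.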
Ahh hell! Where did all these new numbers come from? You mean we can't just boil this whole discussion down to 1 simple-to-understand number?!
Why yes, that is an astute observation there, student!
You see, in order to calculate computing power, you need to include the workforce behind the computing. What's pushing these numbers? That would be the shaders (or shading units). Compute units (CU) are just clusters of 64 shaders each. These shaders are involved in rasterization. What's rasterization, you ask?
With rasterization, objects on the screen are created from a mesh of virtual triangles, or polygons, that create 3D models of objects. In this virtual mesh, the corners of each triangle, known as vertices, intersect with the vertices of other triangles of different sizes and shapes. A lot of information is associated with each vertex, including its position in space, as well as information about color, texture and its "normal," which is used to determine the way the surface of an object is facing.
Computers then convert the triangles of the 3D models into pixels, or dots, on a 2D screen. Each pixel can be assigned an initial color value from the data stored in the triangle vertices.*3
Back to the whiteboard:
XSX: 52 compute units * 64 shaders per compute unit = 3328 shaders
PS5: 36 compute units * 64 shaders per compute unit = 2304 shaders
A different picture begins to form now. We are looking at a 36% shading gulf between the two console GPUs. Think of this situation as two factories: Factory X has 3,300 workers, Factory P has 2,300 workers. Factory X is going to do more work. Obviously. We are in agreement there. It's obvious. Right? Just to reiterate, you cannot magically make up for 1,000 missing workers. These numbers cannot lie.
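Showing my work on the shader gulf, as a rough Python sketch. The variable names are made up for illustration, and the percentage uses the same percentage-difference method (gap divided by the average of the two values) that produced the 16% figure earlier.

```python
# Rough sketch: shader counts and the gap between them. Names are illustrative.
SHADERS_PER_CU = 64  # shaders per compute unit


def shader_count(compute_units):
    """Total shaders for a GPU with the given number of compute units."""
    return compute_units * SHADERS_PER_CU


xsx_shaders = shader_count(52)
ps5_shaders = shader_count(36)

# Percentage difference: absolute gap over the average of the two counts.
gap = abs(xsx_shaders - ps5_shaders)
average = (xsx_shaders + ps5_shaders) / 2
gap_percent = gap / average * 100

print(f"XSX shaders: {xsx_shaders}")
print(f"PS5 shaders: {ps5_shaders}")
print(f"Gap: {gap} shaders, about {gap_percent:.0f}%")
```

That is where the roughly 1,000 missing workers and the 36% figure come from.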
But wait, we are still missing some context. What about the IPC and how do we account for the clock speed? Let's continue with the factory analogy. We must factor in how quickly these workers ... work.
Assume that each worker at Factories X and P is capable of packing 2 boxes per shipment (instructions per clock). Factory P's workers pack those shipments faster (2.23 GHz clock speed) while Factory X's shipment processing speed is slower (1.825 GHz clock speed). A downside to Factory P's speed is that the workers move so quickly that there may be moments when productivity dips (variable frequency), whereas Factory X's workers keep a more standard pace (locked frequency).
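The factory analogy can be sketched in a few lines of Python. Everything here is illustrative: the function name is mine, and the 2.0 GHz "dip" is a made-up number to show the effect of a variable clock, not a figure from Sony or anyone else.

```python
# Sketch of the factory analogy. Names and the dipped clock are hypothetical.
def boxes_per_second(workers, boxes_per_shipment, shipments_per_second):
    """Factory output: workers * boxes per shipment * shipments per second."""
    return workers * boxes_per_shipment * shipments_per_second


# Factory X: more workers, slower locked pace. Factory P: fewer, faster workers.
factory_x = boxes_per_second(3328, 2, 1.825e9)
factory_p = boxes_per_second(2304, 2, 2.230e9)

# Pace Factory P's 2,304 workers would need to match Factory X's output.
matching_clock_ghz = factory_x / (2304 * 2) / 1e9

# Hypothetical dip to 2.0 GHz, purely to illustrate a variable frequency.
factory_p_dipped = boxes_per_second(2304, 2, 2.0e9)

print(f"Factory X: {factory_x / 1e12:.2f} trillion boxes/s")
print(f"Factory P: {factory_p / 1e12:.2f} trillion boxes/s")
print(f"Factory P would need ~{matching_clock_ghz:.2f} GHz to match Factory X")
print(f"Factory P at a 2.0 GHz dip: {factory_p_dipped / 1e12:.2f} trillion boxes/s")
```

The point of the exercise: speed narrows the gap left by the smaller workforce, but under this simple model it does not close it.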
Hold the phone, where was all of this information in the soundbites on YouTube or the hot takes on Twitter and Reddit and here?!
Well, you would have to read what the Xbox team published for their XSX specs, you would have to pay attention to Mark Cerny's lecture on PS5, and you would have to read Digital Foundry's excellent write-ups on both consoles as well. Let's face it. We have short attention spans aaaaaand that's a lot of research aaaaaand it's easier to go X is bigger than Y therefore X is better, full stop.
And mind you, this is to say nothing about how ROP count (render output units) or TMU count (texture mapping units) contribute to overall factory productivity. But it does help frame and contextualize our discussion more appropriately.
What Did We Learn?
I've been following the TF/s discussion since it rose to prominence in late 2012/early 2013 with the PS4 GPU leaks. Over the years, many technical experts and developers have talked, in passing and in detail, about the importance of the variables that make up the "teraflop." Shaders and frequency are important to a rendering pipeline, which is why they must be considered when talking about TF/s.
It seems clear to me that the Xbox and PlayStation groups had different priorities when designing their consoles.
Xbox Series X isn't going to approach things with a mere "brute force" approach. Perhaps some forget that consoles have never been about brute force; that is the PC GPU's domain, and NVIDIA will be demonstrating that with the sheer power of the RTX 3000 line eventually. The engineers behind the XSX have already outlined the basics of the optimizations and specializations that can be found in the upcoming console. Let's not forget the work MS and AMD have done together on the machine. XSX is as strong as it is because Microsoft seems poised to offer a lower-end SKU.
In the same breath, M. Cerny and the PlayStation 5 engineers clearly centered their vision for the console around an absurdly fast customized NVMe SSD and their impressive Tempest Audio engine. PlayStation 5 also has a number of specializations and optimizations that give it room to maneuver as well, as Cerny so eloquently described in his THE ROAD TO PS5 lecture yesterday.
I hope that I made the case that the TF/s discussion requires more nuance.
Since both consoles are using very similar parts from AMD and are direct competitors, technical comparisons are appropriate. From what we understand, it is obvious at a glance that XSX is stronger for traditional rasterization and ray-tracing, but as is always the case, we will see how these machines do things in practice.
NVMe storage is being hailed as a paradigm shift for game design. If it is, the team that bet heaviest on it could make significant gains. At the same time, ray-tracing is the HOLY GRAIL for lighting and as advancements are made, the team that has more resources in place for it could make heavy gains as well.
Some developers, both 1st and 3rd party, are going to get more from both AND either of these machines than others.
At the same time, holistically speaking, we're coming off of 8th generation consoles with weak CPUs and HDDs and upgrading to 9th generation consoles that BOTH have custom NVMe SSDs, strong Zen 2 8-core SMT-enabled CPUs, RDNA2 GPUs, and fast GDDR6 RAM.
That alone is worthy of community celebration, methinks. Thanks for reading!
Footnotes:
*1: The Scientist and Engineer's Guide to Digital Signal Processing: Chapter 28: Digital Signal Processors - Steven W. Smith, Ph.D.
*2: Fixed-Point vs. Floating-Point Digital Signal Processing
*3: What's the Difference Between Ray Tracing and Rasterization? - Brian Caulfield
Percentage Difference formula: The absolute value of the change in value, divided by the average of the 2 numbers, all multiplied by 100.
E.G. [ |12.1 - 10.3| / ((12.1 + 10.3) / 2) ] * 100 = ~16%
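The same footnote formula as a small Python helper, in case anyone wants to plug in their own numbers. The function name is just my label for it.

```python
# Percentage difference as described in the footnote. Name is illustrative.
def percentage_difference(a, b):
    """Absolute difference divided by the average of the two values, times 100."""
    return abs(a - b) / ((a + b) / 2) * 100


# TF/s figures quoted for the two consoles.
tf_gap = percentage_difference(12.1, 10.3)
print(f"TF/s difference: {tf_gap:.1f}%")
```

Note the parentheses around the average: divide by (12.1 + 10.3) / 2, not by (12.1 + 10.3) and then by 2 again, or the answer comes out four times too small.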