NX Gamer: PS5 Full Spec Analysis | A new generation is Born

ty_hot · Mar 25, 2020

M3rcy said:
Not at all. When running a game the clock only adjusts when power limits are reached (under high load) they won't lower on low load like a PC might, in fact they'll be at highest clocks in those situations. If devs want the system to not go HAM in a menu screen, they should cap the framerate. :)

alright, im going back and forth in "ok I understand it!" and "shitty it makes no sense" lol

Jonnax · Mar 25, 2020

ty_hot said:
alright, im going back and forth in "ok I understand it!" and "shitty it makes no sense" lol

Some games have an uncapped framerate on menus therefore it can reach something like 300 frames per second.

So the fans get loud. The solution that Cerney talked about was that the cool be designed to dissipate a fixed amount of heat and they will then vary the clock speeds so that the dissipated heat is what they expect the cooler can do reasonably.

Weather this work is a different question

Litigator · Mar 25, 2020

If the architecture was designed to run fast and at high clock speeds from the get go (which is more believable than the "they increased the clocks at the last minute to react to XSX" theory) I'd be really curious to see how that scales with a mid-gen PS5 Pro running the same clocks with a higher CU count. Could be quite beastly.

Hellshy · Mar 25, 2020

M3rcy said:
Not at all. When running a game the clock only adjusts when power limits are reached (under high load) they won't lower on low load like a PC might, in fact they'll be at highest clocks in those situations. If devs want the system to not go HAM in a menu screen, they should cap the framerate. :)

Cerny gave an example that proves it's not just about uncapped frame rates. I believe he mentioned horizons map menu having a low triangle count causing the ps4 to nconsume more power.

SpectR0nn · Mar 25, 2020

Regarding the whole map screen thing, please read Liabe Brave's latest post. They explain it perfectly. It doesn't have to do with uncapped framerate per say, rather with the amount of saturation of the GPU in terms of workload. Because a map screen is much simpler in terms of game code compared to live gameplay, it is much easier for an engine to saturate the GPU to 100% workload, while in gameplay it typically is more like 95% of saturation or even lower (depening on how highly optimized the engine is) because of inherent inefficiencies due to a high level of parallelism (there's more idling involved because interdependant instructions have to wait on eachother).

So the entire idea of this design is that you deterministicly downclock the GPU frequency during those scenarios with 100% saturation of simple to execute instructions (you dont need as much clockspeed for that), while you can easily and deterministicly upclock the GPU during live gameplay where the saturation is more like 95% with complex and highly parallelised instructions. You can do this because everything is based off a feedback loop around workloads, instead of thermals or power draw (since those have been set in stone beforehand). This also means that the downclocking should never impact ingame performance in unpredictable ways, since you as a developer have way more control over the right clocks for each workload, because it is deterministic.

Mubrik_ · Mar 25, 2020

Liabe Brave said:
Thanks, I stand corrected regarding quad data rate. Overall, I think my larger point stands: that your way of describing things is overprecise for most discussion contexts. A lot of people are confused by simple terms like "bandwidth", so when you start pulling out the niceties of "QDR" and "MT/s" and suggest complex formulas they lose the plot.

My version requires three inputs: bandwidth, RAM data rate, and the fact there are 8 bits per byte. That last fact is comparatively well known. RAM data rate is often written right on the chip, so pictures show it even if people don't know how to derive it. Bandwidth is usually a given number, but can be explained further using the very easy formula [number of chips at once] x 32.

I think sticking to that eliminates confusion. And not just for the less technically-minded--it stops things like your repeating the erro of a "1050 GHz" memory clock, and so forth.

Either way, the system can only move 256 bits at a time. If the CPU needs some of that bandwidth, then it can't be used by the GPU. Hence contention, and lowered GPU bandwidth. On PS4, it seems that somehow this process of CPU access lowered GPU bandwidth by more than just what the CPU was taking. This is out of my depth, so I'm not sure either why that is, or whether it might have been fixed for PS5.

Which is ultimately why it can't reach clockspeeds as high.

What you and many others miss is the scenarios Mr. Cerny referred to as being the usual absolute maximum power draw. We're used to thinking of power draw increasing as game complexity and visual ambition rise. This is true. But past a certain point, that very complexity starts to make games inefficient. Parallelization means work dependent on other unfinished work stalls, which is why you see framerates or IQ drop in demanding sections. Counter-intuitively, the very highest power usage is when complex games run simpler loads. The engine has been optimized to shove as much work through as possible, so when the work is simple--like a menu screen--there are no stalls. Every single transistor ends up in use, and power peaks.

It doesn't need to. Even at a lower clockspeed, there's more than enough compute available to render the menu. But in typical thermal design with fixed clocks, you don't have a choice. The temperature a 100% saturated menu causes at 1.9GHz might match what 95% saturated gameplay draws at 2.2GHz. (Illustrative only, not real data.) But this means a cooling system good enough to cover either situation can't support a 2.2GHz fixed clock. Because a 100% saturated GPU at 2.2GHz will produce too much heat, and you can't have the console shutting off to avoid overheating whenever a player brings up the menu.

Here's where Sony's solution comes in. Remember, it adjusts clock based on the activity of the chip, not temperature. So when the chip becomes 100% saturated by a menu screen or the like, clock goes down. Performance also reduces, but it doesn't matter because you're rendering very little. When the user goes back into gameplay, saturation reduces to 95% and the clock can rise back to 2.2GHz. This is how a cooling system that can't handle sustained 2.0GHz can reach 2.2GHz with Sony's variable approach. Note that all this same logic applies to the CPU as well: due to inherent inefficiencies it should be able to sustain at 3.5GHz in game situations, and only drop with simple workloads that won't be affected.

What about when games hit higher than 95% saturation, though? It appears that Sony's profiling tools suggest this is more likely on the GPU than CPU side, so AMD SmartShift then will allow reduction of CPU provisioning to allow 96% on GPU. There's still a limit, though. Mr. Cerny did acknowledge that at some point, there will come a game that hits 97 or 98% efficiency in gameplay, on both CPU and GPU at the same time. Here, the clock must be reduced, and that game will not have 10.3TF of performance.

But it should take some time (years?) for developers to get to grips with the platform such that they deftly avoid bottlenecks to this extent (such as fully using SMT). By that point, they'll also have come up with less intensive methods to accomplish some of the same shading, so the games will still look better than titles from early in the gen, even though the compute resources are slightly less. And the talk emphasized "slightly". Because of power curves, every percentage point you drop the clockspeed you lower power draw by multiple percent.

All this complexity could be eliminated by just putting in a cooling system capable of handling 100% saturation at 2.2GHz. But this must be huge and prohibitively expensive, because neither platform attempted it. Series X has a novel wind-tunnel design, a huge fan, and a heatsink that takes up a lot of the interior volume, and can't reach even 2GHz fixed. (Neither could Sony, by their own account.)

Good post

Hopefully some people read to understand first.

Pheonix · Mar 25, 2020

Liabe Brave said:
What you and many others miss is the scenarios Mr. Cerny referred to as being the usual absolute maximum power draw. We're used to thinking of power draw increasing as game complexity and visual ambition rise. This is true. But past a certain point, that very complexity starts to make games inefficient. Parallelization means work dependent on other unfinished work stalls, which is why you see framerates or IQ drop in demanding sections. Counter-intuitively, the very highest power usage is when complex games run simpler loads. The engine has been optimized to shove as much work through as possible, so when the work is simple--like a menu screen--there are no stalls. Every single transistor ends up in use, and power peaks.

It doesn't need to. Even at a lower clockspeed, there's more than enough compute available to render the menu. But in typical thermal design with fixed clocks, you don't have a choice. The temperature a 100% saturated menu causes at 1.9GHz might match what 95% saturated gameplay draws at 2.2GHz. (Illustrative only, not real data.) But this means a cooling system good enough to cover either situation can't support a 2.2GHz fixed clock. Because a 100% saturated GPU at 2.2GHz will produce too much heat, and you can't have the console shutting off to avoid overheating whenever a player brings up the menu.

Here's where Sony's solution comes in. Remember, it adjusts clock based on the activity of the chip, not temperature. So when the chip becomes 100% saturated by a menu screen or the like, clock goes down. Performance also reduces, but it doesn't matter because you're rendering very little. When the user goes back into gameplay, saturation reduces to 95% and the clock can rise back to 2.2GHz. This is how a cooling system that can't handle sustained 2.0GHz can reach 2.2GHz with Sony's variable approach. Note that all this same logic applies to the CPU as well: due to inherent inefficiencies it should be able to sustain at 3.5GHz in game situations, and only drop with simple workloads that won't be affected.

What about when games hit higher than 95% saturation, though? It appears that Sony's profiling tools suggest this is more likely on the GPU than CPU side, so AMD SmartShift then will allow reduction of CPU provisioning to allow 96% on GPU. There's still a limit, though. Mr. Cerny did acknowledge that at some point, there will come a game that hits 97 or 98% efficiency in gameplay, on both CPU and GPU at the same time. Here, the clock must be reduced, and that game will not have 10.3TF of performance.

But it should take some time (years?) for developers to get to grips with the platform such that they deftly avoid bottlenecks to this extent (such as fully using SMT). By that point, they'll also have come up with less intensive methods to accomplish some of the same shading, so the games will still look better than titles from early in the gen, even though the compute resources are slightly less. And the talk emphasized "slightly". Because of power curves, every percentage point you drop the clockspeed you lower power draw by multiple percent.

All this complexity could be eliminated by just putting in a cooling system capable of handling 100% saturation at 2.2GHz. But this must be huge and prohibitively expensive, because neither platform attempted it. Series X has a novel wind-tunnel design, a huge fan, and a heatsink that takes up a lot of the interior volume, and can't reach even 2GHz fixed. (Neither could Sony, by their own account.)

Damn man, that's a great post. And explains the "variable frequency" thing perfectly. unfortunately, it would get ignored by those championing this notion that sony did some last-minute panic clocking. Truly a disservice to the work they have put into thier technique.

Alexandros · Mar 25, 2020

nelsonroyale said:
Isn't it? It means you can re-direct that to the CPU with bigger relative gains maybe?

That's what AMD's Smartshift does, I don't see any relation with the chosen frequencies.

Deleted member 10847 · Mar 25, 2020

Pheonix said:
Damn man, that's a great post. And explains the "variable frequency" thing perfectly. unfortunately, it would get ignored by those championing this notion that sony did some last-minute panic clocking. Truly a disservice to the work they have put into thier technique.

Technique? Something that PC hardware does for years? In my personal opinion is still an afterthought.

Also the post you quoted has several flaws, like, it will not take years until the games can make the PS5 work 100% on GPU load, a bad optimized game can do that from day 1, and that will definitely happen a lot during the lunch period (and after, even today both XB and PS4 has a fair share of them).

The silicon is still AMD silicon like any other silicon it cannot sustain max boost during max power usage. See any GPU currently on the marked.

ThatNerdGUI · Mar 25, 2020

M3rcy said:
That they can do this without blowing their power budget and having to have massive power delivery/cooling and without making devs work significantly harder?

Both users and developers get more performance than they would if they didn't do this. And what's clever about it is exactly what I said. It's common for CPUs/GPUs to clock up under low load when there is power and thermal headroom available and clock down when power/thermals exceed a threshhold. What's different with the PS5 is to have your clock changes vary in a predictable and consistent way.

I don't know why people think this won't make devs work harder. This approach will make games harder to optimize. What happens when you have a complex scene that requires high CPU usage at the same time? Now you have to make changes or a decision in which to prioritize because you can't just run both CPU and GPU at the max rated clocks and I/O throughput can't fix this.

Lady Gaia · Mar 25, 2020

Alexandros said:
Doesn't it seem incredibly wasteful though to spend 10% of the system's total power budget, on a platform that is bound to be constrained by power limits, just to add a couple of percentage points to the GPU's already stratospheric clocks?

Was it wasteful on the PS4, the Xbox One X, or the Series X? Because that's how power scaling works on every piece of modern silicon at the clock speeds these things ship at. It was true of the PPC cores on the Xbox 360 and PS3, too.

Lady Gaia · Mar 25, 2020

ThatNerdGUI said:
I don't know why people think this won't make devs work harder.

Because there are already so many factors that make can code performance unintuitive, meaning that the only way to know for sure is to test it. You already run something, measure what the limiting factors are, tweak the design and try again to see if it's an improvement across a range of use cases. That's profiling in a nutshell, and the practice doesn't change just because there's one more variable. In fact, it probably disappears in the noise because we're looking at a couple of percent difference - whereas poor reference locality or another core thrashing the cache results in orders of magnitude difference in performance.

Alexandros · Mar 25, 2020

Lady Gaia said:
Was it wasteful on the PS4, the Xbox One X, or the Series X? Because that's how power scaling works on every piece of modern silicon at the clock speeds these things ship at. It was true of the PPC cores on the Xbox 360 and PS3, too.

Source please?

GhostTrick · Mar 25, 2020

Lady Gaia said:
Was it wasteful on the PS4, the Xbox One X, or the Series X? Because that's how power scaling works on every piece of modern silicon at the clock speeds these things ship at. It was true of the PPC cores on the Xbox 360 and PS3, too.

Not really. They were aiming for rather conservative clocks. Usually at the sweetspot between performance and power consumption.

SpectR0nn · Mar 25, 2020

cyenz said:
Technique? Something that PC hardware does for years? In my personal opinion is still an afterthought.

Also the post you quoted has several flaws, like, it will not take years until the games can make the PS5 work 100% on GPU load, a bad optimized game can do that from day 1, and that will definitely happen a lot during the lunch period (and after, even today both XB and PS4 has a fair share of them).

The silicon is still AMD silicon like any other silicon it cannot sustain max boost during max power usage. See any GPU currently on the marked.

This is not what PC hardware has done for years and several people have already explained this quite clearly in this very thread. For PC hardware, you do not work with a set power and thermal limit beforehand for the entire system. The boost is not constantly engaged, while the PS5 runs in constant boost essentially.

marecki already explained quite clearly how it works for PS5 and why it is such a different approach compared to anything we're used to:

You are give a set number of Gpu compute units, say 40 and you have now two options how build the system around them
Option 1 (traditional and what Xbox is using): unlock the power draw, try out different clocks until you narrow it down to a clock which 99.9% will not overheat your console even if the gpu is running an extremely power hungry workload. The frequency is then locked and the power fluctuates based on workload. You are still left with a variable power draw and hence variable thermals.

Option 2 (what psv is doing): lock your power to a maximum your cooling can handle, then aim and test for a maximum clock which for 95% of the time comes under your power and for remaining 5% workload scenarios design it in the way that clock can be lowered to accommodate that and the reduction in clock doesn't need to be significant.

Now the main difference is that in option 1 your clock speed needs to accommodate 99.9% if not 100% cases for how your workloads affect the power draw and thermals, as the power draw is basically unlocked your thermals are only predictable to a degree so you need to be quite conservative with your clocks.

In the option 2 power and thermals are capped and frequency is unlocked so the system utilisation becomes more deterministic and you can be much more aggressive with your clocks.

This is not an afterthought at all. It was very deliberately designed this way to achieve three things:
1. be able to reach higher clocks than you would normally be able to (in a sustained manner)
2. get tighter control over your cooling solution, so you can reduce the average noise production of the console
3. improve consistency across every console produced due to the deterministic nature

This implementation could never be replicated on PC, because the PC platform is inherently not a closed box design like a console and it has never been done on consoles before because well, someone has to be the first one to do it right?

Also, it seems you are conflating what they explained about GPU load. In the current paradigm, you would see a GPU and/or CPU reach 100% load especially in badly optimized games. This is completely seperate from the actual complexity and amount of workload that needs to be processed at that time. It could be many simple instructions (like the Horizon map screen where power draw maximizes) that could easily be processed with a lower clockspeed without any performance penalty, or it could be a lot of complex instructions that cannot be processed all at once because there's more idle time in between due to the parallel nature of the instructions. This is what you would see during actual gameplay and due to this, you never reach an actual 100% saturation, but more like 95% (in a badly optimized game, this percentage is possibly even much lower than that).

In summary, because the PS5 will determine it's clocks based on the actual workload (the APU is designed in such a way that there's a constant feedback loop from both CPU and GPU), it will constantly change it's clockspeeds based on that. Which is completely different from any other implementation before this (be it PC or console), where the throttling is based off thermal conditions and power draw. The thermal and power draw is not a factor in this for the PS5, because both limits have been determined beforehand, so only the actual workload matters.

So you are right in saying that the silicon cannot sustain max boost during max power usage, because in most cases it will not be doing that on the PS5. It will either sustain max boost during lower power usage, or go below max boost during max power usage and anything in between. Because clocks are purely based on live workloads, developers can determine and control this beforehand through their testing and optimization phases. Again, all because the max power is a known factor and the system dynamically throttles both CPU and GPU based on workloads you throw at it. It's a very different approach to what we are used to.

Lady Gaia and Liabe Brave can explain this better and with more detail than I ever could. It seems like they are devs themselves, so I would assume that they know what they are talking about. Anyway, that's what I caught about this design based on their posts and several others, as well as Cerny's presentation. It really is a completely different approach to what we're used to, so we should stop comparing this solution to any other approach as well.

ThatNerdGUI · Mar 25, 2020

Lady Gaia said:
Because there are already so many factors that make can code performance unintuitive, meaning that the only way to know for sure is to test it. You already run something, measure what the limiting factors are, tweak the design and try again to see if it's an improvement across a range of use cases. That's profiling in a nutshell, and the practice doesn't change just because there's one more variable. In fact, it probably disappears in the noise because we're looking at a couple of percent difference - whereas poor reference locality or another core thrashing the cache results in orders of magnitude difference in performance.

The thing is performance profiling tools will lock your HW frequency (at least the ones I've used before, not sure if consoles are any different - I assume not-). We don't know at which point the system is "balanced" and CPU won't run at 3.5GHz while the GPU is at 2.23GHz. Now you have multiple scenarios to deal with when both your CPU and GPU frequency are varying based on load and power budget. I agree it would need to be tested, but based on experience I'll say the "couple percentage" it's heavily downplayed.

Lady Gaia · Mar 25, 2020

GhostTrick said:
Not really. They were aiming for rather conservative clocks. Usually at the sweetspot between performance and power consumption.

The 3.2GHz cores were not conservative for their time unless you're comparing those devices with pretty pricy alternatives. The fact that power scales exponentially with clock is not some big secret and there is no magic sweet spot where this isn't true. At lower clock speeds it's possible to keep a constant voltage and merely ride a power of two curve, but as Intel has noted with current leakage and other effects alongside higher frequencies it's getting a lot closer to a power of three. Yes, you can find a "sweet spot" where you're closer to scaling with the square of the clock speed, but that still means that you get less than 5% performance increase for 10% additional power. Once it reaches cube scaling it's closer to 3% increase for each 10% in additional power.

This isn't new. We've been pushing clock speeds as high as we can get away with forever, because it keeps die sizes and therefore manufacturing costs down for the performance levels.

Calabi · Mar 25, 2020

ThatNerdGUI said:
I don't know why people think this won't make devs work harder. This approach will make games harder to optimize. What happens when you have a complex scene that requires high CPU usage at the same time? Now you have to make changes or a decision in which to prioritize because you can't just run both CPU and GPU at the max rated clocks and I/O throughput can't fix this.

Like I mentioned before that would be really difficult, making something that uses the maximum 8 cores 16 thread all of the time or even some of the time would be really difficult. Your always going to end up with gaps and moments of down time, and threads and cores not being utilised as much if at all. The only things that max all the cores would be benchmarks and software that is designed to max the cpu. Someone posted a picture of the profiler of Spiderman, you can see it has lots of gaps, and plenty of underutilised threads.

Ordinary triple AAA games will likely come no where near to fully utilising the CPU, probably the only things that might are sim games that are designed to chunk through data.

Lady Gaia · Mar 25, 2020

ThatNerdGUI said:
The thing is performance profiling tools will lock your HW frequency (at least the ones I've used before, not sure if consoles are any different - I assume not-). We don't know at which point the system is "balanced" and CPU won't run at 3.5GHz while the GPU is at 2.23GHz. Now you have multiple scenarios to deal with when both your CPU and GPU frequency are varying based on load and power budget. I agree it would need to be tested, but based on experience I'll say the "couple percentage" it's heavily downplayed.

People keep assuming that what Sony is doing is related to conventional boost clocks which are thermally sensitive. The reason profiling tools have historically offered the option to lock CPU clocks is to be able to get deterministic results (something I'm pretty familiar with given that I led a team that built some of the profiling tools millions of developers have used.) Sony's solution already guarantees deterministic results, so that's entirely unnecessary.

You already have multiple scenarios to deal with in terms of cache and memory contention. Again, this is nothing new. It's impossible to say whether the couple percentage cited is accurate or not because we have no concrete data to base a conclusion on. We have the lead architect's word for it, and until we have any data to suggest otherwise it's all we have to go on. When we know more about the cooling system in use that should help clarify things somewhat, but as I've said often here: it's the games that matter and that should tell us all we need to know about how the system works in practice. The fact that developers are expressing a great deal of enthusiasm is a clue, and the only one I've personally heard from with direct experience seems positively overjoyed with the design decisions.

mordecaii83 · Mar 25, 2020

Lady Gaia said:
People keep assuming that what Sony is doing is related to conventional boost clocks which are thermally sensitive. The reason profiling tools have historically offered the option to lock CPU clocks is to be able to get deterministic results (something I'm pretty familiar with given that I led a team that built some of the profiling tools millions of developers have used.) Sony's solution already guarantees deterministic results, so that's entirely unnecessary.

You already have multiple scenarios to deal with in terms of cache and memory contention. Again, this is nothing new. It's impossible to say whether the couple percentage cited is accurate or not because we have no concrete data to base a conclusion on. We have the lead architect's word for it, and until we have any data to suggest otherwise it's all we have to go on. When we know more about the cooling system in use that should help clarify things somewhat, but as I've said often here: it's the games that matter and that should tell us all we need to know about how the system works in practice. The fact that developers are expressing a great deal of enthusiasm is a clue, and the only one I've personally heard from with direct experience seems positively overjoyed with the design decisions.

Thanks for giving your insight, there's so much misinformation going around and it's nice to see input from someone with real knowledge. :)

Pheonix · Mar 25, 2020

cyenz said:
Technique? Something that PC hardware does for years? In my personal opinion is still an afterthought.

Also the post you quoted has several flaws, like, it will not take years until the games can make the PS5 work 100% on GPU load, a bad optimized game can do that from day 1, and that will definitely happen a lot during the lunch period (and after, even today both XB and PS4 has a fair share of them).

The silicon is still AMD silicon like any other silicon it cannot sustain max boost during max power usage. See any GPU currently on the marked.

Clearly, you don't know what you are talking about.

Especially after saying the bolded and enlarged part.

Deleted member 10847 · Mar 25, 2020

Pheonix said:
Clearly, you don't know what you are talking about.

Especially after saying the bolded and enlarged part.

Well, I'm out, if you think that having a chip that is controlled by power limits is something new it's OK.

Pheonix · Mar 25, 2020

cyenz said:
Well, I'm out, if you think that having a chip that is controlled by power limits is something new it's OK.

t this pint you will be doing us a favor.

Thre is nothing new in the tech space, everything has been done before income way or the other. However, what sony is doing is a first for consoles, same way MS RAM thingy is a first for consoles even though it was something Nvidia did years ago.

It doesn't change the fact that a good deal of work went into it, especially to get that chip closed as high as they have. But keep on thinking that this is all some 48hr reactionary gimmick cobbled together by sony at the last minute.

BradGrenz · Mar 25, 2020

icecold1983 said:
Clocks dropping during low demand areas like menu screens etc have nothing to do with a variable boost clock. Thats a power saving measure and has been in every GPU and CPU for ages now.

You are misunderstanding the context. It is frequently the "low demand" scenes that cause the GPU to overdraw power.

Hellshy said:
Cerny gave an example that proves it's not just about uncapped frame rates. I believe he mentioned horizons map menu having a low triangle count causing the ps4 to nconsume more power.

It's a similar point. The problem in these cases is that the scene is too simple, making one part of the GPU run way faster than it normally would and draw more and more power to do so. This is what causes PS4s to overheat, and why Furmark back in the day would kill a GPU. It is counterintuitive, but complex scenes you would normally describe as "demanding" stress the GPU more evenly, reducing the likelihood of power being overdrawn. That's exactly what you want. Scenes that don't need to be faster can cause the clock speed to throttle avoiding thermal events without impacting the user experience, while actually intense gameplay will stay at max clocks when the user actually will need the performance.

Alexandros said:
The amount of power at any given time doesn't make the chosen design any less wasteful. In your scenarios power consumption could drop by the hefty chunk of 10% by reducing the frequency a couple of percentage points. This means that the whole time the console uses its full frequencies it is using a disproportionately large amount of power in order to keep the clocks a couple of percentage points higher, for a tiny performance benefit.

That's not how power consumption works at all. Power use is not only a function of clock speed.

Issen · Mar 25, 2020

Given the PS5's unified pool of memory, all of it running at insane bandwidth, I wonder how that's going to affect PC versions of multiplats. Yes, the SSD is fast, but I can see pcie4 SSDs catching up if you've got the money.

Having a harder time picturing a DDR5 memory configuration that wouldn't bottleneck a game designed for the PS5's memory bandwidth (or even the Series X's slower CPU memory pool, still significantly more bandwidth than what I expect on PC system memory).

BreakAtmo · Mar 25, 2020

BradGrenz said:
It's a similar point. The problem in these cases is that the scene is too simple, making one part of the GPU run way faster than it normally would and draw more and more power to do so. This is what causes PS4s to overheat, and why Furmark back in the day would kill a GPU. It is counterintuitive, but complex scenes you would normally describe as "demanding" stress the GPU more evenly, reducing the likelihood of power being overdrawn. That's exactly what you want. Scenes that don't need to be faster can cause the clock speed to throttle avoiding thermal events without impacting the user experience, while actually intense gameplay will stay at max clocks when the user actually will need the performance.

It sounds kind of like if a Gatling Gun had a tendency to randomly stop rotating and keep the up its normal fire rate, but out of a single barrel.

BreakAtmo · Mar 25, 2020

Sorry, bad internet made a double post.

ShapeGSX · Mar 25, 2020

Rukumouru said:
Given the PS5's unified pool of memory, all of it running at insane bandwidth, I wonder how that's going to affect PC versions of multiplats. Yes, the SSD is fast, but I can see pcie4 SSDs catching up if you've got the money.

Having a harder time picturing a DDR5 memory configuration that wouldn't bottleneck a game designed for the PS5's memory bandwidth (or even the Series X's slower CPU memory pool, still significantly more bandwidth than what I expect on PC system memory).

PCs generally have more cache in their CPUs. It's not an issue.

Thera · Mar 25, 2020

Rukumouru said:
Given the PS5's unified pool of memory, all of it running at insane bandwidth, I wonder how that's going to affect PC versions of multiplats. Yes, the SSD is fast, but I can see pcie4 SSDs catching up if you've got the money.

Having a harder time picturing a DDR5 memory configuration that wouldn't bottleneck a game designed for the PS5's memory bandwidth (or even the Series X's slower CPU memory pool, still significantly more bandwidth than what I expect on PC system memory).

I haven't enough knowledge about it but that's already the case this gen and it didn't impacted PC at all.
Like other member said, CPU cache is very low on console.

Liabe Brave · Mar 25, 2020

cyenz said:
Also the post you quoted has several flaws, like, it will not take years until the games can make the PS5 work 100% on GPU load, a bad optimized game can do that from day 1, and that will definitely happen a lot during the lunch period (and after, even today both XB and PS4 has a fair share of them).

You've got this exactly backwards. A badly optimized game uses less of a GPU's compute power, not more. Think about what you apply the term to: games that don't look as good, or run as well, as similar games on the same machine. The machine has the same amount of TF in both cases. What changes is that the well-optimized game uses more of the potential power, by keeping all parts of the chip more occupied. A badly optimized games has unfixed bottlenecks that stop work from proceeding as fast as it "should." (Note that optimization isn't necessarily the reason two different games behave differently, I'm assuming for argument that it's the only variance between the examples.)

SpectR0nn said:
Lady Gaia and Liabe Brave can explain this better and with more detail than I ever could. It seems like they are devs themselves, so I would assume that they know what they are talking about.

I am not a dev, an engineer, or have a degree in computer science. I'm an interested (and hopefully at least half-educated) layperson. My statements are based on what I believe I know from reading more authoritative sources, and from listening to public statements by the platform holders. I'd like to think I have a facility with following the logic of their arguments, but you should take what I say as what it is, not as infallible pronouncement.

Lady Gaia has actual expertise. Definitely pay close attention and put great stock in those posts.

beta · Mar 26, 2020

Liabe Brave said:
Either way, the system can only move 256 bits at a time. If the CPU needs some of that bandwidth, then it can't be used by the GPU. Hence contention, and lowered GPU bandwidth. On PS4, it seems that somehow this process of CPU access lowered GPU bandwidth by more than just what the CPU was taking. This is out of my depth, so I'm not sure either why that is, or whether it might have been fixed for PS5.

My understanding is thats how it works on the XSX as well, if you access the faster or slower pool you lock access to the entire bus from any other device from accessing it for the period of the memory transaction. The difference being that the memory transactions will take the same amount of time on the PS5 regardless of where it is but the majority of the memory access by the CPU will likely take longer on the XSX.

Jekked · Mar 26, 2020

I thought that TFLOPS don't matter that much.

So, I still don't get the point of all the overclocking to nightmare level with variable frequencies. Like why?
Cerny says it's only dropping for a couple of percent and only sometimes. Ok.
If it's only a couple of percent, like 2%, why not lock the frequencies?
TFLOPS don't matter anymore and according to Sony it would only to go down roughly 2%... so lock it 2% below.
spend less on cooling and make it way easier for devs.
Why sacrifice so much just for 2% more?

I have the feeling that it actually goes down way more than "just 2%", so that the TFLOPS gain - on paper - is much more than just 2%.
And Sony really wanted to close the TFLOPs GAP. 9 TFLOPS vs 12 TFLOPS just sounds horrible from a marketing perspective
Single digit vs double digit, but 10 vs 12 sounds much better.
And it might have been done very late at this stage, that would explain why we haven't seen the console yet AND why sony talked so late about the ps5 at all. With the PS4, they already showed almost everything by then.
anyway, we will see how it turns out.

RivalGT · Mar 26, 2020

Litigator said:
If the architecture was designed to run fast and at high clock speeds from the get go (which is more believable than the "they increased the clocks at the last minute to react to XSX" theory) I'd be really curious to see how that scales with a mid-gen PS5 Pro running the same clocks with a higher CU count. Could be quite beastly.

With a 54 CU SoC at 2323mz your looking at a 16 TF GPU, at the same clock speed as base PS5 then it be 15.3 TF. I think their high clock speeds are possible because of their narrow design, I'm not sure it be possible with a bigger SoC, at least not on RDNA2, maybe RDNA3.

Alexandros · Mar 26, 2020

BradGrenz said:
That's not how power consumption works at all. Power use is not only a function of clock speed.

Not only, yes. But you do need power to push clocks higher and keep the system stable.

Fredrik · Mar 26, 2020

Jekked said:
So, I still don't get the point of all the overclocking to nightmare level with variable frequencies. Like why?
Cerny says it's only dropping for a couple of percent and only sometimes. Ok.
If it's only a couple of percent, like 2%, why not lock the frequencies?
TFLOPS don't matter anymore and according to Sony it would only to go down roughly 2%... so lock it 2% below.
spend less on cooling and make it way easier for devs.
Why sacrifice so much just for 2% more?

I'm not a dev but I don't see any problems with this as long as they can clock it higher without instability and cooling issues. Microsoft did it with the Xbox One S too.
But if it results in loud fan noise then they're obviously taking some wrong turns here.

STech · Mar 26, 2020

I don't know how people are so susceptible to fan noise.

I live in a warm country and I haven't had any problems with that noise, and the noise seems one of the main reasons PS5 has such a strange design (if not messed).

Muppet of a Man · Mar 26, 2020

Jekked said:
I thought that TFLOPS don't matter that much.

So, I still don't get the point of all the overclocking to nightmare level with variable frequencies. Like why?
Cerny says it's only dropping for a couple of percent and only sometimes. Ok.
If it's only a couple of percent, like 2%, why not lock the frequencies?
TFLOPS don't matter anymore and according to Sony it would only to go down roughly 2%... so lock it 2% below.
spend less on cooling and make it way easier for devs.
Why sacrifice so much just for 2% more?

I have the feeling that it actually goes down way more than "just 2%", so that the TFLOPS gain - on paper - is much more than just 2%.
And Sony really wanted to close the TFLOPs GAP. 9 TFLOPS vs 12 TFLOPS just sounds horrible from a marketing perspective
Single digit vs double digit, but 10 vs 12 sounds much better.
And it might have been done very late at this stage, that would explain why we haven't seen the console yet AND why sony talked so late about the ps5 at all. With the PS4, they already showed almost everything by then.
anyway, we will see how it turns out.

My theory is that the system 's architecture has been engineered in reverse order to maximize i/o throughput from the perspective of the memory subsystem's weakest link (the ssd) and eliminate i/o pipeline bottlenecks all the way to the CPU and GPU. At the downstream end of the i/o pipeline you have the GPU caches now with insane bandwidth, being clocked so high, and a more easily saturated GPU with fewer CUs to allocate to (less parallelization) but still with very high performance due to the high clocks once again.

endlessflood · Mar 26, 2020

STech said:
I don't know how people are so susceptible to fan noise.

I live in a warm country and I haven't had any problems with that noise, and the noise seems one of the main reasons PS5 has such a strange design (if not messed).

Fan noise on the PS4 Pro is ridiculous. When playing RDR2 I've often had to resort to headphones so that I could understand dialog without turning on subtitles.

GhostTrick · Mar 26, 2020

Rukumouru said:
Given the PS5's unified pool of memory, all of it running at insane bandwidth, I wonder how that's going to affect PC versions of multiplats. Yes, the SSD is fast, but I can see pcie4 SSDs catching up if you've got the money.

Having a harder time picturing a DDR5 memory configuration that wouldn't bottleneck a game designed for the PS5's memory bandwidth (or even the Series X's slower CPU memory pool, still significantly more bandwidth than what I expect on PC system memory).

That's litterally 8GB of GDDR5 all over again here.

How did PC games do the last generation with PS4's unified pool of memory ?

Deleted member 23046 · Mar 26, 2020

GhostTrick said:
That's litterally 8GB of GDDR5 all over again here.

How did PC games do the last generation with PS4's unified pool of memory ?

He just asked a question about game design taking account of consoles specificities, including the Series X, there is no need to show up your astrological table and brandish these like a crucifix.

The main difference with 2013 is that brute forcing any games built for consoles was already easy for 2013 PCs, am easiness that naturally translated into marketshare.

Fredrik · Mar 26, 2020

STech said:
I don't know how people are so susceptible to fan noise.

I live in a warm country and I haven't had any problems with that noise, and the noise seems one of the main reasons PS5 has such a strange design (if not messed).

I hate that noise, even playing on the very silent Xbox One X gets me occasionally annoyed after jumping in on Stadia and Geforce Now where things are truly 100% silent.

Alexandros · Mar 26, 2020

Manfred said:
He just asked a question about game design taking account of consoles specificities, including the Series X, there is no need to show up your astrological table and brandish these like a crucifix.

The main difference with 2013 is that brute forcing any games built for consoles was already easy for 2013 PCs, am easiness that naturally translated into marketshare.

Developers will not cut out a third of their target market by coding a game specifically to console standards.

Phil me in · Mar 26, 2020

Jekked said:
I thought that TFLOPS don't matter that much.

So, I still don't get the point of all the overclocking to nightmare level with variable frequencies. Like why?
Cerny says it's only dropping for a couple of percent and only sometimes. Ok.
If it's only a couple of percent, like 2%, why not lock the frequencies?
TFLOPS don't matter anymore and according to Sony it would only to go down roughly 2%... so lock it 2% below.
spend less on cooling and make it way easier for devs.
Why sacrifice so much just for 2% more?

I have the feeling that it actually goes down way more than "just 2%", so that the TFLOPS gain - on paper - is much more than just 2%.
And Sony really wanted to close the TFLOPs GAP. 9 TFLOPS vs 12 TFLOPS just sounds horrible from a marketing perspective
Single digit vs double digit, but 10 vs 12 sounds much better.
And it might have been done very late at this stage, that would explain why we haven't seen the console yet AND why sony talked so late about the ps5 at all. With the PS4, they already showed almost everything by then.
anyway, we will see how it turns out.

I see people are still spouting this nonsense. Yes they changed the clocks last minute and Cerny made up a whole section in his presentation about faster clocks etc just because they had a single digits TF and were scared of the beast. /s

they have been designing and testing the console for years, I doubt they can just throw in a last minute change like that just for marketing.

cerny has a good record and isn't there to lie or spout pr nonsense that's not his job, unlike others who do that.

Kemono · Mar 26, 2020

Alexandros said:
Developers will not cut out a third of their target market by coding a game specifically to console standards.

So outside of exclusive games nobody will use the extra power the x-series brings. And even exclusive games won't do that for a while thanks to their forced xbox one compability...

Can't wait to see the exclusives that use the full power. Hopefully Hellblade 2 will bring the heat.

Deleted member 23046 · Mar 26, 2020

Alexandros said:
Developers will not cut out a third of their target market by coding a game specifically to console standards.

That's a better answer provided from the commercial side, even if platforms sales balances and valuable gains made on it are all different for each publishers, studios and titles. So having consoles exclusives games from third-party isn't to exclude, it's already there but for others reasons than hardware.

Alexandros · Mar 26, 2020

Kemono said:
So outside of exclusive games nobody will use the extra power the x-series brings. And even exclusive games won't do that for a while thanks to their forced xbox one compability...

Can't wait to see the exclusives that use the full power. Hopefully Hellblade 2 will bring the heat.

I wouldn't say that. The extra power will be used, just not in a way that renders a game unplayable on different hardware.

theAllseeingEYE · Mar 26, 2020

Alexandros said:
I wouldn't say that. The extra power will be used, just not in a way that renders a game unplayable on different hardware.

Exactly,
MS studios are becoming incredibly adept at hardware scalability, they will push the games to the limits, while being able to scale them back for lower-spec machines.

It will be interesting to see if SSD starts becoming a requirement. It could still be worth thier investment to do extra engineering to support HDD though, even if those versions require loading screens.

Alexandros · Mar 26, 2020

theAllseeingEYE said:
Exactly,
MS studios are becoming incredibly adept at hardware scalability, they will push the games to the limits, while being able to scale them back for lower-spec machines.

It will be interesting to see if SSD starts becoming a requirement. It could still be worth thier investment to do extra engineering to support HDD though, even if those versions require loading screens.

It will eventually become a requirement, probably as soon as adoption rates of the new generation and SSDs on PC are at a point where it makes business sense.

supernormal · Mar 26, 2020

Liabe Brave said:
Thanks, I stand corrected regarding quad data rate. Overall, I think my larger point stands: that your way of describing things is overprecise for most discussion contexts. A lot of people are confused by simple terms like "bandwidth", so when you start pulling out the niceties of "QDR" and "MT/s" and suggest complex formulas they lose the plot.

My version requires three inputs: bandwidth, RAM data rate, and the fact there are 8 bits per byte. That last fact is comparatively well known. RAM data rate is often written right on the chip, so pictures show it even if people don't know how to derive it. Bandwidth is usually a given number, but can be explained further using the very easy formula [number of chips at once] x 32.

I think sticking to that eliminates confusion. And not just for the less technically-minded--it stops things like your repeating the erro of a "1050 GHz" memory clock, and so forth.

Either way, the system can only move 256 bits at a time. If the CPU needs some of that bandwidth, then it can't be used by the GPU. Hence contention, and lowered GPU bandwidth. On PS4, it seems that somehow this process of CPU access lowered GPU bandwidth by more than just what the CPU was taking. This is out of my depth, so I'm not sure either why that is, or whether it might have been fixed for PS5.

Which is ultimately why it can't reach clockspeeds as high.

What you and many others miss is the scenarios Mr. Cerny referred to as being the usual absolute maximum power draw. We're used to thinking of power draw increasing as game complexity and visual ambition rise. This is true. But past a certain point, that very complexity starts to make games inefficient. Parallelization means work dependent on other unfinished work stalls, which is why you see framerates or IQ drop in demanding sections. Counter-intuitively, the very highest power usage is when complex games run simpler loads. The engine has been optimized to shove as much work through as possible, so when the work is simple--like a menu screen--there are no stalls. Every single transistor ends up in use, and power peaks.

It doesn't need to. Even at a lower clockspeed, there's more than enough compute available to render the menu. But in typical thermal design with fixed clocks, you don't have a choice. The temperature a 100% saturated menu causes at 1.9GHz might match what 95% saturated gameplay draws at 2.2GHz. (Illustrative only, not real data.) But this means a cooling system good enough to cover either situation can't support a 2.2GHz fixed clock. Because a 100% saturated GPU at 2.2GHz will produce too much heat, and you can't have the console shutting off to avoid overheating whenever a player brings up the menu.

Here's where Sony's solution comes in. Remember, it adjusts clock based on the activity of the chip, not temperature. So when the chip becomes 100% saturated by a menu screen or the like, clock goes down. Performance also reduces, but it doesn't matter because you're rendering very little. When the user goes back into gameplay, saturation reduces to 95% and the clock can rise back to 2.2GHz. This is how a cooling system that can't handle sustained 2.0GHz can reach 2.2GHz with Sony's variable approach. Note that all this same logic applies to the CPU as well: due to inherent inefficiencies it should be able to sustain at 3.5GHz in game situations, and only drop with simple workloads that won't be affected.

What about when games hit higher than 95% saturation, though? It appears that Sony's profiling tools suggest this is more likely on the GPU than CPU side, so AMD SmartShift then will allow reduction of CPU provisioning to allow 96% on GPU. There's still a limit, though. Mr. Cerny did acknowledge that at some point, there will come a game that hits 97 or 98% efficiency in gameplay, on both CPU and GPU at the same time. Here, the clock must be reduced, and that game will not have 10.3TF of performance.

But it should take some time (years?) for developers to get to grips with the platform such that they deftly avoid bottlenecks to this extent (such as fully using SMT). By that point, they'll also have come up with less intensive methods to accomplish some of the same shading, so the games will still look better than titles from early in the gen, even though the compute resources are slightly less. And the talk emphasized "slightly". Because of power curves, every percentage point you drop the clockspeed you lower power draw by multiple percent.

All this complexity could be eliminated by just putting in a cooling system capable of handling 100% saturation at 2.2GHz. But this must be huge and prohibitively expensive, because neither platform attempted it. Series X has a novel wind-tunnel design, a huge fan, and a heatsink that takes up a lot of the interior volume, and can't reach even 2GHz fixed. (Neither could Sony, by their own account.)

I feel like I owe a tuition fee after reading this. Very informative.

Pheonix · Mar 26, 2020

Jekked said:
I thought that TFLOPS don't matter that much.

So, I still don't get the point of all the overclocking to nightmare level with variable frequencies. Like why?
Cerny says it's only dropping for a couple of percent and only sometimes. Ok.
If it's only a couple of percent, like 2%, why not lock the frequencies?
TFLOPS don't matter anymore and according to Sony it would only to go down roughly 2%... so lock it 2% below.
spend less on cooling and make it way easier for devs.
Why sacrifice so much just for 2% more?

I have the feeling that it actually goes down way more than "just 2%", so that the TFLOPS gain - on paper - is much more than just 2%.
And Sony really wanted to close the TFLOPs GAP. 9 TFLOPS vs 12 TFLOPS just sounds horrible from a marketing perspective
Single digit vs double digit, but 10 vs 12 sounds much better.
And it might have been done very late at this stage, that would explain why we haven't seen the console yet AND why sony talked so late about the ps5 at all. With the PS4, they already showed almost everything by then.
anyway, we will see how it turns out.

I guess this s the new narrative now.

At this point, there s no need to even address comments like these.

Kemono said:
So outside of exclusive games nobody will use the extra power the x-series brings. And even exclusive games won't do that for a while thanks to their forced xbox one compability...

Can't wait to see the exclusives that use the full power. Hopefully Hellblade 2 will bring the heat.

Its not that no one would use the extra power. It's about how that extra power would present and what it translates to.

Its a simple case of either higher native rez, or a higher framerate. Not both. And the logical thing for any dev to do is to look the framerate and dynamically scale the rez. What that translates to is that the PS5 would on average rn at scaled-down rez from the XSX about 15-20% of the time. While everything else remains exactly the same.

And while I and many others have repeated this, that means that The XSX could be at 2160p@(Y)fps (where Y is whatever fps lock the devs set out for) 100% of the time and the PS5would be at 2160p@(Y)fps around 80%-85% of the time. All effects and assets would be the exact same thing with the only difference being that when that part of the game comes along that is taxing on the hardware, the XSX would likely be able to hold its rez @2160p while the PS5 would scale its rez down to around 1800p/2052p to maintain the locked framerate.

That's what this nonsense is about. 2160p vs 1800p/2052p - 2160p.

What makes all this even crazier is that with those kinda rez targets/range,1800p - 2160p ; 95% of the people out there wouldn't be able to detect or tell apart. Its actually easier to see a difference between 1080p vs 900p (around a 40%+ rez difference), than it is to see one between 1440p vs 2160p. Even though the latter is over a 100% resolution difference. And that's simply because when pixel count is already that high diminishing returns kicks in and it starts getting really hard to tell resolutions apart. And yet such a fuss is being made about 1800p/2052p vs 2160p. Even crazier when you consider that those drops in rez on the PS5wouldonly really happen about 15-20% of the time.

Crazy times.

NX Gamer: PS5 Full Spec Analysis | A new generation is Born

User requested account closure

Prophet of Truth

Prophet of Truth

User requested account closure

Professionally Enhanced

Alt Account

Account closed at user request

▲ Legend ▲

Account closed at user request