
SharpX68K

Member
Nov 10, 2017
10,519
Chicagoland
I am wondering if we'll ever go back to discrete CPU and GPU in consoles again. PS2, PS3, Xbox, Xbox 360. Probably not in that sense, but perhaps 2.5D/3D chiplets or MCM. There has got to be a point in the future where it makes sense to break up the computing elements and not have a monolithic SoC / APU.
Doing this should help yields and also allow greater performance than what would be possible on a single chip.
 
Oct 26, 2017
6,151
United Kingdom
I am wondering if we'll ever go back to discrete CPU and GPU in consoles again. PS2, PS3, Xbox, Xbox 360. Probably not in that sense, but perhaps 2.5D/3D chiplets or MCM. There has got to be a point in the future where it makes sense to break up the computing elements and not have a monolithic SoC / APU.
Doing this should help yields and also allow greater performance than what would be possible on a single chip.

I do wonder whether the savings on chip yield for discrete dies will offset the additional advanced packaging costs.

I suspect that if we ever do go back to discrete chips, it may be for different reasons, e.g. having enough die edge real estate for I/O+memory controllers to facilitate the bandwidth requirements of the processors.
 

EBomb

Member
Oct 25, 2017
464
Take a 36 CU RDNA2 GPU, put it under full load (i.e., a high power consumption workload), graph power consumption vs. frequency, estimate non-GPU power consumption in the PS5 APU, determine the targeted power level of the PS5, determine the likely PS5 frequency for high power consumption workloads.

I continue to see comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%, but this doesn't track. We don't know how much power throttling high power consumption workloads will require.
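For anyone who wants to play with the shape of that exercise, here's a minimal Python sketch. Every number in it is a placeholder I made up (power figures, budget, curve steepness), not anything from Sony or AMD; the only anchor is the "a couple of percent of clock buys ~10% power" example Cerny gave.

```python
# Illustrative only: every number below is a placeholder, not a measured or leaked figure.
# The point is the shape of the exercise, not the answers.

F_MAX_GHZ = 2.23          # PS5's announced GPU frequency cap
POWER_EXPONENT = 5.2      # assumed steepness of the power-vs-frequency curve near the cap
                          # (chosen so a ~2% clock drop gives ~10% power, per Cerny's example)
GPU_POWER_AT_CAP_W = 130  # assumed GPU power at 2.23 GHz under a heavy workload
NON_GPU_POWER_W = 60      # assumed CPU + memory + I/O share of the budget
SOC_BUDGET_W = 180        # assumed total power level the cooling is designed around

def gpu_power(freq_ghz: float) -> float:
    """Toy model: GPU power scales as frequency^POWER_EXPONENT near the top of the curve."""
    return GPU_POWER_AT_CAP_W * (freq_ghz / F_MAX_GHZ) ** POWER_EXPONENT

def sustained_frequency(budget_w: float, non_gpu_w: float) -> float:
    """Highest frequency whose modeled GPU power fits in the leftover budget."""
    gpu_budget = budget_w - non_gpu_w
    # invert the power law: f = f_max * (P / P_cap)^(1/n)
    f = F_MAX_GHZ * (gpu_budget / GPU_POWER_AT_CAP_W) ** (1.0 / POWER_EXPONENT)
    return min(f, F_MAX_GHZ)

if __name__ == "__main__":
    f = sustained_frequency(SOC_BUDGET_W, NON_GPU_POWER_W)
    print(f"Modeled sustained GPU clock: {f:.2f} GHz "
          f"({100 * (1 - f / F_MAX_GHZ):.1f}% below the cap)")
```

Swap in real measurements from a 36 CU RDNA2 card and the estimate starts to mean something; with made-up inputs it only shows how the pieces fit together.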
 

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
Take a 36 CU RDNA2 GPU, put it under full load (i.e., a high power consumption workload), graph power consumption vs. frequency, estimate non-GPU power consumption in the PS5 APU, determine the targeted power level of the PS5, determine the likely PS5 frequency for high power consumption workloads.

I continue to see comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%, but this doesn't track. We don't know how much power throttling high power consumption workloads will require.

And you don't know... And no one says this. We just don't know. Another random user who doesn't know. Next one. Do you have a devkit? In 6 months we will have a first idea.
 
Aug 9, 2018
666
Take a 36 CU RDNA2 GPU, put it under full load (i.e., a high power consumption workload), graph power consumption vs. frequency, estimate non-GPU power consumption in the PS5 APU, determine the targeted power level of the PS5, determine the likely PS5 frequency for high power consumption workloads.

I continue to see comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%, but this doesn't track. We don't know how much power throttling high power consumption workloads will require.
Those comments were based on Cerny himself IIRC, so we might not know, but certainly Cerny might have an idea, right?
 

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
The topic is not interesting at all because the answer will arrive pretty soon and it will be fact. And the answer will not be maybe but yes or no. Does the console work as Mark Cerny described during the "Road to PS5" video? Yes or no.
 

EBomb

Member
Oct 25, 2017
464
Those comments were based on Cerny himself IIRC, so we might not know, but certainly Cerny might have an idea, right?

I have no idea what you are trying to say. I quoted Cerny in my post. Of course his comments are accurate, and of course he knows the answer to the question I posed, but he didn't answer this question in his talk. We don't know if high power consumption workloads require a 25% power drop (from the uncapped peak, I don't need a lecture on Sony's power solution), and we also don't know how much the power curve is reduced by say a 10% frequency drop, because we can't draw a linear relationship from the metric Cerny cited.
 

TheRaidenPT

Editor-in-Chief, Hyped Pixels
Verified
Jun 11, 2018
5,953
Lisbon, Portugal
The topic is not interesting at all because the answer will arrive pretty soon and it will be fact. And the answer will not be maybe but yes or no. Does the console work as Mark Cerny described during the "Road to PS5" video? Yes or no.

I assume everything announced there will end up exactly as presented. Unless they change their minds to hit a $399 price tag and cut something back, or go with 2 SKUs (storage).
 

VanWinkle

Member
Oct 25, 2017
16,096
I have no idea what you are trying to say. I quoted Cerny in my post. Of course his comments are accurate, and of course he knows the answer to the question I posed, but he didn't answer this question in his talk. We don't know if high power consumption workloads require a 25% power drop (from the uncapped peak, I don't need a lecture on Sony's power solution), and we also don't know how much the power curve is reduced by say a 10% frequency drop, because we can't draw a linear relationship from the metric Cerny cited.
That is true, but in addition to the two percent downclock resulting in a 10% power savings, he added something to the effect of, "so any downclocks would be very minor."
 

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
...Yes? Why is there a doubt, unless you consider Cerny to be a deceptive liar?

Again, I prefer to wait and see. Do I think Cerny is a liar? No, but the best thing is to wait and see. The debate is not interesting because I don't have a devkit. Any opinion without a devkit is just that, an opinion.

In 6 months we will have proof of how the PS5 and XSX perform in multiplatform games; before that, it is just faith.
 

RoboPlato

Member
Oct 25, 2017
6,811
Do we think that the console CPUs are likely the same variant as the new laptop Zen 2s with the faster Infinity Fabric to help compensate for the smaller cache?
 

Lady Gaia

Member
Oct 27, 2017
2,481
Seattle
A question on memory bandwidth.

Not quite sure if this is how it works.

Let's look at a system (PS5) with, say, 14GB of RAM available to devs and a peak bandwidth of 480GB/s. Now it's my understanding that that 14GB of RAM is kinda split into two parts. What I call an active pool and a passive pool. The passive pool basically is a cache for the active pool, containing all the data/assets used to render the frame that sits in the active pool. I am sure there is a better or more accurate depiction of this, but that's kinda what I call it.

There's nothing quite so formal about a divide between areas of memory that are being actively used for various purposes and those that aren't. A developer can make use of it however they want, but it's largely a uniform address space that contains information that can be accessed with varying frequency. If you were to chart how often different addresses are accessed you'd probably find more of an exaggerated logarithmic curve - some small portion that's accessed all the time and then a long tail of progressively less frequent use - rather than a bi-modal divide.

Now here's where my question on bandwidth comes in. Let's say the active pool takes up around 5GB of that 14GB of RAM. So basically, 9GB of RAM is used to feed the 5GB. And let's further say of that 5GB, 4GB is VRAM for all the GPU stuff and 1GB is CPU RAM.

If targeting a 60fps refresh rate, doesn't that mean you would need to keep that 1GB and 4GB of CPU and GPU RAM respectively, updated 60 times a second?

Nothing nearly that straightforward. I would discourage you from pursuing this way of thinking about bandwidth because it doesn't bear much resemblance to what actually determines bandwidth requirements. There are relatively few people who are actively engaged in development that understand the implications of their own coding decisions on bandwidth usage much less system-wide bandwidth usage, so unless you plan to really dig in deep, write a bunch of code, and profile how it runs on real-world hardware, it's probably not worth trying to worry about in much more depth than "more is better."

Sorry about what might come across as a dismissive response, but it's not a simple topic.
 
Last edited:
Aug 9, 2018
666
I have no idea what you are trying to say. I quoted Cerny in my post. Of course his comments are accurate, and of course he knows the answer to the question I posed, but he didn't answer this question in his talk. We don't know if high power consumption workloads require a 25% power drop (from the uncapped peak, I don't need a lecture on Sony's power solution), and we also don't know how much the power curve is reduced by say a 10% frequency drop, because we can't draw a linear relationship from the metric Cerny cited.
Well, you only said you keep seeing comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%. I don't think it was a stretch to say it seemed you were referring to comments in this thread and not quoting Cerny himself. Did you mean they were quoting Cerny on the 2% frequency drop and then inferring from that statement that the frequency wouldn't drop much beyond 2%?

The 2% drop in frequency he gave was for his estimated worst-case scenario. Could there be a much worse scenario? Maybe, but for now we have his estimate that any downclocking will be minor.
 
What we need is DF and some devs united for testing. They need to create specific power-hungry code and feed it to PS5 (and/or XSX) devkits and observe the results.

The question is what frequency will PS5 run on when XSX shuts down to prevent overheating.
 
Jun 23, 2019
6,446
...Yes? Why is there a doubt, unless you consider Cerny to be a deceptive liar?

This was my point about the "skepticism". You have your doubts? Fine. However, don't try to come in here and blow smoke up our asses about subjects we've gone over ad nauseam under the guise of "well we really don't know if it's true or not" and then try to gaslight people as if they're the crazy ones for spotting the potential for concern trolling. I'm just ready for the full system to be revealed so all the BS can stop once and for all.
 

DrKeo

Banned
Mar 3, 2019
2,600
Israel
A question on memory bandwidth. Lady Gaia DrKeo anexanhume

Not quite sure if this is how it works.

Let's look at a system (PS5) with, say, 14GB of RAM available to devs and a peak bandwidth of 480GB/s. Now it's my understanding that that 14GB of RAM is kinda split into two parts. What I call an active pool and a passive pool. The passive pool basically is a cache for the active pool, containing all the data/assets used to render the frame that sits in the active pool. I am sure there is a better or more accurate depiction of this, but that's kinda what I call it.

Anyways, the passive pool is where the data that comes from the SSD is sent, the active pool is where all the data pertinent to each individual rendered frame resides. And pretty much gets updated on a frame by frame basis.

Now here's where my question on bandwidth comes in. Let's say the active pool takes up around 5GB of that 14GB of RAM. So basically, 9GB of RAM is used to feed the 5GB. And let's further say of that 5GB, 4GB is VRAM for all the GPU stuff and 1GB is CPU RAM.

If targeting a 60fps refresh rate, doesn't that mean you would need to keep that 1GB and 4GB of CPU and GPU RAM respectively, updated 60 times a second? So basically, you would need 60GB/s worth of memory bandwidth for the CPU and 240GB/s worth of bandwidth for the GPU? And not that you need enough bandwidth to cover all 14GB of RAM, because most of that RAM is only used as a cache anyway. So it's rather how quickly you can get data from that cache to actually render the frame?
I wrote a long response and then realized that you are talking about memory bandwidth, not feeding the memory with new data (ie the new console SSDs). You shouldn't look at the data in the memory as two pools but as one huge pool full of "hopefully relevant" data. The amount of data in there is much bigger than needed for the frame in the hope that each and every piece of data the GPU and CPU needs will be somewhere in that pool. Because storage today is terrible, that pool has to be big so it will hold as much of that "just in case" data as possible. So hopefully close to all of the 14GB will be relevant to the current frame now that we have SSDs, not just 5GB.

MS actually added a component to the X1X that monitors RAM usage, so MS probably conducted the biggest RAM usage survey in history, considering there are millions of X1X consoles out there playing 100% of current-gen games (excluding exclusives to other systems) for billions of hours of gameplay. From analyzing this data, MS found that current-gen games don't use 50%-67% of the available memory:
From this, we found a game typically accessed at best only one-half to one-third of their allocated pages over long windows of time, so if a game never had to load pages that are ultimately never actually used, that means a 2-3x multiplier on the effective amount of physical memory, and a 2-3x multiplier on our effective IO performance.
This means that adding an SSD to the PS4 would have been as effective as giving the PS4 10GB-15GB of memory available to developers.

Regarding reading data from memory: CPUs don't access memory much, unlike GPUs, which access it all the time. GPUs stream from memory constantly, whereas a CPU that needs to hit memory constantly is a death sentence for the console's performance. I've never coded for a GPU myself, even my path tracer ran on a CPU (terribly 😊), so I guess I don't know the minutiae well enough, but very generally speaking both GPU and CPU have loads of cache in order to not access memory for every action they do. Even if all 14GB of data were relevant to the frame, it doesn't mean the CPU and GPU need the whole 14GB for the frame. For instance, the BVH structure for the coming frame has to be in memory and it will easily take over 1GB, but does that mean rays will traverse 100% of that structure? Does it mean the GPU will access 100% of the nodes in the structure? Not likely, it depends on the scene and the type of rays, but it is highly unlikely. So it's not like the PS5 will need 60 x 14GB per second (840GB/s) in order to utilize 14GB of memory; it doesn't really work like that. But memory bandwidth is probably going to be both consoles' biggest bottleneck, especially now with RT, denoising, and ML. All three require a lot of memory bandwidth and will be very prevalent this gen.
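To put rough numbers on why the naive "refresh all of RAM every frame" framing overshoots, here's a back-of-envelope sketch. The 14GB and 480GB/s figures are the ones from the question above; the per-frame working set and reuse factor are pure guesses for illustration.

```python
# Back-of-envelope comparison of two ways to think about bandwidth.
# The 14GB / 480GB/s figures come from the discussion above; the per-frame
# working-set size and reuse factor are pure guesses for illustration.

PEAK_BANDWIDTH_GBPS = 480.0   # assumed usable peak, as in the example above
RAM_FOR_GAMES_GB = 14.0
FPS = 60

# Naive view: every byte in RAM is touched exactly once per frame.
naive_need = RAM_FOR_GAMES_GB * FPS
print(f"Naive 'touch all of RAM each frame' demand: {naive_need:.0f} GB/s")

# Rougher but more realistic view: only a working set is touched per frame, and
# hot data (render targets, visible textures, BVH nodes actually traversed) is
# read or written several times, while most of RAM just sits there as a cache.
working_set_gb = 2.5          # guess: data actually touched for one frame
reuse_factor = 2.0            # guess: average number of times hot bytes are touched
per_frame_traffic = working_set_gb * reuse_factor
print(f"Working-set demand: {per_frame_traffic * FPS:.0f} GB/s "
      f"of {PEAK_BANDWIDTH_GBPS:.0f} GB/s peak")
```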
 
Last edited:

one

Member
Nov 30, 2017
272
Back to doubting Cerny again after questioning Sweeney, what a ride. Unless we see something more impressive than the UE5 demo on a system without variable frequency, it is no more than FUD in the most boring way.
 

EBomb

Member
Oct 25, 2017
464
Well, you only said you keep seeing comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%. I don't think it was a stretch to say it seemed you were referring to comments in this thread and not quoting Cerny himself. Did you mean they were quoting Cerny on the 2% frequency drop and then inferring from that statement that the frequency wouldn't drop much beyond 2%?

The 2% drop in frequency he gave was for his estimated worst-case scenario. Could there be a much worse scenario? Maybe, but for now we have his estimate that any downclocking will be minor.

Cerny didn't say that a 2% clock drop was a worst case scenario.
 

Deleted member 1589

User requested account closure
Banned
Oct 25, 2017
8,576
About Sweeney, all I know is I can't wait for the Unreal 5 tech demo post-mortem/papers. I guess they decided to delay the release since it was supposed to be published last week.

Still think it was run at a setting that is, frankly, impossible for gaming unless we have a petabyte SSD, though, for marketing and system-pushing reasons.
 

nib95

Contains No Misinformation on Philly Cheesesteaks
Banned
Oct 28, 2017
18,498
Cerny didn't say that a 2% clock drop was a worst case scenario.

The implication was that downclocks would be minor, as you get back a lot of power for such a small drop in frequency. Full quote.

"When that worst case arrives, it will run at a lower clockspeed, but not too much lower. To reduce power by 10%, it only takes a couple of percent reduction in frequency, so I'd expect any downclocking to be pretty minor"
 

EBomb

Member
Oct 25, 2017
464
The implication was that downclocks would be minor, as you get back a lot of power for such a small drop in frequency. Full quote.

"When that worst case arrives, it will run at a lower clockspeed, but not too much lower. To reduce power by 10%, it only takes a couple of percent reduction in frequency, so I'd expect any downclocking to be pretty minor"

Cerny didn't say that a 2% clock drop was a worst case scenario
 

disco_potato

Member
Nov 16, 2017
3,145
I continue to see comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%, but this doesn't track. We don't know how much power throttling high power consumption workloads will require.
Why not? If 2.23 is above the elbow, TDP reduction requires smaller clock reductions than lower in the curve. That was the argument against high clocks, wasn't it? And now Sony seems to be using that to limit noticeable performance drops. The clock increase above the elbow returns less performance, meaning it takes smaller clock adjustments to get the needed power draw reduction. Looking at your post from a few days ago, maybe this is the part you're not seeing.

As for high workloads, if the system has a power draw limit and both GPU and CPU peak clocks allow it to hit that limit, why would you intentionally exceed it, given that the tools you have tell you you're doing so?

Cerny didn't say that a 2% clock drop was a worst case scenario
No one is perfect but let's assume devs know what they're doing and that a theoretical 10% power draw "buffer" is plenty for them when they do make a mistake and QC doesn't catch it.
 

DrKeo

Banned
Mar 3, 2019
2,600
Israel
About Sweeney, all I know is I can't wait for the Unreal 5 tech demo post-mortem/papers. I guess they decided to delay the release since it was supposed to be published last week.

Still think it was run at a setting that is, frankly, impossible for gaming unless we have a petabyte SSD, though, for marketing and system-pushing reasons.
What I want to know is how much space a game using raw assets like that takes.

The implication was that downclocks would be minor, as you get back a lot of power for such a small drop in frequency. Full quote.

"When that worst case arrives, it will run at a lower clockspeed, but not too much lower. To reduce power by 10%, it only takes a couple of percent reduction in frequency, so I'd expect any downclocking to be pretty minor"
Cerny didn't say 2% is the worst-case scenario, he said that when the worst-case scenario comes, the PS5 won't need to drop clocks drastically because, as an example, dropping clocks by just a couple of percent already saves about 10% power. He also said that the PS5 couldn't run at 2GHz with fixed clocks at their budget, which means there are instructions that will require the clocks to drop to around 2GHz. It also makes sense, because if dropping 2% saves 10% on power, it means that the PS5's APU is on the higher end of the power curve. Because clock-based power curves scale with the square (or even the cube, depending on the source) of frequency, it also means that if the first 2% drops the power by 10%, in order to achieve the next 10% the clocks will need to drop by a lot more than another 2%.

Obviously Cerny also said that the PS5 spends most of its time at the full 2.23GHz. So if you want to take everything Cerny said and put it into context, it should be something in the vein of: the PS5's GPU will run almost all of the time at or close to 2.23GHz, and sometimes, when certain instructions come along, it will drop lower to something closer to 2GHz for a fraction of a second and then continue operating at 2.23GHz. At least that's what I got from the whole variable clocks speech.
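Here's a toy DVFS model that shows the shape of that argument. The exponent is picked only so that a ~2% clock drop gives ~10% power near the cap (Cerny's example), and the voltage floor is invented; real silicon curves will differ, but the qualitative behaviour (cheap power savings near the cap, progressively more expensive ones further down) is the point.

```python
# Toy DVFS model, purely illustrative. The exponent is chosen so that a ~2%
# clock drop near the cap gives ~10% power (Cerny's example); the voltage floor
# is a made-up value. Real curves differ, but the qualitative point holds:
# the first few percent of downclock buy the most power, and deeper power cuts
# cost progressively more frequency once the voltage stops scaling down.

F_MAX = 2.23          # GHz, PS5 GPU cap
GAMMA = 2.1           # assumed voltage-vs-frequency steepness near the cap
V_FLOOR = 0.85        # assumed minimum voltage, as a fraction of voltage at the cap

def rel_voltage(f):
    return max(V_FLOOR, (f / F_MAX) ** GAMMA)

def rel_power(f):
    # dynamic power ~ f * V^2, normalized to 1.0 at the cap
    return (f / F_MAX) * rel_voltage(f) ** 2

def freq_for_power(target):
    """Scan downward for the highest frequency whose modeled power <= target."""
    f = F_MAX
    while f > 0 and rel_power(f) > target:
        f -= 0.0001
    return f

for target in (0.90, 0.80, 0.70, 0.60):
    f = freq_for_power(target)
    print(f"{(1 - target) * 100:>2.0f}% power cut -> "
          f"{f:.2f} GHz ({(1 - f / F_MAX) * 100:.1f}% downclock)")
```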
 

nib95

Contains No Misinformation on Philly Cheesesteaks
Banned
Oct 28, 2017
18,498
Cerny didn't say that a 2% clock drop was a worst case scenario

He didn't specifically state as much, but he did imply the worst case should be around that sort of level or "pretty minor".

I'm guessing he didn't specify the exact percentage drop because they couldn't realistically foresee what sort of extreme outlier power spike any of the thousands of games that will release in future may or may not have.
 

anexanhume

Member
Oct 25, 2017
12,914
Maryland
No argument from me on this point.
He didn't specifically state as much, but he did imply the worst case should be around that sort of level.

I'm guessing he didn't specify the exact percentage drop because they couldn't realistically foresee what sort of extreme outlier power spike any of the thousands of games that will release in future may or may not have.

The implicit relationship that may be missed here is that the processor is downclocking because the game in question is particularly good at utilizing clock cycles. That will continue to be true even at a lower clock speed. Overall, it's more efficient to use fewer clock cycles at a higher utilization because you can do so at a lower voltage and dynamic power.

As long as the frequency curve they've constructed is linear or sub-linear, it will always make sense to get everything you can out of a clock cycle, even if that means you'll be downclocked. An example where a mere 2% downclock can save 10% power suggests that curve is indeed sub-linear, so developers will never be punished for writing more efficient code.

This algorithm is there to help them get the absolute most out of the SoC. That's true of the console and tools philosophy in general.

Can't optimize your code? No problem, you'll enjoy max clocks. Don't have time to scale your models? No problem, the engine will manage LOD. Too expensive to pre-bake lighting? Engine has that too. Can't optimize your data placement and compression scheme? No worries - no need to duplicate data or balance compression with other CPU needs. This is a pervasive philosophy going forward.
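Going back to the power/utilization point above, a tiny sketch of that efficiency argument, using the standard dynamic-power approximation (power ≈ activity × frequency × voltage²). The utilization, clock and voltage numbers are invented; the takeaway is only that roughly the same work at a lower clock and higher utilization comes out ahead on power.

```python
# Two ways to get roughly the same GPU work done per frame, under the standard
# dynamic-power approximation P ~ activity * f * V^2. All numbers are invented;
# the point is only that higher utilization at a lower clock (and voltage) does
# the same work for less power, so efficient code isn't "punished" by downclocks.

def power(utilization, freq_ghz, volts):
    return utilization * freq_ghz * volts ** 2   # arbitrary units

scenarios = {
    "A: 50% utilization @ 2.23 GHz, 1.00 V": (0.50, 2.23, 1.00),  # less efficient code, max clock
    "B: 56% utilization @ 2.00 GHz, 0.93 V": (0.56, 2.00, 0.93),  # more efficient code, downclocked
}

for name, (util, f, v) in scenarios.items():
    work = util * f                      # toy proxy for useful work per second
    print(f"{name}: work={work:.2f}, power={power(util, f, v):.2f}")
```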
 
Oct 29, 2017
2,416
Back to doubting Cerny again after questioning Sweeney, what a ride. Unless we see something more impressive than the UE5 demo on a system without variable frequency, it is no more than FUD in the most boring way.

Sacrificing teraflops for a proprietary SSD was a big miscalculation, IMO. To most people, Teraflops>SSD. I expect FUD all this gen until a possible PS5 Pro, and there's no guarantee a PS5P will be more powerful than a new XBox Series XX.
 

sleepr

Banned for misusing pronouns feature
Banned
Oct 30, 2017
2,965
Sacrificing teraflops for a proprietary SSD was a big miscalculation, IMO. To most people, Teraflops>SSD. I expect FUD all this gen until a possible PS5 Pro, and there's no guarantee a PS5P will be more powerful than a new XBox Series XX.

Xbox has had the most powerful console on the market for 3 years now. Those sony guys never learn!
 
Oct 25, 2017
9,107
Sacrificing teraflops for a proprietary SSD was a big miscalculation, IMO. To most people, Teraflops>SSD. I expect FUD all this gen until a possible PS5 Pro, and there's no guarantee a PS5P will be more powerful than an XBox Series XX.
It would be a miscalculation if we played PR messages instead of video games, yeah.

The talk about teraflops will die down, if not cease entirely, when we see some actual software, finally, running on both of these machines. The visual differences between multiplats will be harder than ever before to distinguish, and Digital Foundry has stated the UE5 demo has rendered pixel counting obsolete. I'd expect things to move in that direction generally.

So what will people FUD about when we're actually playing the games and they're virtually identical?
 

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
Sacrificing teraflops for a proprietary SSD was a big miscalculation, IMO. To most people, Teraflops>SSD. I expect FUD all this gen until a possible PS5 Pro, and there's no guarantee a PS5P will be more powerful than a new XBox Series XX.

And your opinion does not matter, you don't have a devkit. You know they believe they are right too.

 

BeeDog

Member
Oct 26, 2017
4,561
Has anything been said about the power consumption of the next-gen consoles? I'm becoming much more mindful of the appliances at home since my electricity costs have been notable the last couple of months.

My honest hope is that the PS5's slightly weaker HW and focus on throttling actually is good from a power perspective.
 

Dekim

Member
Oct 28, 2017
4,302
Cerny must be one hell of a liar, because developers (the people who will be working with the hardware firsthand, if they aren't already) seem to be quite excited about the machine he and his team designed, especially in regard to the SSD.
 

MrKlaw

Member
Oct 25, 2017
33,083
Take a 36 CU RDNA2 GPU, put it under full load (i.e., a high power consumption workload), graph power consumption vs. frequency, estimate non-GPU power consumption in the PS5 APU, determine the targeted power level of the PS5, determine the likely PS5 frequency for high power consumption workloads.

I continue to see comments reflecting that a 2% clock drop creates a 10% power savings, then stating that this means we shouldn't see clock drops much beyond 2%, but this doesn't track. We don't know how much power throttling high power consumption workloads will require.


Take a 36CU RDNA2 GPU - put it under an artificial benchmark like Prime95/Aida64 to max power usage - then graph it.

Now profile your first-party and potentially third-party PS4 game engines to understand GPU utilisation/saturation. According to Cerny, even efficient game engines are only loading CUs at 40-50%. Extrapolate that to get an equivalent loading on RDNA2, using information on architectural differences etc.

So you have a profile of power usage for an artificially highly saturated GPU (80-90% utilisation with an artificial benchmark) and a simulated/estimated real-world load based on profiling of game engines. That should allow you to fairly accurately predict the expected window of power usage for PS5 games; add a buffer in case engines go from 40-50% to 50-60%. Design your cooling around that.

You'll still have spikes of utilisation like GoW's map screen, or TLoU's kitchen curtains, where you're hitting 80% utilisation but inefficiently. That's where you'll get clocks dropping, because you're exceeding the defined power window.


Cerny can't quote a worst case scenario because he can't predict the future. You need real games to quote hard numbers.

Sony and MS can both predict the future with some margin of error. They'll have a huge amount of data from profiling tools etc. and have a high level of confidence about the likely % utilisation of CPU/GPU for various types of workload, which they can use to build models of how a future console is likely to perform with similar engines (and use as data to explore some hypothetical scenarios too).
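Sketching that sizing exercise with invented numbers (the only figure taken from the discussion is the 40-50% engine saturation mentioned above):

```python
# Rough sketch of the sizing exercise described above. Every number is invented;
# the shape of the calculation is the point, not the wattage figures.

SYNTHETIC_MAX_W = 200.0           # assumed GPU power under a power-virus style benchmark
SYNTHETIC_UTIL = 0.90             # assumed CU saturation of that benchmark
GAME_UTIL_RANGE = (0.40, 0.50)    # typical engine saturation, per the figure quoted above
FUTURE_BUFFER = 0.60              # allow engines to improve toward ~60% saturation

def est_power(util):
    # first-order assumption: power scales roughly with CU saturation
    return SYNTHETIC_MAX_W * util / SYNTHETIC_UTIL

lo, hi = (est_power(u) for u in GAME_UTIL_RANGE)
print(f"Expected real-game GPU power window: {lo:.0f}-{hi:.0f} W")
print(f"Design cooling for ~{est_power(FUTURE_BUFFER):.0f} W; "
      f"spikes above that trigger a downclock")
```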
 
Last edited:

EBomb

Member
Oct 25, 2017
464
And by most people, you mean 0 developers? Plenty of developers have acknowledged the X had more FLOPs while stressing they care way more about the CPU and SSD jumps.

There can be differences in players' and developers' perspectives though. Marginal additional TFs are a known quantity that allow for specific benefits that aren't that interesting from an industry perspective. Some players may value these differences, many won't.

Avoiding the many memory management headaches associated with slow I/O, as well as avoiding loading screen issues (gimmicks, corridors, etc.), may provide a huge workflow benefit for developers that saves time and money. Of course they would value these things highly. I am happy for developers here. But I think these things will likely impact their development experience far more than my gaming experience.
 

DrKeo

Banned
Mar 3, 2019
2,600
Israel
So I popped into the Quixel Megascans library and checked out some assets from the UE5 demo. This limestone from the demo has a 179MB displacement map, a 30MB 8K texture and a 23MB roughness map. A total of 232MB.


This part of the cliff has a 173MB displacement map, 37MB 8K texture and a 26MB roughness map for a total of 236MB


This pile of small rocks and ground has a 175MB displacement map, 28MB texture, and 24MB roughness map for a total of 227MB.


It seems like most of the models I've checked out are in the same ballpark. I wonder how much the demo weighs, or if they have a way to make it more efficient in order to conserve space.
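For a ballpark on that "how much does the demo weigh" question, the three assets above average roughly 232MB each; assuming some purely hypothetical counts of unique Megascans:

```python
# Back-of-envelope on raw asset storage, using the three asset sizes listed above.
# The counts of unique assets in the demo are pure guesses.

asset_mb = [232, 236, 227]               # sizes quoted above (displacement + 8K texture + roughness)
avg_mb = sum(asset_mb) / len(asset_mb)

for unique_assets in (100, 500, 1000):   # hypothetical asset counts
    total_gb = unique_assets * avg_mb / 1024
    print(f"{unique_assets:>4} unique assets at ~{avg_mb:.0f} MB each -> ~{total_gb:.0f} GB raw")
```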
 

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
So I popped into the Quixel Megascans library and checked out some assets from the UE5 demo. This limestone from the demo has a 179MB displacement map, a 30MB 8K texture and a 23MB roughness map. A total of 232MB.


This part of the cliff has a 173MB displacement map, 37MB 8K texture and a 26MB roughness map for a total of 236MB


This pile of small rocks and ground has a 175MB displacement map, 28MB texture, and 24MB roughness map for a total of 227MB.


It seems like most of the models I've checked out are in the same ballpark. I wonder how much the demo weighs, or if they have a way to make it more efficient in order to conserve space.

I think Brian Karis said there are some parallels between Dreams' technology and the UE5 demo. Dreams creations are tiny on storage, so maybe there's something there. This is the most important part of UE5's technology...
 

vivftp

Member
Oct 29, 2017
19,766
Just a random question. When Cerny mentioned they had to cap the GPU frequency at 2.23 GHz to guarantee the on-chip logic operates properly, how locked in are they to that? It sounds like there's some potential to go higher, if not for the on-chip logic. Is this a hardware limitation where they 100% can't go any higher with this chip, or is there the potential for them to eventually come up with a way to go higher before launch via software or some other tweak?

I'm sure the answer is no, but figured I'd ask anyways
 

DrKeo

Banned
Mar 3, 2019
2,600
Israel
I think Brian Karis said there are some parallels between Dreams' technology and the UE5 demo. Dreams creations are tiny on storage, so maybe there's something there.
Dreams also uses a software rasterizer for micro-polygons because the user can create their own models, so the engine has to be able to handle that. That's what I think he meant, but I guess we will have to wait for their talk in order to know more.
 

Pheonix

Banned
Dec 14, 2018
5,990
St Kitts
There's nothing quite so formal about a divide between areas of memory that are being actively used for various purposes and those that aren't. A developer can make use of it however they want, but it's largely a uniform address space that contains information that can be accessed with varying frequency. If you were to chart how often different addresses are accessed you'd probably find more of an exaggerated logarithmic curve - some small portion that's accessed all the time and then a long tail of progressively less frequent use - rather than a bi-modal divide.



Nothing nearly that straightforward. I would discourage you from pursuing this way of thinking about bandwidth because it doesn't bear much resemblance to what actually determines bandwidth requirements. There are relatively few people who are actively engaged in development that understand the implications of their own coding decisions on bandwidth usage much less system-wide bandwidth usage, so unless you plan to really dig in deep, write a bunch of code, and profile how it runs on real-world hardware, it's probably not worth trying to worry about in much more depth than "more is better."

Sorry about what might come across as a dismissive response, but it's not a simple topic.
Thanks nonetheless. And I understand that it's not a simple topic, and surely nowhere near as simple as I was looking at it. But if nothing else, I appreciate that you've set me straight lol. At least now I know that that way of looking at memory bandwidth is wrong.


I wrote a long response and then realized that you are talking about memory bandwidth, not feeding the memory with new data (ie the new console SSDs). You shouldn't look at the data in the memory as two pools but as one huge pool full of "hopefully relevant" data. The amount of data in there is much bigger than needed for the frame in the hope that each and every piece of data the GPU and CPU needs will be somewhere in that pool. Because storage today is terrible, that pool has to be big so it will hold as much of that "just in case" data as possible. So hopefully close to all of the 14GB will be relevant to the current frame now that we have SSDs, not just 5GB.

MS actually added a component to the X1X that monitors RAM usage, so MS probably conducted the biggest RAM usage survey in history, considering there are millions of X1X consoles out there playing 100% of current-gen games (excluding exclusives to other systems) for billions of hours of gameplay. From analyzing this data, MS found that current-gen games don't use 50%-67% of the available memory:

This means that adding an SSD to the PS4 would have been as effective as giving the PS4 10GB-15GB of memory available to developers.

Regarding reading data from memory: CPUs don't access memory much, unlike GPUs, which access it all the time. GPUs stream from memory constantly, whereas a CPU that needs to hit memory constantly is a death sentence for the console's performance. I've never coded for a GPU myself, even my path tracer ran on a CPU (terribly 😊), so I guess I don't know the minutiae well enough, but very generally speaking both GPU and CPU have loads of cache in order to not access memory for every action they do. Even if all 14GB of data were relevant to the frame, it doesn't mean the CPU and GPU need the whole 14GB for the frame. For instance, the BVH structure for the coming frame has to be in memory and it will easily take over 1GB, but does that mean rays will traverse 100% of that structure? Does it mean the GPU will access 100% of the nodes in the structure? Not likely, it depends on the scene and the type of rays, but it is highly unlikely. So it's not like the PS5 will need 60 x 14GB per second (840GB/s) in order to utilize 14GB of memory; it doesn't really work like that. But memory bandwidth is probably going to be both consoles' biggest bottleneck, especially now with RT, denoising, and ML. All three require a lot of memory bandwidth and will be very prevalent this gen.
Thanks for this too. OK, so I get it now. It doesn't work that way.

You see, like you, I have been feeling that memory bandwidth would be these consoles' sore thumb; that's why I wanted to have a better understanding of what exactly the bandwidth is needed for and how it's used. So far, how I understand it is that for every rendered frame, you need to move stuff from RAM into the cache or into the frame, etc. The more bandwidth you have, the faster/more stuff you can move.
 
Last edited:

chris 1515

Member
Oct 27, 2017
7,074
Barcelona Spain
Just a random question. When Cerny mentioned they had to cap the GPU frequency at 2.23 GHz to guarantee the on-chip logic operates properly, how locked in are they to that? It sounds like there's some potential to go higher, if not for the on-chip logic. Is this a hardware limitation where they 100% can't go any higher with this chip, or is there the potential for them to eventually come up with a way to go higher before launch via software or some other tweak?

I'm sure the answer is no, but figured I'd ask anyways

The GPU is capped at 2.23 GHz. It will never go higher.
 

Lady Gaia

Member
Oct 27, 2017
2,481
Seattle
Just a random question. When Cerny mentioned they had to cap the GPU frequency at 2.23 GHz to guarantee the on-chip logic operates properly, how locked in are they to that? It sounds like there's some potential to go higher, if not for the on-chip logic. Is this a hardware limitation where they 100% can't go any higher with this chip, or is there the potential for them to eventually come up with a way to go higher before launch via software or some other tweak?

I can't imagine they'd announce a clock speed and describe that the logic can't be pushed any higher without having pretty thorough validation. As described, the reason for not being able to clock higher is 100% a hardware logic issue and there's zero chance that'll change with software. Sony is locked and loaded when it comes to final APU design, and the window for doing anything else exotic like using faster memory has likely closed as well. If their contracted supplier suddenly said they'd had a breakthrough and could supply faster RAM at the same price, could they take that? Sure. Probably. I wouldn't bet on it.
 
Last edited:

vivftp

Member
Oct 29, 2017
19,766
The GPU is capped at 2.23 GHz. It will never go higher.

I know it will not go higher due to the cap, but I was wondering if there was potential for them to raise the cap if they found a way to get around that particular limitation with the on-chip logic. Or is the on-chip logic such a fundamental limiter that nothing short of a redesign would let them go even higher?
 