
modiz

Member
Oct 8, 2018
17,939
In my admittedly very layman understanding, isn't that one of the core advantages of having dedicated machine learning hardware distinct from the CPU and GPU?

Otherwise, if it takes 2ms of render budget and everything else has to wait for it, then by using the module you are inherently reducing your performance by 12.5% for your upscale. Spending 12.5% of a ~16ms render budget on upscaling seems like a lot at 60fps, and yes, that would double to 25% of the ~8ms budget at 120fps.

Like, in a 60fps game, do the CPU and GPU use 14ms to render the frame, then sit idle for 2ms while they wait for PSSR to do its thing?

It would be far more efficient for the CPU and GPU to get on with rendering the next frame once they have passed PSSR whatever it needs, rather than waiting. PSSR will likely delay each frame reaching your screen by 2ms, but it shouldn't extend the render time between frames by 2ms if it's running in parallel, as dedicated die area would.




But FSR competes with the rendering budget since it uses the GPU, right?

PSSR doesn't use the GPU, it uses a machine learning unit. The question is: why would that take up GPU rendering time if it isn't using the GPU?
We don't know whether the ML block is built into the GPU Compute Units or not.

Thing is, the ML block's 300 TOPS figure does not actually line up with the rest of the PS5 Pro's GPU info (assuming the figure isn't rounded up from 268, which now seems unlikely to me), so it would make sense if the ML block is not actually built into the CUs (but that would mean the APU is quite large).
 

tiscali

Member
Jul 14, 2022
116
We don't know whether the ML block is built into the GPU Compute Units or not.

Thing is, the ML block's 300 TOPS figure does not actually line up with the rest of the PS5 Pro's GPU info (assuming the figure isn't rounded up from 268, which now seems unlikely to me), so it would make sense if the ML block is not actually built into the CUs (but that would mean the APU is quite large).
The statement "You don't need the latest SDK for updating your games but you still get PSSR" sounds like a post-processing approach, not something deeply embedded in the rendering pipeline. So yeah, it could be a hardware accelerator sitting AFTER the classical pipeline.
 

Doctor Avatar

Member
Jan 10, 2019
2,654
We don't know whether the ML block is built into the GPU Compute Units or not.

Thing is, the ML block's 300 TOPS figure does not actually line up with the rest of the PS5 Pro's GPU info (assuming the figure isn't rounded up from 268, which now seems unlikely to me), so it would make sense if the ML block is not actually built into the CUs (but that would mean the APU is quite large).

I highly suspect the machine learning unit is going to be essentially the Tempest engine 2.0.

Sony already have an area of custom CUs in PS5 for audio processing, and The Verge has also noted that the audio processing in PS5 Pro is better.

So they probably expanded the Tempest engine from the 2 CUs in PS5 and modified them further to be better machine learning units.

Total speculation of course, but it would line up with everything we know: better audio, really strong machine learning performance that doesn't make sense with our understanding of GPU CUs, and PSSR that can be added to games without even updating the SDK (render pipeline seemingly unaffected).

EDIT: didn't Cerny insinuate when announcing the PS5 hardware that the Tempest engine CUs could be used for more than just audio, or am I misremembering? Maybe hinting at PSSR?
 
Last edited:

modiz

Member
Oct 8, 2018
17,939
The statement "You don't need the latest SDK for updating your games but you still get PSSR" sounds like a post-processing approach, not something deeply embedded in the rendering pipeline. So yeah, it could be a hardware accelerator sitting AFTER the classical pipeline.
Is that how it works with DLSS as well? I remember at some point NVIDIA added a similar parallel computation approach.
 

BreakAtmo

Member
Nov 12, 2017
12,967
Australia
In my admittedly very layman understanding, isn't that one of the core advantages of having dedicated machine learning hardware distinct from the CPU and GPU?

Otherwise, if it takes 2ms of render budget and everything else has to wait for it, then by using the module you are inherently reducing your performance by 12.5% for your upscale. Spending 12.5% of a ~16ms render budget on upscaling seems like a lot at 60fps, and yes, that would double to 25% of the ~8ms budget at 120fps.

Like, in a 60fps game, do the CPU and GPU use 14ms to render the frame, then sit idle for 2ms while they wait for PSSR to do its thing?

It would be far more efficient for the CPU and GPU to get on with rendering the next frame once they have passed PSSR whatever it needs, rather than waiting. PSSR will likely delay each frame reaching your screen by 2ms, but it shouldn't extend the render time between frames by 2ms if it's running in parallel, as dedicated die area would.




But FSR competes with the rendering budget since it uses the GPU, right?

PSSR doesn't use the GPU, it uses a machine learning unit. The question is: why would that take up GPU rendering time if it isn't using the GPU?

Again, I might be wrong here, but I was under the impression that the entire point of having a dedicated machine learning unit, like tensor cores or what Sony are doing with PSSR, is that it allows the dedicated hardware to handle the reconstruction without using up GPU resources.

So, interestingly, the concept of doing DLSS upscaling of one frame concurrently with the rendering of the next frame was confirmed as part of the Ampere feature set by NVIDIA back in 2020, but it's never been used on PC, probably because the cost is frequently too minor for devs to bother with, and weaker cards that take longer to upscale will often target lower resolutions anyway. In a console, though, especially one always targeting 4K, it makes way more sense. I'm hoping we see it in Switch 2 as well. We don't know if this concurrency is also a PSSR feature, but I would like to hope that it's possible. An extra 2ms of input lag isn't the worst thing in the world - especially if you get a better framerate that lowers the lag in turn.
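
To illustrate the difference the concurrency makes, here is a back-of-envelope Python sketch. The 14.7ms/2ms figures are assumptions for illustration, not confirmed PS5 Pro numbers: when upscaling overlaps the next frame's rendering, the 2ms shows up as latency rather than frame time.

RENDER_MS = 14.7   # assumed CPU/GPU time per frame (hypothetical)
UPSCALE_MS = 2.0   # assumed fixed cost of the ML upscale pass (hypothetical)

# Serial: every frame pays render + upscale before the next frame starts.
serial_frame_time = RENDER_MS + UPSCALE_MS            # 16.7ms -> ~60fps

# Concurrent: frame N+1 renders while frame N upscales, so the cadence is
# set by the slower stage; upscaling adds latency, not frame time.
concurrent_frame_time = max(RENDER_MS, UPSCALE_MS)    # 14.7ms -> ~68fps
concurrent_latency = RENDER_MS + UPSCALE_MS           # still 16.7ms to screen

print(f"serial:     {1000 / serial_frame_time:.0f} fps")
print(f"concurrent: {1000 / concurrent_frame_time:.0f} fps, "
      f"{concurrent_latency:.1f} ms render-to-display")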
 

BreakAtmo

Member
Nov 12, 2017
12,967
Australia
2ms, even for people obsessed with lowering input lag, is nothing for the benefit of much better fps.

Bingo - I remember reading that God of War 2018 running at 60fps on PS5 has something like 120ms of input lag. This whole subject came up in some discussions about the Switch 2, where the cost of DLSS to 2160p could be like 10-15ms - awful as part of the frame render but much more viable as input lag. The PS5 Pro wouldn't need it as much, but it would absolutely still benefit, especially when trying to hit 120fps.
 

tiscali

Member
Jul 14, 2022
116
Is that how it works with DLSS as well? I remember at some point NVIDIA added a similar parallel computation approach.
Yeah sorry, I mixed those upscalers up with the frame generation stuff. For the (spatial) upscalers they use a trained autoencoder (16K/8K/4K -> native resolution -> target resolution), put the kernel in the graphics drivers, and just apply the multilayer filtering (CNN) to reconstruct the image.
For the frame-generation stuff they need motion vectors, etc. to interpolate the in-between frames correctly, so you need information extracted from within the rendering pipeline.
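
For anyone unfamiliar with what motion vectors from the pipeline buy you, here is a toy NumPy sketch of the core of a temporal upscaler: warp last frame's output by per-pixel motion vectors, then blend in the new low-res frame. All names and shapes are illustrative; real DLSS/PSSR replace the fixed blend weight with a learned network.

import numpy as np

def temporal_upscale(lowres, history, motion, alpha=0.1):
    h, w = history.shape
    ys, xs = np.indices((h, w))
    # Reproject: sample where each output pixel was on the previous frame.
    src_y = np.clip(ys - motion[..., 1], 0, h - 1).astype(int)
    src_x = np.clip(xs - motion[..., 0], 0, w - 1).astype(int)
    warped = history[src_y, src_x]
    # Naive 2x upsample of the current low-res frame as a stand-in.
    current = lowres.repeat(2, axis=0).repeat(2, axis=1)
    return (1 - alpha) * warped + alpha * current

history = np.zeros((8, 8))      # last frame's full-res output
lowres = np.ones((4, 4))        # current frame, rendered at half res
motion = np.zeros((8, 8, 2))    # per-pixel motion vectors from the engine
frame = temporal_upscale(lowres, history, motion)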
 

2Blackcats

Member
Oct 26, 2017
16,164
I highly suspect the machine learning unit is going to be essentially the Tempest engine 2.0.

Sony already have an area of custom CUs in PS5 for audio processing, and The Verge has also noted that the audio processing in PS5 Pro is better.

So they probably expanded the Tempest engine from the 2 CUs in PS5 and modified them further to be better machine learning units.

Total speculation of course, but it would line up with everything we know: better audio, really strong machine learning performance that doesn't make sense with our understanding of GPU CUs, and PSSR that can be added to games without even updating the SDK (render pipeline seemingly unaffected).

EDIT: didn't Cerny insinuate when announcing the PS5 hardware that the Tempest engine CUs could be used for more than just audio, or am I misremembering? Maybe hinting at PSSR?

www.eurogamer.net

PlayStation 5 uncovered: the Mark Cerny tech deep dive

On March 18th, Sony finally broke cover with in-depth information on the technical make-up of PlayStation 5. Expanding …

"The Tempest engine itself is, as Cerny explained in his presentation, a revamped AMD compute unit, which runs at the GPU's frequency and delivers 64 flops per cycle. Peak performance from the engine is therefore in the region of 100 gigaflops, in the ballpark of the entire eight-core Jaguar CPU cluster used in PlayStation 4. While based on GPU architecture, utilisation is very, very different.

"GPUs process hundreds or even thousands of wavefronts; the Tempest engine supports two," explains Mark Cerny. "One wavefront is for the 3D audio and other system functionality, and one is for the game. Bandwidth-wise, the Tempest engine can use over 20GB/s, but we have to be a little careful because we don't want the audio to take a notch out of the graphics processing. If the audio processing uses too much bandwidth, that can have a deleterious effect if the graphics processing happens to want to saturate the system bandwidth at the same time."

Essentially, the GPU is based on the principle of parallelism - the idea of running many tasks (or waves) simultaneously. The Tempest engine is much more serial-like in nature, meaning that there's no need for attached memory caches. "When using the Tempest engine, we DMA in the data, we process it, and we DMA it back out again; this is exactly what happens on the SPUs on PlayStation 3," Cerny adds. "It's a very different model from what the GPU does; the GPU has caches, which are wonderful in some ways but also can result in stalling when it is waiting for the cache line to get filled. GPUs also have stalls for other reasons, there are many stages in a GPU pipeline and each stage needs to supply the next. As a result, with the GPU if you're getting 40 per cent VALU utilisation, you're doing pretty damn well. By contrast, with the Tempest engine and its asynchronous DMA model, the target is to achieve 100 percent VALU utilisation in key pieces of code." "

Sorry for the long quote, I thought it all might be relevant.
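
For those who haven't dealt with the SPU-style model Cerny describes there, a loose sketch of the "DMA in, process, DMA out" loop in Python (all names hypothetical; real implementations also double-buffer so the next transfer overlaps the current kernel):

output = []
def dma_read(chunk):   return list(chunk)                  # stand-in for a DMA transfer in
def dma_write(result): output.extend(result)               # stand-in for the DMA back out
def kernel(scratch):   return [x * 0.5 for x in scratch]   # e.g. mix/filter audio samples

main_memory = [[1.0, 2.0], [3.0, 4.0]]   # source data, in chunks
for chunk in main_memory:
    scratch = dma_read(chunk)    # stage into a local scratchpad, no caches involved
    dma_write(kernel(scratch))   # crunch at full VALU utilisation, write back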
 

Doctor Avatar

Member
Jan 10, 2019
2,654
www.eurogamer.net

PlayStation 5 uncovered: the Mark Cerny tech deep dive

On March 18th, Sony finally broke cover with in-depth information on the technical make-up of PlayStation 5. Expanding …

"The Tempest engine itself is, as Cerny explained in his presentation, a revamped AMD compute unit, which runs at the GPU's frequency and delivers 64 flops per cycle. Peak performance from the engine is therefore in the region of 100 gigaflops, in the ballpark of the entire eight-core Jaguar CPU cluster used in PlayStation 4. While based on GPU architecture, utilisation is very, very different.

"GPUs process hundreds or even thousands of wavefronts; the Tempest engine supports two," explains Mark Cerny. "One wavefront is for the 3D audio and other system functionality, and one is for the game. Bandwidth-wise, the Tempest engine can use over 20GB/s, but we have to be a little careful because we don't want the audio to take a notch out of the graphics processing. If the audio processing uses too much bandwidth, that can have a deleterious effect if the graphics processing happens to want to saturate the system bandwidth at the same time."

Essentially, the GPU is based on the principle of parallelism - the idea of running many tasks (or waves) simultaneously. The Tempest engine is much more serial-like in nature, meaning that there's no need for attached memory caches. "When using the Tempest engine, we DMA in the data, we process it, and we DMA it back out again; this is exactly what happens on the SPUs on PlayStation 3," Cerny adds. "It's a very different model from what the GPU does; the GPU has caches, which are wonderful in some ways but also can result in stalling when it is waiting for the cache line to get filled. GPUs also have stalls for other reasons, there are many stages in a GPU pipeline and each stage needs to supply the next. As a result, with the GPU if you're getting 40 per cent VALU utilisation, you're doing pretty damn well. By contrast, with the Tempest engine and its asynchronous DMA model, the target is to achieve 100 percent VALU utilisation in key pieces of code." "

Sorry for the long quote, I thought it all might be relevant.

I also thought it was a bit strange that Sony would spend so much engineering time on audio. Developing their own bespoke variant of CUs seems like a lot of work.

Now, if this was the initial work for PSSR, that makes a whole lot more sense. You offload your audio from the CPU (which handled it in PS4) and you offload your reconstruction from the GPU (which handles it in PS4 and PS5).

I might be well off base, but I think it would be most logical for PSSR to be handled by an expanded Tempest engine. Maybe Sony started developing it because they saw from the AMD roadmap that they wouldn't have AI reconstruction, so they worked on their own solution.
 

15SagittaeB

Member
Feb 12, 2022
922
not sure why people only talk about 30 or 60fps when there is a perfectly viable happy middle ground at 40fps

if base PS5 only delivers 30fps, there is a good chance the PS5 Pro can do 40fps
 

shadowman16

Member
Oct 25, 2017
32,304
not sure why people only talk about 30 or 60fps when there is a perfectly viable happy middle ground at 40fps

if base PS5 only delivers 30fps, there is a good chance the PS5 Pro can do 40fps
Basically we just need to get to a point where TVs that support 40fps (i.e. 120Hz sets) have been widely adopted. Honestly not sure where we are at with that, but I personally still have my old plasma TV, so no shiny new refresh rates/40fps for me!

It would be great for games to focus on 40fps as an additional target though. Obviously I don't wanna see 60fps dropped, but it's a great middle ground between graphics and smoother performance.
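
The reason 40fps is tied to TV support is simple divisibility, which a few lines of Python make explicit (a worked example, not from the thread):

# A frame rate paces evenly only when the display refresh divides by it cleanly.
for hz in (60, 120):
    for fps in (30, 40, 60):
        refreshes = hz / fps
        even = refreshes == int(refreshes)
        print(f"{fps}fps on {hz}Hz: {refreshes:.2f} refreshes/frame -> "
              f"{'even pacing' if even else 'judder'}")
# 40fps is judder on a 60Hz set (1.5 refreshes per frame) but a clean 3 on
# 120Hz, while its 25ms frame time sits neatly between 30fps (33.3ms) and
# 60fps (16.7ms).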
 

modiz

Member
Oct 8, 2018
17,939
Yeah sorry, I mixed those upscalers up with the frame generation stuff. For the (spatial) upscalers they use a trained autoencoder (16K/8K/4K -> native resolution -> target resolution), put the kernel in the graphics drivers, and just apply the multilayer filtering (CNN) to reconstruct the image.
For the frame-generation stuff they need motion vectors, etc. to interpolate the in-between frames correctly, so you need information extracted from within the rendering pipeline.
NVIDIA uses motion vectors for DLSS (2+); it's a temporal upscaler, and PSSR will be too.
www.eurogamer.net

PlayStation 5 uncovered: the Mark Cerny tech deep dive

On March 18th, Sony finally broke cover with in-depth information on the technical make-up of PlayStation 5. Expanding …

"The Tempest engine itself is, as Cerny explained in his presentation, a revamped AMD compute unit, which runs at the GPU's frequency and delivers 64 flops per cycle. Peak performance from the engine is therefore in the region of 100 gigaflops, in the ballpark of the entire eight-core Jaguar CPU cluster used in PlayStation 4. While based on GPU architecture, utilisation is very, very different.

"GPUs process hundreds or even thousands of wavefronts; the Tempest engine supports two," explains Mark Cerny. "One wavefront is for the 3D audio and other system functionality, and one is for the game. Bandwidth-wise, the Tempest engine can use over 20GB/s, but we have to be a little careful because we don't want the audio to take a notch out of the graphics processing. If the audio processing uses too much bandwidth, that can have a deleterious effect if the graphics processing happens to want to saturate the system bandwidth at the same time."

Essentially, the GPU is based on the principle of parallelism - the idea of running many tasks (or waves) simultaneously. The Tempest engine is much more serial-like in nature, meaning that there's no need for attached memory caches. "When using the Tempest engine, we DMA in the data, we process it, and we DMA it back out again; this is exactly what happens on the SPUs on PlayStation 3," Cerny adds. "It's a very different model from what the GPU does; the GPU has caches, which are wonderful in some ways but also can result in stalling when it is waiting for the cache line to get filled. GPUs also have stalls for other reasons, there are many stages in a GPU pipeline and each stage needs to supply the next. As a result, with the GPU if you're getting 40 per cent VALU utilisation, you're doing pretty damn well. By contrast, with the Tempest engine and its asynchronous DMA model, the target is to achieve 100 percent VALU utilisation in key pieces of code." "

Sorry for the long quote, I thought it all might be relevant.
If it's serial in nature then it wouldn't work well as ML hardware. Deep learning models are dominated by matrix multiplications, and those run well with high parallelization.
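
A quick way to see why matmul parallelizes so well: every row (indeed every element) of the output is an independent dot product, so the work splits across lanes with no coordination. A toy row-parallel split in Python:

import numpy as np
from concurrent.futures import ThreadPoolExecutor

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)

# Each row of C = A @ B depends only on one row of A and all of B,
# so rows can be computed by separate workers independently.
with ThreadPoolExecutor(max_workers=4) as pool:
    rows = list(pool.map(lambda i: A[i] @ B, range(A.shape[0])))
C = np.vstack(rows)

assert np.allclose(C, A @ B)   # same result as the serial product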
 

BreakAtmo

Member
Nov 12, 2017
12,967
Australia
not sure why people only talk about 30 or 60fps when there is a perfectly viable happy middle ground at 40fps

if base PS5 only delivers 30fps, there is a good chance the PS5 Pro can do 40fps

Because 40fps 120Hz modes have been around for years now and the vast majority of developers still seemingly refuse to use them, even when they would be tremendously valuable (see various recent Final Fantasy and UE5 titles). 30 and 60fps modes remain the common standard, including in every game with a 40fps mode, so obviously we're going to focus on them - even if I wish 40fps were used more. One reason I wanted a bigger CPU upgrade was so that a 40fps mode would at least be possible in almost any game on the Pro, but oh well.
 

Zep

Member
Jul 12, 2021
1,486
Because 40fps 120Hz modes have been around for years now and the vast majority of developers still seemingly refuse to use them, even when they would be tremendously valuable (see various recent Final Fantasy and UE5 titles). 30 and 60fps modes remain the common standard, including in every game with a 40fps mode, so obviously we're going to focus on them - even if I wish 40fps were used more. One reason I wanted a bigger CPU upgrade was so that a 40fps mode would at least be possible in almost any game on the Pro, but oh well.
A 40fps mode is possible in the vast majority of games now without a CPU upgrade… Nothing would ever be guaranteed with devs, even with the CPU upgrade. But Sony devs have done a great job with it this gen so far.
 

BreakAtmo

Member
Nov 12, 2017
12,967
Australia
because nobody has a TV capable of displaying 40fps, everyone has a TV that can do 30 or 60

This is a massive exaggeration in the other direction. 120Hz TVs have been around for years; millions of people own them. Even just LG OLED TVs like the CX I got 3 years ago have sold in the multiple millions, and they target exactly the sorts of enthusiasts who buy consoles early.

A 40fps mode is possible in the vast majority of games now without a CPU upgrade… Nothing would ever be guaranteed with devs, even with the CPU upgrade. But Sony devs have done a great job with it this gen so far.

In the vast majority of games, yes, but I would've liked to see a CPU boost of about 33%, so that even in the worst-case scenario, a hypothetical PS5 game that was CPU-bound at 30fps could do 40 on the Pro. I guess we'll just have to hope that devs try to do more 40fps modes on the base consoles.
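
Where the 33% figure comes from, as a back-of-envelope check (assuming the frame is fully CPU-bound):

cpu_ms_at_30 = 1000 / 30    # a CPU-bound 30fps game uses 33.3ms of CPU time per frame
budget_at_40 = 1000 / 40    # at 40fps, only 25.0ms is available per frame
print(f"required CPU speedup: {cpu_ms_at_30 / budget_at_40:.2f}x")   # ~1.33x, i.e. +33%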
 

Tony72495

Member
Apr 26, 2019
348
We don't know whether the ML block is built into the GPU Compute Units or not.

Thing is, the ML block's 300 TOPS figure does not actually line up with the rest of the PS5 Pro's GPU info (assuming the figure isn't rounded up from 268, which now seems unlikely to me), so it would make sense if the ML block is not actually built into the CUs (but that would mean the APU is quite large).

Well, the weird thing about RDNA 3 is that it features tensor-like cores; I think the full 7900 XTX has 192 of them. But they don't do anything currently. Maybe a future version of FSR could make use of them, but I think they're just sitting there inactive for now.
 

dgrdsv

Member
Oct 25, 2017
12,031
FFVII: Rebirth's resolution was 1152p in performance mode. Sure, poorly optimized games will struggle. But we'll see massive improvements in performance modes, which will look light years ahead of what we're seeing now.
If by "poorly optimized" games you mean games which are aiming at 30fps by design, then yeah, these will "struggle" to go above that, and their number will be rising.

Obviously it's not a PS6, but there seems to be a lot of downplaying of the reality of an increase in 60fps games running at much higher resolutions, even if they're not technically "true 4K."
There is no downplaying of anything.
This generation has gotten a lot of 60fps games so far because it got a) a very long cross-gen period with games being designed for PS4/XBO (or even Switch) baselines, and b) better resolution reconstruction approaches which allow somewhat okayish reconstruction from as low as 1/4 of display resolution.
Both of these are out of the picture once we're past cross-gen, as both depend on the baseline h/w target. Expecting everyone to just continue targeting PS4/XBO graphics for some reason is what's "downplaying reality" here.

This is the part people are doubting, because a lot of this also has to do with consumer demand. If 60fps were really not important, we would have seen it going away by now.
We are seeing it because of h/w targets, and when we get a game which didn't target XBO, we're not getting anything close to a stable, usable 60 either.
From my casual viewing of DF analyses over the last year or so, it seems like the vast majority of current-gen releases aren't hitting anything close to a locked, stable 60 anymore.

The same reason 120fps (or at least 120Hz) modes are being added to MP games: because there is demand for it beyond just niche enthusiasts.
Yep, same reason: these games are/were made for previous-gen h/w.
 

PLASTICA-MAN

Member
Oct 26, 2017
23,948
Having PSSR be better than FSR 2 is not a feat to brag about at all. Everything is better than FSR 2, so if they claim victory for being better than the worst thing ever created, they really should push things much, much further before release and not rely on future iterations of PSSR to sort things out.
 
Last edited:

RoboPlato

Member
Oct 25, 2017
6,826
I hope PSSR can be around the level of the 1.3 version of XeSS on the XMX path in terms of quality: a big reduction in ghosting, nice temporal stability, and significantly fewer artifacts on water, still with a nice, clean overall IQ.

It hits all of the things that have been standing out to me in many recent games.
 

PLASTICA-MAN

Member
Oct 26, 2017
23,948
Can you summarize? Not many people on this forum speak French.

Not much more than what I said. He is just discussing The Verge article about the new boost mode, but he said he spoke to many devs from many studios who showed him things (mostly games), and he insinuated that you will be surprised. I am sure he can't say more because he may fear the MLID case (which he also discussed), where the video was nuked via copyright claim, so I understand why he can't say more.

Edit: my personal speculation is that he at least saw stuff from Ubisoft, because he is buddies with some of them, and you know Ubi is mostly a French studio. Even though it has many branches all over the world, it is easier to get stuff from a French studio while you are in France and have connections. But he said many companies, so he saw more.
 

modiz

Member
Oct 8, 2018
17,939
Having PSSR be better than FSR 2 is not a feat to brag about at all. Everything is better than FSR 2, so if they claim victory for being better than the worst thing ever created, they really should push things much, much further before release and not rely on future iterations of PSSR to sort things out.
The reason they compare it to FSR 2 is that it's the main third-party alternative on PS. It's pointless to compare to DLSS because PS5 does not support it.

It is basically telling developers it is a better option than what they previously had on the console.
 
Last edited:

PLASTICA-MAN

Member
Oct 26, 2017
23,948
The reason they compare it to FSR 2 is that it's the main third-party alternative on PS. It's pointless to compare to DLSS because PS5 does not support it.

It is basically telling developers it is a better option than what they previously had on the console.

Yes, I can understand that, but how much better it is remains to be seen.
 
Last edited:

Napalm_Frank

Avenger
Oct 27, 2017
5,757
Finland
The level PSSR needs to reach is "good enough". The main issue is games like Rebirth, where you can immediately tell the big hit to IQ without pixel counting. If you can reach the level of, say, GoW Ragnarok's performance mode, where most of the time you cannot even tell the difference between the perf and quality mode IQ, you've got yourself a win.
 

PLASTICA-MAN

Member
Oct 26, 2017
23,948
I don't know if this is directly related to the PS5 Pro but it started that way anyway:


View: https://twitter.com/dark1x/status/1780341879919288359


Kepler falsely assumed that coding to the metal doesn't happen anymore and that such optimisations are a thing of the past, but he was pulled up by our John, who has witnessed insane stuff. I wonder if the coding to the metal he saw was related to the PS5 Pro or to a certain game in general.
 

Zerpette

Member
Jun 23, 2023
726
This is a massive exaggeration in the other direction. 120hz TVs have been around for years, millions of people own then. Even just LG OLED TVs like the CX I got 3 years ago have sold in the multiple millions, and they target exactly the sorts of enthusiasts who buy consoles early.
Also, I think an argument can be made that an existing PS5 owner who doesn't have a 4K HDR 120Hz set would get far more benefit from upgrading their TV than from upgrading to a PS5 Pro.
 

gt123

Member
Apr 2, 2021
1,533
I don't know if this is directly related to the PS5 Pro but it started that way anyway:


View: https://twitter.com/dark1x/status/1780341879919288359


Kepler falsely assumed that coding to the metal doesn't happen anymore and that such optimisations are a thing of the past, but he was pulled up by our John, who has witnessed insane stuff. I wonder if the coding to the metal he saw was related to the PS5 Pro or to a certain game in general.

Sounds like a game, and it does seem console-related at least.


View: https://twitter.com/dark1x/status/1780356333142786266
 

Doctor Avatar

Member
Jan 10, 2019
2,654
Yes I can understand that but how much it is better it remains to be seen.

Indeed - the stills look good, but stills are the most flattering case when the main problem with AI reconstruction is fast movement.

The good thing is that they can update it over time, as we have seen with DLSS and XeSS. How it is at launch will be the floor for quality.
 

bitcloudrzr

Member
May 31, 2018
14,304
We are seeing it because of h/w targets, and when we get a game which didn't target XBO, we're not getting anything close to a stable, usable 60 either.
From my casual viewing of DF analyses over the last year or so, it seems like the vast majority of current-gen releases aren't hitting anything close to a locked, stable 60 anymore.


Yep, same reason: these games are/were made for previous gen h/w.
There are plenty of current-gen-only games that are a generally solid 60fps. Frame drops here and there still happened in cross-gen games, but not major issues like, say, Immortals.
 

PLASTICA-MAN

Member
Oct 26, 2017
23,948
Indeed - the stills look good, but stills are the most flattering case when the main problem with AI reconstruction is fast movement.

The good thing is that they can update it over time, as we have seen with DLSS and XeSS. How it is at launch will be the floor for quality.

What worries me is the launch iteration; even the document mentions that a lot of improvements are coming in later versions, like support for upscaled 8K and more optimisations. If the launch version is just slightly improved over crappy FSR 2 and far behind, let's say, DLSS 2.0, we are screwed, because even FSR 3.1 is supposed to improve things and is around the corner, and FSR 4.0 will mostly arrive before the PSSR version gets updated, and you know how many games are hungry for such upscaling tech to offload the burdens of RT and performance onto. So if the launch version is meh, we can't expect every supporting game to be updated to each new version, so we will be stuck with something like an FSR 2.3 (I know I am exaggerating, but still, if this happens we won't be that far away).
 

Mango Pilot

Alt account
Banned
Apr 8, 2024
480
Not much more than what I said. He is just discussing The Verge article about the new boost mode, but he said he spoke to many devs from many studios who showed him things (mostly games), and he insinuated that you will be surprised. I am sure he can't say more because he may fear the MLID case (which he also discussed), where the video was nuked via copyright claim, so I understand why he can't say more.

Edit: my personal speculation is that he at least saw stuff from Ubisoft, because he is buddies with some of them, and you know Ubi is mostly a French studio. Even though it has many branches all over the world, it is easier to get stuff from a French studio while you are in France and have connections. But he said many companies, so he saw more.
Surprised meaning that the improvements are better than the specs indicate?
 

Mango Pilot

Alt account
Banned
Apr 8, 2024
480
What worries me is the launch iteration; even the document mentions that a lot of improvements are coming in later versions, like support for upscaled 8K and more optimisations. If the launch version is just slightly improved over crappy FSR 2 and far behind, let's say, DLSS 2.0, we are screwed, because even FSR 3.1 is supposed to improve things and is around the corner, and FSR 4.0 will mostly arrive before the PSSR version gets updated, and you know how many games are hungry for such upscaling tech to offload the burdens of RT and performance onto. So if the launch version is meh, we can't expect every supporting game to be updated to each new version, so we will be stuck with something like an FSR 2.3 (I know I am exaggerating, but still, if this happens we won't be that far away).
PSSR is machine learning-based, which should put it above FSR 2, which isn't.

I think that's AMD's biggest issue. Though, I'm no expert.
 

bitcloudrzr

Member
May 31, 2018
14,304
Surprised meaning that the improvements are better than specs indicate?

Yeah, normally this is what we should expect.
We actually do not have the whole picture of the GPU and features even with the spec leaks.

PSSR is machine learning-based, which should put it above FSR 2, which isn't.

I think that's AMD's biggest issue. Though, I'm no expert.
The only reason this much effort would go into PSSR would be if they had a solution that can perform and look better than FSR by a good margin. This is costing them millions in hardware and software development, so it had better be.
 

Mango Pilot

Alt account
Banned
Apr 8, 2024
480
We actually do not have the whole picture of the GPU and features even with the spec leaks.


The only reason this much effort would go into PSSR would be if they had a solution that can perform and look better than FSR by a good margin. This is costing them millions in hardware and software development, so it had better be.
Yeah, if FSR was good enough they wouldn't have gone out of their way to do something unique.
 

bitcloudrzr

Member
May 31, 2018
14,304
Yeah, if FSR was good enough they wouldn't have gone out of their way to do something unique.
And as a result of how closely they work with AMD on the APU, all of this work on PSSR and RT hardware will feed back into AMD's own hardware.

I am pretty sure FSR 4.0 will be a decent upgrade. AMD will surely not stand still unless they enjoy the backlash.
FSR 4 might include dedicated hardware support similar to PSSR's.
 

Andromeda

Banned
Oct 27, 2017
4,864
I don't know if this is directly related to the PS5 Pro but it started that way anyway:


View: https://twitter.com/dark1x/status/1780341879919288359


Kepler falsely assumed that coding to the metal doesn't happen anymore and that such optimisations are a thing of the past, but he was pulled up by our John, who has witnessed insane stuff. I wonder if the coding to the metal he saw was related to the PS5 Pro or to a certain game in general.

Well, we know for a fact that Naughty Dog were doing insane stuff (literally coding to the metal) with PS3 hardware in TLOU.

www.eurogamer.net

PlayStation 5 uncovered: the Mark Cerny tech deep dive

On March 18th, Sony finally broke cover with in-depth information on the technical make-up of PlayStation 5. Expanding …

"The Tempest engine itself is, as Cerny explained in his presentation, a revamped AMD compute unit, which runs at the GPU's frequency and delivers 64 flops per cycle. Peak performance from the engine is therefore in the region of 100 gigaflops, in the ballpark of the entire eight-core Jaguar CPU cluster used in PlayStation 4. While based on GPU architecture, utilisation is very, very different.

"GPUs process hundreds or even thousands of wavefronts; the Tempest engine supports two," explains Mark Cerny. "One wavefront is for the 3D audio and other system functionality, and one is for the game. Bandwidth-wise, the Tempest engine can use over 20GB/s, but we have to be a little careful because we don't want the audio to take a notch out of the graphics processing. If the audio processing uses too much bandwidth, that can have a deleterious effect if the graphics processing happens to want to saturate the system bandwidth at the same time."

Essentially, the GPU is based on the principle of parallelism - the idea of running many tasks (or waves) simultaneously. The Tempest engine is much more serial-like in nature, meaning that there's no need for attached memory caches. "When using the Tempest engine, we DMA in the data, we process it, and we DMA it back out again; this is exactly what happens on the SPUs on PlayStation 3," Cerny adds. "It's a very different model from what the GPU does; the GPU has caches, which are wonderful in some ways but also can result in stalling when it is waiting for the cache line to get filled. GPUs also have stalls for other reasons, there are many stages in a GPU pipeline and each stage needs to supply the next. As a result, with the GPU if you're getting 40 per cent VALU utilisation, you're doing pretty damn well. By contrast, with the Tempest engine and its asynchronous DMA model, the target is to achieve 100 percent VALU utilisation in key pieces of code." "

Sorry for the long quote, I thought it all might be relevant.
Yes, theoretically developers can already use one whole wavefront (half of the Tempest CU?) for any tasks, even graphical ones. Like, once I thought it could be used to offload some RT-related tasks. Ideally tasks that would work very well with the DMA model (no cache, direct access to main memory).

But I don't believe they'll use a Tempest engine 2 for INT8 processing (the CUs would need too many modifications, essentially becoming tensor cores). I think it would be simpler for them to use the new RDNA 4 WMMA units. This is the kind of thing Sony would absolutely do: use features not yet released in the PC ecosystem. They did it with PS4 Pro. And they should also do it, allegedly, with the new RDNA 4 RT hardware.


View: https://twitter.com/Kepler_L2/status/1768590437206339657
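
For context on what those WMMA units compute, here is a rough NumPy sketch of the tile-sized multiply-accumulate that RDNA 3's WMMA instructions (e.g. V_WMMA_F32_16X16X16_F16) perform per wavefront. The framing is illustrative, not AMD's API, and whether the Pro uses this exact shape is unconfirmed:

import numpy as np

# One WMMA-shaped step: D = A x B + C on 16x16 tiles,
# FP16 inputs with FP32 accumulation.
A = np.random.rand(16, 16).astype(np.float16)
B = np.random.rand(16, 16).astype(np.float16)
C = np.zeros((16, 16), dtype=np.float32)
D = A.astype(np.float32) @ B.astype(np.float32) + C

# A full network layer is just many such tiles; dedicated units make each
# tile step cheap, which is where big TOPS figures come from.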
 

Mango Pilot

Alt account
Banned
Apr 8, 2024
480
I am pretty sure FSR 4.0 will be a decent upgrade. AMD will surely not stand still unless they enjoy the backlash.
That would be funny if devs have to choose between FSR 4 and PSSR in the future. XD
When is FSR 4 coming? Two years out? We haven't even seen the release of the new FSR 3 that they just previewed with those Ratchet screens.
 

Spoit

Member
Oct 28, 2017
4,064
There is no downplaying of anything.
This generation has gotten a lot of 60fps games so far because it got a) a very long cross-gen period with games being designed for PS4/XBO (or even Switch) baselines, and b) better resolution reconstruction approaches which allow somewhat okayish reconstruction from as low as 1/4 of display resolution.
Both of these are out of the picture once we're past cross-gen, as both depend on the baseline h/w target. Expecting everyone to just continue targeting PS4/XBO graphics for some reason is what's "downplaying reality" here.
And the compromises they keep making to hit 60fps are getting really rough. Like the 432p on Aveum, or bizarrely using FSR 1 or freaking nearest-neighbor upscaling from 720p with FF.
 

modiz

Member
Oct 8, 2018
17,939
Of course these kinds of solutions will get refined more as time goes on.
The secret with deep learning is that no one actually has a clue why their solution "worked" (kinda), because those models are really complicated; improving on the faults that are found is a constant exercise in making up heuristics for the problem, plus trial and error until it works well. They won't release a solution that suddenly blows everything else out of the water unless they have something really novel going on here (it could be the spectral transformation I wondered about, but I wouldn't bet on that alone being some silver bullet), as well as being very lucky.

Well, we know for a fact that Naughty Dog were doing insane stuff (literally coding to the metal) with PS3 hardware in TLOU.


Yes, theoretically developers can already use one whole wavefront (half of the Tempest CU?) for any tasks, even graphical ones. Like, once I thought it could be used to offload some RT-related tasks. Ideally tasks that would work very well with the DMA model (no cache, direct access to main memory).

But I don't believe they'll use a Tempest engine 2 for INT8 processing (the CUs would need too many modifications, essentially becoming tensor cores). I think it would be simpler for them to use the new RDNA 4 WMMA units. This is the kind of thing Sony would absolutely do: use features not yet released in the PC ecosystem. They did it with PS4 Pro. And they should also do it, allegedly, with the new RDNA 4 RT hardware.


View: https://twitter.com/Kepler_L2/status/1768590437206339657

Kepler is probably wrong about that; the documents say the hardware solution is fully custom, not semi-custom (which would be the case if they had just backported an AMD feature). They have designed their own solution for ML acceleration.
 

bitcloudrzr

Member
May 31, 2018
14,304
And the compromises they keep making to hit 60fps are getting really rough. Like the 432p on Aveum, or bizarrely using FSR 1 or freaking nearest-neighbor upscaling from 720p with FF.
Although the people preferring performance mode will likely still choose it for a solid 60fps even with reduced image quality, there are plenty of current-gen-only games that do not suffer from this. At least they are using upscaling that is better than FSR 1 or whichever technique was in FF7R-2.