
Lukas Taves

Banned
Oct 28, 2017
5,713
Brazil
Yes, but DLSS 2.0 is Nvidia's exclusive tech. DLSS won't be on AMD consoles.
Ok, they still won't get DLSS 2.0 since consoles are using AMD tech.

It remains to be seen if AMD has an equivalent tech, but they have not announced anything so far.
They won't be getting DLSS, but there's nothing preventing MS and Sony from coming up with their own solutions; in fact, MS is already demoing one implementation that can increase resolution from 540p to 1080p, and from 1080p to 4K.

Not when you know you're always using said dedicated cores (which is the case here). Dedicated hardware always in use is always superior.
Actually, the tensor cores were not being utilized until now. I also thought they were used before, but that is not the case.

And you are mixing that up with hardware acceleration, which the consoles (or at least SX and Lockhart) will have. There's no reason for dedicated hardware to inherently perform better than other hardware with the same hardware instructions, save for the number of processors available.

And yes, the tensor cores in RTX have a higher number of processors than what SX will have, but that does not mean a similar tech cannot be used; heck, we don't even know how much this implementation is even using.

You're forgetting that the tensor cores are much, much more efficient at doing these kinds of ML workloads than shader cores. The amount of die space taken up is not relevant in this case. And they don't take up as much die space as you think. I believe the number was 7%, could be wrong though.
That's precisely why they added hardware instructions, so they are inherently the same. The number of processors makes a difference, and RTX GPUs do have more processors to handle that, but the counter-argument is that on consoles the space dedicated for that is also used for other tasks.

Have you read the full quote yourself? That's referring to the shader cores being able to run 8-bit and 4-bit operations at that level of precision, which is why the TOPS figures are what they are.



Again, that's not what the quote is saying. No hardware was added; the shader cores were tweaked to run low-precision workloads more efficiently, not unlike "Rapid Packed Math" (AMD's buzzword for using a single FP32 operation to run two FP16 operations).
That's exactly what the hardware they added allows. They added support for dealing with int matrix math, and support to perform multiple of these operations at once, since an int is a lot smaller than a full FP32 number.

And I'm not basing this solely on this quote; I'm talking about the GDC presentation where MS first said that this would be the approach they would take and talked about the performance considerations. With the modifications they made, the hardware IS able to use hardware instructions to deal with machine learning workloads.

And yes, the sheer performance won't match that of a 2080 Ti, as they have significantly more processors to handle that than what SX has.
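
For anyone wondering what "packing multiple int operations into one lane" means in practice, here is a minimal Python sketch of the semantics of a dp4a-style packed INT8 dot product. This is purely illustrative: the function names are made up and this is not the actual RDNA/XSX instruction set, just the idea of four 8-bit values riding in one 32-bit word and being multiply-accumulated in a single step.

Code:
def pack_int8x4(values):
    """Pack four signed 8-bit ints into one 32-bit word (illustrative only)."""
    assert len(values) == 4 and all(-128 <= v <= 127 for v in values)
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word

def _s8(byte):
    """Reinterpret an unsigned byte (0-255) as a signed 8-bit value."""
    return byte - 256 if byte >= 128 else byte

def dp4a(word_a, word_b, acc=0):
    """Dot product of two packed INT8x4 words plus an accumulator.
    Hardware does this in one instruction per 32-bit lane; this only shows the semantics."""
    for i in range(4):
        acc += _s8((word_a >> (8 * i)) & 0xFF) * _s8((word_b >> (8 * i)) & 0xFF)
    return acc

print(dp4a(pack_int8x4([1, -2, 3, 4]), pack_int8x4([5, 6, -7, 8])))  # 1*5 - 2*6 - 3*7 + 4*8 = 4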
We still do not know if PS5 supports increased-rate INT8.
XSX does.
XSX is at about 49 INT8 TOPS - the RTX 2080 Ti is 220 at ca. 1500 MHz (most 2080 Tis run at 1800-1950 MHz in real life, though).
Damn, why doesn't Sony just confirm what this hardware has or doesn't have? XD

And do you know how many TOPS a 2060 has? Since DLSS is able to run on it, I assume this implementation won't need anywhere near the sheer processing power of a 2080 Ti.
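
For reference, the Series X figures quoted in this thread (49 INT8 / 97 INT4 TOPS) fall straight out of that packing factor. A quick back-of-the-envelope check, assuming the publicly quoted ~12.15 TFLOPS FP32 figure for Series X; this is just arithmetic, not an official derivation:

Code:
fp32_tflops = 12.15           # Series X peak FP32 (publicly quoted figure)
int8_tops = fp32_tflops * 4   # four INT8 values per 32-bit lane
int4_tops = fp32_tflops * 8   # eight INT4 values per 32-bit lane
print(f"INT8 ~{int8_tops:.0f} TOPS, INT4 ~{int4_tops:.0f} TOPS")  # ~49 and ~97 TOPS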

The consoles may have mixed precision support in their shader ALUs, but DLSS here is running on tensor cores. So on the consoles, without further hardware, they'll be trading native rendering performance against reconstruction performance/time. Whether you wind up with a net benefit to performance as in DLSS would be an open question. It certainly wouldn't be as beneficial.

I also wonder how useful lower-precision support in shader ALUs will be for image reconstruction, below FP16 anyway, before you start introducing new problems. Not sure what the answer is there.

I think it's more likely we will see different reconstruction techniques on the consoles, ones that are lighter at processing time than a DLSS-alike running on shader ALUs would be, but better than what we had before on consoles. Maybe even using some neural network processing, but trading generality (as in DLSS) for lower overhead if running on shaders.
Consoles would have to trade ALU for machine learning tasks, but ALU is usually cheap, since pretty much everything is more bandwidth-bound than ALU-bound. Especially if you are already targeting a lower resolution, the bandwidth gains from that alone will likely more than compensate for the ALU you lose.
 

Lukas Taves

Banned
Oct 28, 2017
5,713
Brazil
This is an Nvidia technology running on dedicated hardware. Consoles won't have DLSS.
But this is machine learning; Nvidia is hardly the only one researching how to upscale images using ML, and there are already open-source implementations.

And with MS already researching and demoing something similar, I'd say it's a given they will have an equivalent solution for both LH and SX, even if not as good (but what they demoed already deals with sub-pixel details, so it will likely be good enough).
 

Cow Mengde

Member
Oct 26, 2017
12,730
I...I've seen the future and it is 540p!

That's crazy! Here I am, glad that I bought a 1660 Super because I didn't need ray tracing or DLSS. I still don't care for RT, but DLSS would be amazing! It would allow me to use my GPU for far longer than normal. Now I'll have to wait for the 2060 Ti or 2060 Super to drop in price.
 

Zojirushi

Member
Oct 26, 2017
3,300
Is there even any point in buying an AMD GPU until they implement something that performs similarly well? You'll have the 3060 performing better than the highest-end AMD card. With better RT too, probably.

It's Nvidia, so maybe you'll also have the 3060 costing more than the highest-end AMD card lol
 

Alucardx23

Member
Nov 8, 2017
4,713
"Dedicated hw" goes into semantics and grammar.
People are trying to differentiate between discrete processing units (like Tensor cores) and extensions to the SIMD execution units to process 4/8/16 of 32/16/8 bit precision types in parallel.
To be more specific - PS2, Xbox (original), PS3, and 360 all had hardware-accelerated support for the latter - but it didn't get additional labels other than SIMD acceleration hardware.

There is no discussion there. I couldn't have been more specific about the intention of the dedicated hardware that was added by Microsoft and how it works in conjunction with the compute units. I provided the quotes and the links that make this clear for everyone to read. This started because someone said "this won't be on consoles" in the context of machine learning upscaling being used on next-gen console games. I made clear that this should be happening at least on the XSX, because Microsoft made the relevant hardware changes in order to make it possible. This says nothing about the performance relative to the RTX GPUs and how they handle their dedicated hardware solution.
 

Zojirushi

Member
Oct 26, 2017
3,300
Someone remind me, how were Nvidia-exclusive graphics features (HBAO+ for example) handled in past titles that had AMD marketing on PC? Were they purposefully left out?
 

Vimto

Member
Oct 29, 2017
3,714
I...I've seen the future and it is 540p!

That's crazy! Here I am, glad that I bought a 1660 Super because I didn't need ray tracing or DLSS. I still don't care for RT, but DLSS would be amazing! It would allow me to use my GPU for far longer than normal. Now I'll have to wait for the 2060 Ti or 2060 Super to drop in price.
The 1660 doesn't support DLSS.
 

Deleted member 20297

User requested account closure
Banned
Oct 28, 2017
6,943
I love all these new technologies that influence the final image we see on screen, and the whole topic will be interesting to watch evolve in the future for sure. CB rendering and hardware upscalers were only the beginning, I feel.
 

Weeniekuns

Member
Oct 27, 2017
1,111
Loving my Nvidia Shield... it pretty much does the same thing to all my 480p/720p/1080p movies and media.

 

KKRT

Member
Oct 27, 2017
1,544
Is there even any point in buying an AMD GPU until they implement something that performs similarly well? You'll have the 3060 performing better than the highest-end AMD card. With better RT too, probably.
Not really, unless DLSS won't be adopted heavily. And going by DLSS 2.0 no longer needing to be trained per game, it slotting in as a 'replacement' for a game's TAA implementation, and UE4 having a beta branch with a DLSS implementation, it will be adopted by a lot of games.
AMD's GPU division will be in a really tough spot from now on.
 

Vimto

Member
Oct 29, 2017
3,714
I love how people are saying AMD is in danger before Ampere / Big Navi have even been revealed.

It's too early for that.
 

Trup1aya

Literally a train safety expert
Member
Oct 25, 2017
21,401
This is amazing tech. I can't believe how quickly things are improving.
 

WRC

Member
Oct 28, 2017
144
fyi

[imgur.com: DLSS execution cost graph]

So if the 2080 Ti has ~220 TOPS, the RTX 2060 has ~100 TOPS, and the XSX 49 TOPS, then according to this graph the RTX 2060 will do roughly ~2.83 ms @ 4K; hypothetically, the XSX with the same algorithm should cost about 5.5 ms @ 4K.
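
The rough math behind that estimate, assuming the reconstruction cost scales linearly with INT8 TOPS (a big assumption, but it is the one this kind of graph comparison implies):

Code:
tops_2060, tops_xsx = 100, 49    # rough INT8 TOPS figures used in this thread
cost_2060_ms = 2.83              # ~4K cost read off the graph for the RTX 2060
cost_xsx_ms = cost_2060_ms * (tops_2060 / tops_xsx)
print(f"~{cost_xsx_ms:.1f} ms")  # ~5.8 ms, the same ballpark as the ~5.5 ms guess above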
 

Fafalada

Member
Oct 27, 2017
3,074
the intention of the dedicated hardware that was added by Microsoft and how it works in conjunction with the compute units.
My point was there's no dedicated hw working in conjunction with the compute units on the XSX. It extends the SIMD execution units to support multiple data-widths, i.e. it's a building block 'of' the CUs if you prefer.

This started because someone said "this won't be on consoles" in the context of machine learning upscaling being used on next-gen console games.
I think the question mark is more on DLSS itself; Nvidia doesn't have a history of sharing this type of tech. MS has clearly demonstrated they are interested in investing in the space, though, so the expectation would be that they'll provide their own approach to it.
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,934
Berlin, 'SCHLAND
fyi

[imgur.com: DLSS execution cost graph]

So if the 2080 Ti has ~220 TOPS, the RTX 2060 has ~100 TOPS, and the XSX 49 TOPS, then according to this graph the RTX 2060 will do roughly ~2.83 ms @ 4K; hypothetically, the XSX with the same algorithm should cost about 5.5 ms @ 4K.
Assuming linear scaling, yeah this is what we have basically calculated out for ourselves as well.

5.5 ms btw is still A LOT faster than running the game at 4x the resolution.
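
To put that in frame-budget terms, here is a quick hypothetical comparison. The 12 ms base frame cost is invented for illustration; it simply assumes a GPU-bound game whose cost scales roughly linearly with pixel count:

Code:
base_1080p_ms = 12.0              # made-up cost of rendering a 1080p frame
reconstruction_ms = 5.5           # estimated XSX cost of a DLSS-like pass to 4K
native_4k_ms = base_1080p_ms * 4  # 4K has 4x the pixels of 1080p
reconstructed_ms = base_1080p_ms + reconstruction_ms
print(f"native 4K: {native_4k_ms:.1f} ms (~{1000 / native_4k_ms:.0f} fps)")
print(f"1080p + reconstruction: {reconstructed_ms:.1f} ms (~{1000 / reconstructed_ms:.0f} fps)")
# -> 48.0 ms (~21 fps) vs 17.5 ms (~57 fps) under these assumptions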
 

JahIthBer

Member
Jan 27, 2018
10,389
Not really, unless DLSS won't be adopted heavily. And going by DLSS 2.0 no longer needing to be trained per game, it slotting in as a 'replacement' for a game's TAA implementation, and UE4 having a beta branch with a DLSS implementation, it will be adopted by a lot of games.
AMD's GPU division will be in a really tough spot from now on.
The only problem is AMD-sponsored games won't use Nvidia tech. It's always a bit of a disappointment when AMD sponsors a game like the Resident Evil 2 remake, which desperately needs ray-traced reflections, and instead AMD offers a logo and nothing more. The one cool thing they ever did was TressFX.
 

Muhammad

Member
Mar 6, 2018
187
OK, now you are being dishonest here. Why are you ignoring the part where it says "So we added special hardware support for this specific scenario."? If I ask you whether the XSX has special hardware to accelerate machine learning code, is your answer that it doesn't?
Special hardware means special support for these functions on the CUs of the Series X. NVIDIA has DEDICATED units for them, so they don't need to run them on the CUs.
 

space_nut

Member
Oct 28, 2017
3,306
NJ
Assuming linear scaling, yeah this is what we have basically calculated out for ourselves as well.

5.5 ms btw is still A LOT faster than running the game at 4x the resolution.

That's huge! Devs can push the XSX at 1080p and use ML to render at 4K. Can't wait for more demos like the path tracing on XSX.
 

Edgar

User requested ban
Banned
Oct 29, 2017
7,180
The only problem is AMD-sponsored games won't use Nvidia tech. It's always a bit of a disappointment when AMD sponsors a game like the Resident Evil 2 remake, which desperately needs ray-traced reflections, and instead AMD offers a logo and nothing more. The one cool thing they ever did was TressFX.
But at least TressFX runs on Nvidia GPUs.
 

Stacey

Banned
Feb 8, 2020
4,610
For a minute there I thought I'd mistakenly entered the XSX deep dive thread.

So much speculation.
 

Alucardx23

Member
Nov 8, 2017
4,713
My point was there's no dedicated hw working in conjunction with the compute units on the XSX. It extends the SIMD execution units to support multiple data-widths, i.e. it's a building block 'of' the CUs if you prefer.

Special hardware means special support for these functions on the CUs of the Series X. NVIDIA has DEDICATED units for them, so they don't need to run them on the CUs.

Not sure I understand what you're saying here. This is a physical solution added to the CU in order to allow it to run machine learning code faster. This physical part of the hardware will be active while it's running the machine learning code. Let's not waste more time talking about how tensor cores are a fully dedicated solution to run machine learning code while the modified CUs on the XSX are not. I'm specifically talking about how they were modified and the reason for that. This was not my point at all, and it should be clear after reading what I wrote and the quotes and links I shared.

I think the question mark is more on DLSS itself; Nvidia doesn't have a history of sharing this type of tech. MS has clearly demonstrated they are interested in investing in the space, though, so the expectation would be that they'll provide their own approach to it.

This is not about the name of the solution, but about what the solution actually does and allows. Microsoft has already shown their own machine learning solution with DirectML; how you name it is not relevant to what that person was wishing to see on consoles.
 

Alucardx23

Member
Nov 8, 2017
4,713
The CUs will be occupied running the ML code, while on NVIDIA the CUs will be free to run normal code while the tensor cores run the ML code.

Again, what is the problem here? The CUs were modified physically to accelerate machine learning code. Actual hardware was added to the CU to enable them to accelerate machine learning code.
 

Deleted member 8752

User requested account closure
Banned
Oct 26, 2017
10,122
How does this differ from say... checkerboard rendering? Just curious as to what is actually happening here.
 

Muhammad

Member
Mar 6, 2018
187
Again, what is the problem here? The CUs were modified physically to accelerate machine learning code. This is not a software solution. Actual hardware was added to the CU to enable them to accelerate machine learning code.
No acceleration is done here; they still run on the CUs, which means they consume regular CU TFLOPS available for graphics. NVIDIA's solution, meanwhile, doesn't use the CUs and doesn't consume graphics TFLOPS.
 

Alucardx23

Member
Nov 8, 2017
4,713
No acceleration is done here; they still run on the CUs, which means they consume regular CU TFLOPS available for graphics. NVIDIA's solution, meanwhile, doesn't use the CUs and doesn't consume graphics TFLOPS.

Thank you. I will stop talking to you now.

"We knew that many inference algorithms need only 8-bit and 4-bit integer positions for weights and the math operations involving those weights comprise the bulk of the performance overhead for those algorithms," says Andrew Goossen. "So we added special hardware support for this specific scenario. The result is that Series X offers 49 TOPS for 8-bit integer operations and 97 TOPS for 4-bit integer operations. Note that the weights are integers, so those are TOPS and not TFLOPs. The net result is that Series X offers unparalleled intelligence for machine learning."

www.eurogamer.net - Inside Xbox Series X: the full specs
 

Wumbo64

Banned
Oct 27, 2017
327
Dictator

I have read a couple of articles and watched the DF stuff on DLSS; thanks for the hard work and the good breakdown. I do have some questions, however...

Has Nvidia described how the process of implementing DLSS into your game works? Do you have to apply for a license? Does NVIDIA need your codebase? Does it take a long time? Can this be back-ported to older stuff easily?

This is definitely promising tech, but obviously if it is locked behind some arduous implementation process from Nvidia, it could blunt mass adoption...
 

Muhammad

Member
Mar 6, 2018
187
Thank you. I will stop talking to you now.

"We knew that many inference algorithms need only 8-bit and 4-bit integer positions for weights and the math operations involving those weights comprise the bulk of the performance overhead for those algorithms," says Andrew Goossen. "So we added special hardware support for this specific scenario. The result is that Series X offers 49 TOPS for 8-bit integer operations and 97 TOPS for 4-bit integer operations. Note that the weights are integers, so those are TOPS and not TFLOPs. The net result is that Series X offers unparalleled intelligence for machine learning."
You are obviously less informed on this matter; I will refer you to this quote:

The difference between having separate hardware (NVIDIA's tensor cores) and having hardware support inside the shader cores (XSX approach) is that in the former scenario, you don't have to compromise on GPU compute power for native rendering in order to support a DLSS-like feature, while in XSX case, you need to share resources between native rendering and the upscaling AI algorithm. For example: the RTX2060 has 100 TOPS of tensor core performance. If we assume that we need 25 TOPS to do upscaling from 1080p to 4K, then it can do that on the tensor cores, and use its full 6 TFLOPS of ALU power to render the native 1080p image. The XSX, on the other hand, needs to apply 25 TOPS of shader compute power out of its 49 TOPS total, or roughly half of its GPU, to the DLSS-like algorithm, leaving only half of its GPU for native 1080p rendering. If it takes half the GPU to perform the DLSS algorithm, you lose quite a bit of the potential benefit.
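
Spelling out the accounting in that quote (the 25 TOPS cost of the upscaling pass is the quote's hypothetical number, not a measured DLSS requirement):

Code:
upscale_tops = 25            # hypothetical cost of the upscaling pass, as in the quote

tensor_tops_2060 = 100       # RTX 2060: upscaler runs on dedicated tensor cores
shader_int8_tops_xsx = 49    # XSX: the same shader ALUs serve rendering and ML

tensor_share_2060 = upscale_tops / tensor_tops_2060
shader_share_xsx = upscale_tops / shader_int8_tops_xsx

print(f"2060: {tensor_share_2060:.0%} of tensor capacity, shaders stay free for rendering")
print(f"XSX: {shader_share_xsx:.0%} of total shader throughput goes to the upscaler")
# -> 25% vs ~51%, i.e. the "roughly half of its GPU" in the quote above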
 

JahIthBer

Member
Jan 27, 2018
10,389
Those RDNA2 CUs are going to be pretty busy with ML and ray tracing, huh. Makes me think 52 CUs instead of 36 at high clocks was done for a reason. It's all very interesting.
 

Fafalada

Member
Oct 27, 2017
3,074
This is a physical solution added to the CU
This is really getting into semantics - but extending SIMD execution units for different data-widths isn't adding so much as modifying things. There's a reason SIMD implementations have done this since the late '90s/early '00s - it has a relatively negligible impact on execution unit size/complexity.
But fair enough - it's 'added functionality' so I'll stop the conversation here.
 

GrrImAFridge

ONE THOUSAND DOLLARYDOOS
Member
Oct 25, 2017
9,682
Western Australia
That's exactly what the hardware they added allows. They added support for dealing with int matrix math, and support to perform multiple of these operations at once, since an int is a lot smaller than a full FP32 number.

And I'm not basing this solely on this quote; I'm talking about the GDC presentation where MS first said that this would be the approach they would take and talked about the performance considerations. With the modifications they made, the hardware IS able to use hardware instructions to deal with machine learning workloads.

Adding operational modes that process low-precision workloads more efficiently isn't exactly adding hardware (unless you're referring to the shader cores themselves versus their X1/X counterparts, in which case, sure), but that's beside the point at this juncture. Nobody was claiming the XSX GPU doesn't have the means to accelerate DirectML workloads, just that it doesn't have hardware designed for and aimed squarely at this singular purpose a la tensor cores; the latter is what myself and others who've replied to Alucardx23 were referring to with "dedicated hardware". I'm sure you can agree there's a distinction to make between machine learning-oriented shader cores and a pool of discrete machine learning-oriented cores, and that's what we were trying to illustrate to Alucardx23, as he was conflating the two. That's all. The argument was never "The XSX GPU can't do what the tensor cores do."

Late edit: You may also want to read Liabe Brave's posts here and here (the latter especially). As it turns out, the ability to use an FP32 operation to process 4x INT8 or 8x INT4 isn't something Microsoft added; rather, it's inherent to the RDNA architecture. This means, incontrovertibly and unequivocally, that the XSX GPU isn't unique in that regard: said functionality is also supported by the PS5 GPU, any RDNA2-based GPUs AMD has in R&D, and even last year's RDNA1-based Radeon 5700 XT.
 

Smokey

Member
Oct 25, 2017
4,176
That's the downside of going with AMD.

Nvidia ray tracing + DLSS is going to be the real game changer here.

I really wish Nvidia and MS/Sony would patch up their differences, pricing, or whatever. I have way more confidence in Nvidia with this type of tech than in AMD.
 

Lom1lo

Member
Oct 27, 2017
1,435
Lol not the only one. I'm almost positive that XSX will do 4k natively 99.7% of the time
If you've seen DLSS 2.0, native 4K just feels stupid. It's insane how many resources you save and can put into stuff like ray tracing. It's probably also currently the only way to have ray tracing, high IQ, and good frame rates.
 

gothmog

Member
Oct 28, 2017
2,434
NY
Could these types of techniques help reduce latency in cloud gaming? Like, basically render the game world at a low resolution, stream it to all clients, and then let them upscale it using something like DLSS?
 

Uzzy

Gabe’s little helper
Member
Oct 25, 2017
27,284
Hull, UK
Feels like black magic, and when this gets into all games it'll be an absolute game changer. The improvement in the tech in just under a year makes me very excited to see where it'll be in another year's time, another three years' time.