
Utherellus

Banned
Mar 31, 2020
181
No, it requires the developer to implement it. It requires access to certain screen information that is not really accessed at the driver level.

Sad to hear. But most AAA developers should make implementing it a priority, because DLSS simply boosts performance and image quality, so they potentially get more buyers across a wider spectrum of hardware.
 

Alucardx23

Member
Nov 8, 2017
4,713
No, they're very clearly talking about DirectML running on shader cores and how they were adapted to run low-precision workloads more efficiently. You're zeroing in on "special hardware support" and ignoring points 2 and 3, both of which categorically state that DirectML relies on shader cores.

Again, nobody is disputing that the XSX is capable of DirectML-based reconstruction, but tweaked shader cores don't equal dedicated hardware, as those cores are still general-purpose -- there are no cores whose express purpose is to run DirectML code.

Yes, and machine learning code runs as low-precision workloads. What does the special hardware that was added to the XSX allow? Guess what? It allows it to accelerate low-precision workloads. What do the tensor cores do? Accelerate low-precision workloads.

1- Machine learning code = low-precision workloads.

2- XSX has special hardware to accelerate low-precision workloads.

3- Special hardware to run low-precision workloads = Better performance running machine learning code.

4- Special hardware dedicated to running a specific type of code = dedicated hardware to run that code.

Hope this helps. (A rough sketch of what such a low-precision workload looks like in practice is below.)
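Here's that sketch -- plain Python, no DirectML or any other real API involved, and the symmetric quantization scheme is just a generic assumption -- showing why inference math can run on 8-bit integers:

Code:
# Minimal sketch: symmetric int8 quantization plus an integer dot product.
# The heavy math runs on 8-bit integers; floats only appear as per-tensor scales.
def quantize(values, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1                  # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) for v in values], scale

def int8_dot(weights, activations):
    qw, w_scale = quantize(weights)
    qa, a_scale = quantize(activations)
    acc = sum(w * a for w, a in zip(qw, qa))        # integer multiply-accumulate
    return acc * w_scale * a_scale                  # rescale back to float

w = [0.12, -0.53, 0.98, 0.07]
x = [1.50, 0.25, -0.75, 2.00]
print(int8_dot(w, x), sum(a * b for a, b in zip(w, x)))  # close, but not identical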
 
Feb 28, 2020
57
[image: EUtPt2YWoAE6ewF]
The DLSS version looks better than the native one... and performs better... gg Nvidia.
 

Zedark

Member
Oct 25, 2017
14,719
The Netherlands
I know the difference, and I have even repeated several times that it is likely that this generation's RTX GPUs will have higher performance than the XSX running the same machine learning code. The point was to make clear that the XSX does have specialized hardware to run machine learning code, and this means that Microsoft is turning this into a usable option for developers. For example, let's say that the XSX won't give good results upscaling a 540p image to 1080p due to the low sample count, but it will be able to do a good job upscaling a 1440p image to 4K, enabling developers to increase performance relative to running the same game natively at 4K. We are yet to see what will be possible, and if the improvement from DLSS 1.9 to 2.0 on the same hardware has shown anything, it is that there is room to grow on the software side as well.
Alright, that's fair. It remains to be seen how much of a benefit you can get when you have to divert compute power from the shaders, but that's something neither you nor I can make a clear statement on at this point in time, so we'll see. It would definitely be a boon if rendering at 1440p and upscaling to 4K gives an overall better result with lower performance demands. Another question to ask in the XSX case is whether rendering at a native resolution in between 1440p and 4K (say, 1800p) and using a simpler upscaling technique might work just as well or better: after all, you free the GPU from the cost of the DLSS, so you have extra compute available to beautify the image (which again is counterbalanced by the need to render more pixels). Again, though, I wouldn't know - just spitballing here. In the end, having more options to arrive at a great visual presentation can only be a good thing.
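For a rough sense of the pixel budgets involved in that trade-off (raw pixel counts only, ignoring what the upscaling pass itself costs; a quick Python sketch):

Code:
# Pixel counts for the resolutions discussed above.
resolutions = {
    "1440p": (2560, 1440),
    "1800p": (3200, 1800),
    "4K":    (3840, 2160),
}
pixels = {name: w * h for name, (w, h) in resolutions.items()}
for name, p in pixels.items():
    print(f"{name}: {p / 1e6:.2f} MPix ({p / pixels['4K']:.0%} of native 4K)")
# 1440p: 3.69 MPix (44% of native 4K)
# 1800p: 5.76 MPix (69% of native 4K)
# 4K:    8.29 MPix (100% of native 4K)

So 1800p already costs about two thirds of a native 4K frame before any upscaling, while 1440p is well under half.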
 

Black_Stride

Avenger
Oct 28, 2017
7,389
You have some things incorrect here. RTX 2060 is around 100 TOPS - the XSX is 49 for int8. int4 TOPS is double for both - so around 200 for RTX 2060 and 97 for XSX.
No, it requires the developer to implement it. It requires access to certain screen information that is not really accessed at the driver level.

Is DLSS free for developers to implement, assuming they get approval from Nvidia?
Because if it is, then pretty much every medium-to-large-scale developer should be thinking of getting DLSS support.
The more the AI trains, the better the results... every AAA game going forward better have DLSS, because it's some black magic.



Totally off topic:
How did Conan O'Brien get a feature in Death Stranding yet this guy didn't? vvv
[image: 3WKZjC3b_400x400.jpg]


Hope you're feeling better; great video, BTW.
 

Alucardx23

Member
Nov 8, 2017
4,713
Alright, that's fair. It remains to be seen how much of a benefit you can get when you have to divert compute power from the shaders, but that's something neither you nor I can make a clear statement on at this point in time, so we'll see. It would definitely be a boon if rendering at 1440p and upscaling to 4K gives an overall better result with lower performance demands. Another question to ask in the XSX case is whether rendering at a native resolution in between 1440p and 4K (say, 1800p) and using a simpler upscaling technique might work just as well or better: after all, you free the GPU from the cost of the DLSS, so you have extra compute available to beautify the image (which again is counterbalanced by the need to render more pixels). Again, though, I wouldn't know - just spitballing here. In the end, having more options to arrive at a great visual presentation can only be a good thing.

Yup, we have to wait, but for me that is the only real way next-gen console games will be able to have good performance with ray tracing on. So far, machine learning upscaling is the best solution if you have hardware that accelerates machine learning code. See the video below; it shows several comparisons with regular upscaling techniques like TAAU.

GTC 2020: DLSS 2.0 - Image Reconstruction for Real-time Rendering with Deep Learning

In this talk, Edward Liu from NVIDIA Applied Deep Learning Research delves into the latest research progress on Deep Learning Super Sampling (DLSS), which uses deep learning and the NVIDIA Tensor Cores to reconstruct super sampled frames in real-time. He discusses and demonstrates why scaling...
 
Jan 21, 2019
2,902
First off, that was a fantastic video. Great job Alex.
Second, I am now a true believer in DLSS. I can't imagine not reconstructing from 1440p to 4K. It just seems illogical. However, I have to disagree on the 540p to 1080p upscale. When Alex said something like "it looks better" while Jesse was entering the building, I loudly exclaimed "It does not!". While the still shot that he gave shortly after looked better, in motion the artifacts around Jesse looked very, very bad to me, and I attribute it to the oversharpening. If the sharpening can be reduced by the user, I can imagine it becoming an option.
Thirdly, I wonder if we can DLSS to 8K using 4K as a base and see the results on an 8K screen and a 4K screen. Here's the thing: with next gen looming, I expect 4K to become the standard for mid- to high-end gaming on PC. So if the cards can easily do 4K60, why not go beyond with DLSS? The image quality should be crystal clear on a native panel or downsampled to 4K.

Edit: What I mean by using DLSS on a 4K source is that I want the AA power of DLSS at 4K.
 
Last edited:
Oct 26, 2017
9,859
Sad to hear. But most AAA developers should make implementing it a priority, because DLSS simply boosts performance and image quality, so they potentially get more buyers across a wider spectrum of hardware.

You can expect all games with Nvidia marketing to have DLSS, easily.
Cyberpunk 2077, Dying Light 2, Bloodlines 2 for example.
 

Utherellus

Banned
Mar 31, 2020
181
You can expect all games with Nvidia marketing to have DLSS, easily.
Cyberpunk 2077, Dying Light 2, Bloodlines 2 for example.

Those games, sure. But it will still be a small list compared to the colossal number of AAA projects out there.

Let's just hope for the best. This is an astonishing piece of technology, and there should be no excuse from either Nvidia or devs for not implementing it in every second or third game.
 

GrrImAFridge

ONE THOUSAND DOLLARYDOOS
Member
Oct 25, 2017
9,675
Western Australia
Yes, and machine learning code runs as low-precision workloads. What does the special hardware that was added to the XSX allow? Guess what? It allows it to accelerate low-precision workloads. What do the tensor cores do? Run low-precision workloads.

1- Machine learning code = low-precision workloads.

2- XSX has special hardware to accelerate low-precision workloads.

3- Special hardware to run low-precision workloads = Better performance running machine learning code.

4- Special hardware dedicated to running a specific type of code = dedicated hardware to run that code.

Hope this helps.

I think we're talking at cross-purposes here, so maybe this will get us all on the same page: the "special hardware support" refers to all shader cores being tweaked to run 4- and 8-bit operations at that level of precision. It doesn't refer to the XSX having a special set of shader cores whose singular purpose is to run DirectML code. Ergo, while DirectML code can be hardware-accelerated on the XSX, reserving shader cores for this purpose means taking them off the table for other rendering processes, so the XSX doesn't have dedicated hardware a la RTX tensor cores, just general-purpose shader cores that can run low-precision operations more efficiently.

You seem to be under the impression that we've been saying DirectML can't be hardware-accelerated on the XSX, but that's not the case at all. We're just saying it doesn't have hardware expressly dedicated to this particular purpose a la tensor cores on RTX GPUs. Tensor cores are entirely their own thing; they're not a subset of CUDA/RT cores on the GPU that have been earmarked for machine learning operations.
 
Last edited:

orava

Alt Account
Banned
Jun 10, 2019
1,316
This is super impressive, and Nvidia really has done something unique here.

Using ML like this also has the potential to grow into something even more interesting. Imagine if you could play games in different art styles or make a game look like an entirely different game visually. Also, this combined with RTX could be the end solution for actual photoreal graphics: the final photoreal frame would be built from the rendered game graphics. There are already examples of ML techniques used in this manner, where paintings are made to look like they were painted in another artist's style or photoreal images are created from "nothing".
 

Xx 720

Member
Nov 3, 2017
3,920
Curious to see what Nintendo does with Switch 2: potential 4K in docked mode via DLSS 2.0? Hope DLSS becomes standard for PC games; it will help push ray tracing in all games.
 

Zedark

Member
Oct 25, 2017
14,719
The Netherlands
Elfotografoalocado Regarding our previous discussion on whether the tensor cores are being used fully: Alex mentions in the video (the 10:00 mark) that the DLSS performance hit compared to the same resolution with DLSS off is higher on the lower-end RTX2060 than on the RTX2080 Ti. I wonder if that is due to the Tensor cores being unable to produce the upscaling fast enough, and whether that shows that the Tensor cores are being stretched to their limit (i.e., that the algorithm does indeed need the full tensor compute power available).
 

Schlomo

Member
Oct 25, 2017
1,133
Exciting times! Maybe this could also make those 8k TVs interesting, which until now most people (including me) thought of as rather useless.
 

Caesar III

Member
Jan 3, 2018
921
This makes Lockhart such a ridiculously good proposition from MS.

I can see the ads already: budget 4K next-gen gaming for just 299.
MS/Sony consoles are AMD, and this is NVIDIA technology (and powered by NVIDIA-specific hardware, namely the tensor cores). We don't know for sure yet, but the consoles quite possibly do not have the hardware needed to provide the boosts discussed for this technology (or actually for a similar, non-NVIDIA technology that AMD has yet to develop/announce).
What if MS went with both and is using Nvidia for Lockhart? It would make their DX middleware more useful as well.
 

Alucardx23

Member
Nov 8, 2017
4,713
I think we're talking at cross-purposes here, so maybe this will get us all on the same page: the "special hardware support" refers to the shader cores being tweaked to run 4- and 8-bit operations at that level of precision. It doesn't refer to the XSX having a subset of shader cores whose dedicated purpose is to run DirectML code. Ergo, while DirectML code can be hardware-accelerated on the XSX, reserving shader cores for this purpose means taking them off the table for other rendering processes, so the XSX doesn't have dedicated hardware a la RTX tensor cores, just general-purpose shader cores that can run low-precision operations more efficiently.

I remind you this started with you quoting me to say "the XSX does not have dedicated hardware for this purpose, so committing resources to DirectML reconstruction means taking them away from other rendering processes"

I replied with:

"We knew that many inference algorithms need only 8-bit and 4-bit integer positions for weights and the math operations involving those weights comprise the bulk of the performance overhead for those algorithms," says Andrew Goossen. "So we added special hardware support for this specific scenario. The result is that Series X offers 49 TOPS for 8-bit integer operations and 97 TOPS for 4-bit integer operations. Note that the weights are integers, so those are TOPS and not TFLOPs. The net result is that Series X offers unparalleled intelligence for machine learning." Link

Things shouldn't be so complicated. There is no need to move the conversation to "Oh, but the XSX will also have to use shader cores, so the performance won't be the same as the tensor cores." I have already said that the tensor cores will have more performance; no need to go there, as that was never the point. However you want to put it, the XSX does have dedicated hardware to accelerate machine learning code. The minute you admit to this, it invalidates what you said at first: "the XSX does not have dedicated hardware for this purpose". It does have it, and I have no other way to make this more clear than to show you someone from Microsoft itself saying that they added hardware for this purpose. This will be my last reply to you on this.
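As an aside, those TOPS figures are consistent with plain packed-math scaling of the GPU's shader throughput. A quick back-of-envelope check in Python (assuming the commonly quoted ~12.15 TFLOPS FP32 figure for Series X and RDNA-style 4x/8x packed rates for int8/int4; rough numbers, not an official breakdown):

Code:
# Back-of-envelope check of the Goossen numbers, assuming 52 CUs x 64 lanes
# x 2 ops/clock at 1825 MHz (the commonly quoted ~12.15 TFLOPS FP32 figure)
# and packed math running int8 at 4x and int4 at 8x the FP32 rate.
fp32_tflops = 52 * 64 * 2 * 1.825e9 / 1e12
int8_tops = fp32_tflops * 4
int4_tops = fp32_tflops * 8
print(f"FP32: {fp32_tflops:.2f} TFLOPS, int8: {int8_tops:.0f} TOPS, int4: {int4_tops:.0f} TOPS")
# FP32: 12.15 TFLOPS, int8: 49 TOPS, int4: 97 TOPS -- matching the quote above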
 

jett

Community Resettler
Member
Oct 25, 2017
44,659
Much more impressive results than the awfulness showcased in FFXV, that's for sure. They need to get a handle on all that sharpening, though.
Huge difference; one is a blurry mess, but it was funny how people defended it before.


Dumb question, but is this going to be a game-by-game case or is this built in?
As far as I understand, the AI has to be trained specifically for each game.
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,931
Berlin, 'SCHLAND
Much more impressive results than the awfulness showcased in FFXV, that's for sure. They need to get a handle on all that sharpening, though.

As far as I understand, the AI has to be trained specifically for each game.
DLSS 2.0 actually does not require per-game training anymore. It is game-agnostic, and you, as a dev, slot it right in where TAA would be in your graphics pipeline.
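Conceptually it ends up looking something like this -- a simplified, runnable Python sketch of the frame ordering, with made-up stage names rather than any particular engine's API:

Code:
# DLSS 2.0 and TAA occupy the same slot and consume the same inputs
# (jittered colour, motion vectors, history from previous frames).
def frame_stages(use_dlss):
    stages = ["render g-buffer + lighting "
              + ("(jittered, reduced resolution)" if use_dlss else "(native resolution)")]
    stages.append("DLSS 2.0: reconstruct to output resolution" if use_dlss
                  else "TAA: resolve at native resolution")
    stages.append("post-processing (tonemap, bloom, UI)")
    return stages

print(frame_stages(use_dlss=True))
print(frame_stages(use_dlss=False))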
 

Iron Eddie

Banned
Nov 25, 2019
9,812
Much more impressive results than the awfulness showcased in FFXV, that's for sure. They need to get a handle on all that sharpening, though.

As far as I understand, the AI has to be trained specifically for each game.
Hopefully we get a lot of support and not just a trickle like raytracing so far.

DLSS 2.0 actually does not require per-game training anymore. It is game-agnostic, and you, as a dev, slot it right in where TAA would be in your graphics pipeline.
Sounds good, thanks
 

Lump

One Winged Slayer
Member
Oct 25, 2017
16,035
DLSS 1.9 felt like a compromise that made sense. Sure it looked weird, but it was understandable.

DLSS 2.0 is some goddamn wizardry and doesn't feel fair.
 
Oct 26, 2017
9,859
Those games, sure. But it will still be a small list compared to the colossal number of AAA projects out there.

Let's just hope for the best. This is an astonishing piece of technology, and there should be no excuse from either Nvidia or devs for not implementing it in every second or third game.

These are just three examples; I'm pretty sure Nvidia will make several deals with various publishers to push both their new GPUs and this new tech.
 

Zedark

Member
Oct 25, 2017
14,719
The Netherlands
Hopefully we get a lot of support and not just a trickle like raytracing so far.
Ray tracing should get a lot more traction with the next gen consoles having native support (and presumably sufficient compute power) for it.

DLSS 2.0, in turn, sounds like a much easier thing to add, as it slots into a stage in your rendering pipeline. For ray tracing, you needed to modify your pipeline to include a non-rasterisation-based technique. That seems like quite a bit more work than what you would need to do for DLSS 2.0. And even if it doesn't blow up out of the gate, a Switch 2 with support for it could give it the traction it needs.
 

GrrImAFridge

ONE THOUSAND DOLLARYDOOS
Member
Oct 25, 2017
9,675
Western Australia
I remind you this started with you quoting me to say "the XSX does not have dedicated hardware for this purpose, so committing resources to DirectML reconstruction means taking them away from other rendering processes"

I replied with:

"We knew that many inference algorithms need only 8-bit and 4-bit integer positions for weights and the math operations involving those weights comprise the bulk of the performance overhead for those algorithms," says Andrew Goossen. "So we added special hardware support for this specific scenario. The result is that Series X offers 49 TOPS for 8-bit integer operations and 97 TOPS for 4-bit integer operations. Note that the weights are integers, so those are TOPS and not TFLOPs. The net result is that Series X offers unparalleled intelligence for machine learning." Link

Things shouldn't be so complicated. There is no need to move the conversation to "Oh, but the XSX will also have to use shader cores, so the performance won't be the same as the tensor cores." I have already said that the tensor cores will have more performance; no need to go there, as that was never the point. However you want to put it, the XSX does have dedicated hardware to accelerate machine learning code. The minute you admit to this, it invalidates what you said at first: "the XSX does not have dedicated hardware for this purpose". It does have it, and I have no other way to make this more clear than to show you someone from Microsoft itself saying that they added hardware for this purpose. This will be my last reply to you on this.

I think I see what's happened here: you misunderstood what we meant by "dedicated hardware". It doesn't refer simply to hardware that is suited for a particular purpose, such as the XSX's tweaked shader cores, but rather hardware that is expressly and singularly designed for and aimed at -- i.e. dedicated to -- that purpose, like the tensor cores on RTX GPUs. While developers can, of course, dedicate shader cores to DirectML operations (which I presume is the reason for the confusion), this doesn't make them dedicated hardware since they're a shared resource -- reserving a subset of them for DirectML means you can't use them elsewhere in the rendering process. This isn't a problem with the likes of tensor cores, as they're entirely their own thing -- that is, dedicated hardware.

At no point was anybody trying to say that the XSX GPU isn't designed to accelerate DirectML operations, just that it doesn't have hardware completely and utterly dedicated to this one specific purpose a la RTX GPUs and their tensor cores.

Edit: Perhaps Dictator's comment will help:
I think you are right on the money there. Tensor cores are not running concurrently with other work, so their advantage is that they take up less die space for a lot more performance in that machine learning arena.

The RTX 2060 is using a small percentage of its entire GPU to enable more than 2x the performance the XSX gets from its entire GPU.
 
Last edited:
OP
ILikeFeet

DF Deet Master
Banned
Oct 25, 2017
61,987
Elfotografoalocado Regarding our previous discussion on whether the tensor cores are being used fully: Alex mentions in the video (the 10:00 mark) that the DLSS performance hit compared to the same resolution with DLSS off is higher on the lower-end RTX2060 than on the RTX2080 Ti. I wonder if that is due to the Tensor cores being unable to produce the upscaling fast enough, and whether that shows that the Tensor cores are being stretched to their limit (i.e., that the algorithm does indeed need the full tensor compute power available).
doesn't seem to have anything to do with speed, but because it doesn't run concurrently (upscaling occurs right before the post-processing), there will always be a penalty


[image: i7W51i9.png]
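In frame-time terms the penalty is easy to see with made-up numbers (purely illustrative Python, nothing measured, and it assumes render cost scales with pixel count, which is only roughly true):

Code:
# Because the reconstruction pass runs serially, its fixed cost only pays off
# if rendering fewer pixels saves more time than the pass adds back.
native_ms = 16.0        # hypothetical cost of rendering native 4K
render_scale = 0.25     # 1080p input is 25% of the 4K pixel count
dlss_ms = 1.5           # hypothetical serial cost of the reconstruction pass
upscaled_ms = native_ms * render_scale + dlss_ms
print(f"native 4K: {native_ms:.1f} ms, 1080p + reconstruction: {upscaled_ms:.1f} ms")
# A slower low-precision path just makes dlss_ms bigger, shrinking the win.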
 

lightchris

Member
Oct 27, 2017
680
Germany
I think it's very impressive what they are able to do with DLSS 2.0. Especially compared to their earlier efforts.


The DLSS version looks better than the native one... and performs better... gg Nvidia.

Better than native is not something I agree with. Looking at that image, there are just too many artifacts, which makes it less pleasing to the eye (especially in motion). As an example:

[image: dlssv7jup.jpg]



In general the solution tends to oversharpen things a bit. This is something that could be tweaked though.

There are some pretty good comparisons at PCGames Hardware (German) too.
 

Zedark

Member
Oct 25, 2017
14,719
The Netherlands
doesn't seem to have anything to do with speed, but because it doesn't run concurrently (upscaling occurs right before the post-processing), there will always be a penalty


[image: i7W51i9.png]
That's interesting (though not unexpected when you think about it). What does that mean for the need for dedicated (separate) hardware, though? If the ALUs wait for DLSS 2.0 to finish, then they might as well be used as the engine behind the computation of the algorithm, right? Would that mean that the only difference between having the tensor cores and not having them is that the tensor cores provide more TOPS compute power? Or is there some other parallel task that the ALUs can go do?
 

Deleted member 22585

User requested account closure
Banned
Oct 28, 2017
4,519
EU
Wow seriously impressed. The 30xx series will be bonkers with these kinds of techniques alongside the increased power.
Can't wait.
 

Dictator

Digital Foundry
Verified
Oct 26, 2017
4,931
Berlin, 'SCHLAND
That's interesting (though not unexpected when you think about it). What does that mean for the need for dedicated (separate) hardware, though? If the ALUs wait for DLSS 2.0 to finish, then they might as well be used as the engine behind the computation of the algorithm, right? Would that mean that the only difference between having the tensor cores and not having them is that the tensor cores provide more TOPS compute power? Or is there some other parallel task that the ALUs can go do?
I think you are right on the money there. Tensor cores are not running concurrently with other work, so their advantage is that they take up less die space for a lot more performance in that machine learning arena.

The RTX 2060 is using a small percentage of its entire GPU to enable more than 2x the performance the XSX gets from its entire GPU.
I think it's very impressive what they are able to do with DLSS 2.0. Especially compared to their earlier efforts.




Better than native is not something I agree with. Looking at that image, there are just too many artifacts, which makes it less pleasing to the eye (especially in motion). As an example:

[image: dlssv7jup.jpg]



In general the solution tends to oversharpen things a bit. This is something that could be tweaked though.

There are some pretty good comparisons at PCGames Hardware (German) too.

I 100% agree about the sharpness on the edges of objects, but I think inner surface detail is better than native for a surprisingly large portion of the image (since TAA is doing over-averaging and not representing the colours correctly). I think the ability to unsharpen the image reconstruction that the SDK version of DLSS has would be a great addition, letting more people tailor it the way they like.

One thing DLSS does much better than native in every example from the GTC presentation and from my Control video is how it deals with transparencies. It has much more stability in motion for transparencies and even renders them at higher detail. And it also does not exhibit the ghosting that TAA produces on transparency effects.
 

Zojirushi

Member
Oct 26, 2017
3,298
I think it's very impressive what they are able to do with DLSS 2.0. Especially compared to their earlier efforts.




Better than native is not something I agree with. Looking at that image, there are just too many artifacts, which makes it less pleasing to the eye (especially in motion). As an example:

[image: dlssv7jup.jpg]



In general the solution tends to oversharpen things a bit. This is something that could be tweaked though.

There are some pretty good comparisons at PCGames Hardware (German) too.

Can't you just use TAA on top of the DLSS to smoothen things out?
 

Raide

Banned
Oct 31, 2017
16,596
Interested to see the comparisons DF will do between this and the MS AI stuff.
 

Zedark

Member
Oct 25, 2017
14,719
The Netherlands
I think you are right on the money there. Tensor cores are not running concurrently with other work, so their advantage is that they take up less die space for a lot more performance in that machine learning arena.

The RTX 2060 is using a small percentage of its entire GPU to enable more than 2x the performance the XSX gets from its entire GPU.
I see. In that case, the math behind the XSX using something like DLSS 2.0 gets very interesting. It would then start to depend on how quickly it can run the algorithm with its 49 TOPS of compute power: if that makes rendering at native 4K as fast as or faster than rendering at 1080p and then doing DLSS, then the XSX would not benefit from this technology. I guess it depends on how much time is needed to perform the DLSS 2.0 algorithm with 49 TOPS of compute power available, as opposed to the 220 TOPS that the 2080 Ti has, and how much time is left within a frame quantum for native rendering.
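As a very rough sketch of that (the 1.5 ms figure is a placeholder I'm assuming, not a measured number, and scaling linearly by TOPS is obviously a simplification):

Code:
# If a fixed reconstruction pass costs t ms at 220 TOPS, doing the same work
# at 49 TOPS on the shaders would take roughly t * 220 / 49 ms of the frame --
# and those are the same shaders that render everything else.
tensor_tops, xsx_tops = 220, 49
pass_ms_on_2080ti = 1.5                      # placeholder assumption
pass_ms_on_xsx = pass_ms_on_2080ti * tensor_tops / xsx_tops
frame_ms = 1000 / 60
print(f"~{pass_ms_on_xsx:.1f} ms of a {frame_ms:.1f} ms frame "
      f"({pass_ms_on_xsx / frame_ms:.0%})")
# ~6.7 ms of a 16.7 ms frame (40%)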
 

Hey Please

Avenger
Oct 31, 2017
22,824
Not America
The portion of the video where we see Jesse walk in through the door of the Bureau showed a lot of motion artifacting in the 540p -> 1080p DLSS conversion. Using 1080p as the base resolution worked wonders, though (makes sense given the algorithm has more data to work with).

It will be very interesting to see whether any new and equivalent upscaling technique is seen over the next 7 years on next gen systems.
 

Fafalada

Member
Oct 27, 2017
3,067
4- Special hardware dedicated to run an specific type of code = dedicated hardware to run that code.
"Dedicated hw" goes into semantics and grammar.
People are trying to differentiate between discrete processing units (like tensor cores) and extensions to the SIMD execution units that process 4/8/16 elements of 32/16/8-bit precision in parallel.
To be more specific: the PS2, the original Xbox, the PS3, and the 360 all had hardware-accelerated support for the latter, but it didn't get any label beyond SIMD acceleration hardware.
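For the packed-SIMD flavour, here's a toy Python simulation of what a single packed-int8 dot-product instruction does with one 32-bit chunk of a SIMD register (illustrative only; on real hardware this is one instruction, not a loop):

Code:
# Four signed 8-bit values live in one 32-bit word; one instruction does
# four multiplies plus an accumulate.
def pack_int8x4(vals):
    word = 0
    for i, v in enumerate(vals):          # vals must be in [-128, 127]
        word |= (v & 0xFF) << (8 * i)
    return word

def unpack_int8x4(word):
    out = []
    for i in range(4):
        b = (word >> (8 * i)) & 0xFF
        out.append(b - 256 if b >= 128 else b)
    return out

def dp4a(a_word, b_word, acc=0):
    return acc + sum(x * y for x, y in zip(unpack_int8x4(a_word), unpack_int8x4(b_word)))

a, b = pack_int8x4([1, -2, 3, 4]), pack_int8x4([5, 6, -7, 8])
print(dp4a(a, b))  # 1*5 + (-2)*6 + 3*(-7) + 4*8 = 4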
 

Deleted member 46489

User requested account closure
Banned
Aug 7, 2018
1,979
We still do not know if PS5 supports increased rate int8.
XSX does.
XSX is like 49 int8 TOPS - RTX 2080 Ti is 220 at ca. 1500 MHz (most 2080 Tis run at 1800-1950 MHz in real life, though).
The RTX 2060 is using a small percentage of its entire GPU to enable more than 2x the performance the XSX gets from its entire GPU.
Based on what you've said, do you think that if the XSX uses similar tech, it will basically be used to run games at a dynamic resolution and upscale them to 4K (instead of doing crazy jumps like 1080p to 4K)? That seems to make the most sense. It should allow them to maintain a steady FPS without compromising on image quality.
 

Fafalada

Member
Oct 27, 2017
3,067
so XSX will support this and PS5 might but is unknown?
It hasn't been mentioned (accelerated int processing, not DLSS), though given that it's a basic RDNA feature and PS adopted double-rate FP16 with the Pro, it seems more likely than not.
The one I'm more curious about is VRS, given MS has a patent for something there, and Pro had its own early approach that Sony has a patent for.
 

eebster

Banned
Nov 2, 2017
1,596
Is there even any point in buying an AMD GPU until they implement something that performs similarly well? You'll have the 3060 performing better than the highest-end AMD card. With better RT too, probably.