Is it true that the X1 isn't capable of parallel and asynchronous processing? Pascal minimum confirmed?
https://youtu.be/l5WxKUYglns
https://careers.nintendo.com/job-openings/listing/1900000034.html
This doesn't make any reference to GPU async compute, just "parallel and asynchronous programming", which I'd say is pretty desirable for a GPU driver engineer regardless of whether the GPU itself supports async compute (particularly if you're using a Vulkan-based API, which doesn't restrict you to a single thread as much as, say, OpenGL or DX11 do). A sketch of what that looks like in practice is below.
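To make that concrete, here's a minimal sketch of the pattern Vulkan-style APIs enable: one command pool per thread, so several CPU threads can record command buffers in parallel. This is plain Vulkan rather than NVN (which isn't public), with error handling and the actual draw/dispatch commands stripped out, and it assumes queue family 0 purely for brevity.

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    // Minimal headless setup; a real engine would check every VkResult.
    VkInstanceCreateInfo instInfo{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    VkInstance instance;
    vkCreateInstance(&instInfo, nullptr, &instance);

    uint32_t count = 1;
    VkPhysicalDevice phys;
    vkEnumeratePhysicalDevices(instance, &count, &phys); // take the first GPU

    float priority = 1.0f;
    VkDeviceQueueCreateInfo queueInfo{VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO};
    queueInfo.queueFamilyIndex = 0; // assumed for brevity
    queueInfo.queueCount = 1;
    queueInfo.pQueuePriorities = &priority;

    VkDeviceCreateInfo devInfo{VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO};
    devInfo.queueCreateInfoCount = 1;
    devInfo.pQueueCreateInfos = &queueInfo;
    VkDevice device;
    vkCreateDevice(phys, &devInfo, nullptr, &device);

    // Command pools are externally synchronized, so give each thread its
    // own pool; all four threads can then record command buffers at once.
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([device, t] {
            VkCommandPoolCreateInfo poolInfo{VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO};
            poolInfo.queueFamilyIndex = 0;
            VkCommandPool pool;
            vkCreateCommandPool(device, &poolInfo, nullptr, &pool);

            VkCommandBufferAllocateInfo allocInfo{VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO};
            allocInfo.commandPool = pool;
            allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
            allocInfo.commandBufferCount = 1;
            VkCommandBuffer cmd;
            vkAllocateCommandBuffers(device, &allocInfo, &cmd);

            VkCommandBufferBeginInfo beginInfo{VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
            vkBeginCommandBuffer(cmd, &beginInfo);
            // ... record draws/dispatches/copies for this thread's chunk ...
            vkEndCommandBuffer(cmd);

            std::printf("thread %d finished recording\n", t);
            vkDestroyCommandPool(device, pool, nullptr);
        });
    }
    for (auto &w : workers) w.join();

    vkDestroyDevice(device, nullptr);
    vkDestroyInstance(instance, nullptr);
    return 0;
}
```

The one-pool-per-thread pattern exists precisely because Vulkan makes pools externally synchronized; under GL's single-context model the driver has to serialize this kind of work instead, which is why "parallel and asynchronous programming" shows up in driver job listings regardless of what the GPU's compute queues can do.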
They can't fit RT cores in a handheld package without big concessions
They can, just not enough to do anything useful. Which is a shame, as ray tracing is far better suited to a console development environment where devs can rely on the hardware being there, and tightly optimizing around the configuration they've got. We'll see a Nintendo device with ray tracing at some point down the line, but not any time soon.
Wii U actually did it, but because those consoles built their code at a low level, just above the metal, it required a huge investment and custom hardware. You can drop Turing into a Switch APU with a newer ARM CPU and have no issues at all, because NVN is based on a hardware-agnostic API.
NVN may be based on a hardware-agnostic API, but there could be a lot of hardware-specific extensions built on top of that, some of which may be specific to the Maxwell architecture. Alternatively, there could be parts of game engines which are tightly optimized around particular aspects of the Maxwell architecture, and although Turing may retain compatibility, it couldn't necessarily guarantee performance parity for code optimized for Maxwell (as an example, the way threads are scheduled and dispatched changed significantly from Maxwell/Pascal to Volta/Turing; the sketch below shows the kind of code that change affects).
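To make the scheduling point concrete: Volta/Turing introduced independent thread scheduling, removing the guarantee that a warp's 32 threads execute in lockstep, which is why warp-synchronous idioms from the Maxwell/Pascal era had to move to the explicit *_sync intrinsics. Here's a toy warp reduction (standard CUDA, nothing to do with NVN specifically) written in the post-Volta style:

```cuda
#include <cstdio>

// Warp-level sum reduction in the post-Volta style. Maxwell/Pascal-era
// code often used the old __shfl_down() with no member mask and no
// __syncwarp(), implicitly relying on the warp running in lockstep.
// Volta/Turing's independent thread scheduling removed that guarantee,
// so the *_sync intrinsics with an explicit mask are required.
__global__ void warpSum(const int *in, int *out) {
    int v = in[threadIdx.x];
    const unsigned mask = 0xffffffffu; // all 32 lanes participate
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(mask, v, offset);
    if (threadIdx.x == 0) *out = v; // lane 0 holds the warp's total
}

int main() {
    int h_in[32], h_out = 0, *d_in, *d_out;
    for (int i = 0; i < 32; ++i) h_in[i] = i + 1; // 1..32 sums to 528
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    warpSum<<<1, 32>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    std::printf("warp sum = %d\n", h_out); // expect 528
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The Maxwell-era version of that loop, written without the mask and sync, happened to work because warps ran in lockstep; it's exactly the kind of code that can misbehave or lose performance when moved to Turing unchanged.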
I'm mainly just playing devil's advocate here, and from how Nintendo has approached Switch thus far I feel they've put more thought into forward-compatibility than they would have in the past, so hopefully this shouldn't be an issue.
That's been my thinking for a while: a gaming Tegra and an AI Tegra. I've even joked about calling this Tegra chip the X3.
As I see it, Nvidia has two SoC lines now: Nintendo Tegras and automotive SoCs (note that they don't actually use the Tegra name for the latter anymore). The automotive ones are in effect the AI chips, as they went heavy on tensor cores with Xavier, and I would be very surprised if they don't do the same with Orin.
I don't see Nvidia releasing any new SoCs any time soon that aren't destined for either Nintendo or the automotive world. The real question is whether they stop going after the tablet/HMD/etc. market that the X1 and X2 were targeted at, or whether the new Nintendo chip doubles as an "X3" which they can sell to other customers. If it's the latter, it may well affect the chip Nintendo ends up with, versus one designed exclusively for Nintendo.
Very crazy. Systems tend to ditch BC in later revisions to cut costs and because it's no longer seen as important: PS3, Wii, GBA, DS. Adding it later is, I think, unheard of? Especially in a new, significantly cheaper model.
Ditching BC usually happens when it's implemented with dedicated hardware. With emulation, a more powerful system can enable BC that wouldn't have been possible before, without adding hardware costs, such as SNES games running on the n3DS. Of course, as we've only got NES games available on Switch thus far, it doesn't feel like emulation performance is what's holding Nintendo back from bringing their back catalog to the console.
Larger jumps than what they've done have certainly always been possible. Nintendo revisions tend to focus on feature upgrades, and the architecture revisions are primarily in service of that. I'm not expecting Mariko to be radically different, and my guess would be that the job listing also entails engineering for future devices. Turing is more likely for Switch 2 than for a Switch Plus/Mini, imo.
This has been true in the past (e.g. super-stable 3D on the n3DS), and I have been wondering if there's a particular feature coming to the new Switch which would require a bit of extra performance. But I don't think we should necessarily assume that new Switch devices have to follow Nintendo's past patterns, when the Switch itself didn't follow Nintendo's past patterns. I've said in the past (long before the Switch was revealed) that I expected Nintendo to drop the generational upgrade pattern, and I think that's what's happening here. Instead of one big change every 5-6 years, we're seeing smaller, incremental changes every 2-3 years, allowing Nintendo to maintain the same software ecosystem and customer base over a long period, without the existential risk that comes with a generational refresh. Nintendo won't pursue power for power's sake, but they may push an incremental boost in power now rather than a compatibility-breaking big jump in power in three years' time.
If they were compelled to use 7nm, I think they could have a monster device, but from everything we know, and from what they used in the overclocked Tegra X2, they aren't doing that.
To answer your question honestly, though: the max Nvidia could do for $299 is a 512 CUDA core GPU at 1.5GHz for ~1.5 TFLOPS, with a 6-core A73 at 2GHz and a 4-core A53 at 2GHz, and 8GB of RAM at 137GB/s.
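(That GPU figure checks out if you assume the usual two FLOPs per CUDA core per clock for fused multiply-add: 512 cores × 2 × 1.5GHz ≈ 1.54 TFLOPS.)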
My highest expectations are roughly half of that, simply because Nintendo isn't creating a new platform. Switch is heavily discussed as Nintendo's final platform, one that only gets iterative upgrades and different form factors, and is unlikely to be replaced for at least the next decade.
Another thing holding back this model is that the Switch Mini will likely use the same chip. They're going with 12nm because it's the cheapest process node that still achieves the best performance possible; it's also why I went with the A73. These are the cheapest upgrades they can make; they just also happen to be the most efficient, and a nice step up from the current model. Turing fits the same cost objective too: it's cheaper to use an architecture already proven on a process node than to port an existing architecture to a new one.
LPDDR4(X) with 137GB/s in Switch's form factor just isn't feasible, as it would require 4 RAM chips, and I doubt they'd be able to squeeze them in (not to mention I can't see them doing that inside $299). If you really wanted loads of bandwidth, you'd have to go with LPDDR5 (if it's available in time) or something crazy like HBM.
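(For anyone wondering where the four-chip figure comes from, assuming 64-bit packages: LPDDR4X tops out at 4266MT/s, so each package moves at most 4266 × 8 bytes ≈ 34.1GB/s, and hitting 137GB/s means four of them on a 256-bit bus.)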