In general, the real benefit of RDNA is its flexibility.
Depending on the code, either Wave32 or Wave64 is the better fit.
AMD has used Wave64 since at least the R600 from 2007, while Nvidia uses 32-wide warps (effectively Wave32).
Intel is much more flexible: they have (or had) Wave4x2 (not anymore, iirc), Wave8, Wave16, and Wave32.
From an energy perspective it's more efficient to track one program counter for 64 items and issue them together, rather than as 2x32.
You pay less front-end overhead (fetch, decode, scheduling) per item with larger Wavefronts than with smaller ones; a rough sketch of that arithmetic follows below.
There is also code where larger Wavefronts incur less computational overhead than smaller ones.
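A minimal sketch of the front-end bookkeeping argument (the item and instruction counts are made-up numbers, purely for illustration):

```python
import math

def frontend_fetches(num_items, wave_size, num_instructions):
    # Each wavefront fetches/decodes every instruction once, so fewer,
    # wider wavefronts mean fewer front-end operations for the same work.
    waves = math.ceil(num_items / wave_size)
    return waves * num_instructions

print(frontend_fetches(1024, 64, 100))  # 1600 front-end operations
print(frontend_fetches(1024, 32, 100))  # 3200 front-end operations
```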
On the other hand, large Wavefronts take up more registers and have a longer resource lifetime if the architecture isn't computing all their elements at once.
Branching costs are also (much) higher with larger Wavefronts:
in the worst case only one lane out of 64 does useful work, so unit utilization drops to 1/64, while with Wave32 it is at least 1/32.
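A tiny sketch of that worst case (the single-active-lane scenario is just an assumption for illustration):

```python
def divergent_utilization(wave_size, active_lanes):
    # Inactive lanes are masked off but still occupy SIMD slots,
    # so utilization is the share of lanes doing useful work.
    return active_lanes / wave_size

print(divergent_utilization(64, 1))  # 0.015625 -> 1/64
print(divergent_utilization(32, 1))  # 0.03125  -> 1/32
```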
With RDNA, AMD can pick whichever Wave mode is more efficient for the code at hand.
Compared to GCN, RDNA brings some pros and cons.
For Wave64 the pros are that RDNA can switch to a new Wavefront more quickly, and the register file per SIMD is now 128KB instead of 64KB.
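As a back-of-the-envelope illustration of how the register budget limits how many wavefronts fit on a SIMD (the 64-VGPR shader is a made-up figure, and real hardware allocates registers in coarser granules):

```python
REG_FILE_BYTES = 128 * 1024  # RDNA: 128KB vector register file per SIMD
VGPR_BYTES = 4               # one 32-bit register per lane

def max_waves(wave_size, vgprs_per_thread):
    # Registers one wavefront occupies across all of its lanes.
    per_wave = wave_size * vgprs_per_thread * VGPR_BYTES
    return REG_FILE_BYTES // per_wave

# e.g. a shader needing 64 VGPRs per thread (illustrative number):
print(max_waves(32, 64))  # 16 Wave32 wavefronts fit
print(max_waves(64, 64))  # 8 Wave64 wavefronts fit
```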
A negative aspect is latency: RDNA needs 5 clock cycles between dependent instructions, so you need at least three Wave64 wavefronts in flight to hide that wait.
GCN, by contrast, stays fully occupied with just one Wave64.
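A small sketch of that latency-hiding arithmetic, assuming the figures from above plus a 2-cycle issue for Wave64 on RDNA's 32-wide SIMD and GCN's 4-cycle cadence on its 16-wide SIMD (all simplifications):

```python
import math

def waves_to_hide_latency(dep_latency_cycles, issue_cycles_per_wave):
    # Enough wavefronts so that, cycling round-robin through them, the
    # first wave's previous result is ready when its turn comes again.
    return math.ceil(dep_latency_cycles / issue_cycles_per_wave)

print(waves_to_hide_latency(5, 2))  # RDNA Wave64: 3 wavefronts needed
print(waves_to_hide_latency(5, 1))  # RDNA Wave32: 5 wavefronts needed
print(waves_to_hide_latency(4, 4))  # GCN Wave64: 1 wavefront suffices
```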
But all of that is very low-level stuff and depends on a lot of factors.
Especially for a layman it's hard to say how much of a benefit or disadvantage X vs. Y is in practice.