Nvidia decided that the emerging tech generating huge profits, specifically hardware-accelerated AI/deep learning, was where to spend this gen's new transistor budget. They had to dress it up a bit to make it more appealing to gamers, with obviously mixed-to-disappointing results so far.
Each gen, given time for engineers to improve efficiency and new process tech to increase the number of transistors per SKU, you get a fairly predictable increase in raw performance, although you can swing high or low depending on other factors. Say you have to go with a large die and aggressive clocks one gen to maintain competitive footing. Then a process shrink comes along and you have less pressure from the competition. You can choose to do a more direct die shrink without a huge increase in transistor count. Boom, you have a more modest performance leap but a cooler, less power-hungry lineup, and more dies per wafer (rough math on that below). Alternatively you can go big in the other direction, which makes things more expensive, lowers yields, and is tougher to power and cool, but maximizes potential performance.
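To make the "more dies per wafer" point concrete, here's a quick sketch using the standard gross-die approximation. The die areas are made-up illustrative numbers, and it ignores yield, scribe lines, and edge exclusion, so treat it as back-of-envelope only:

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300) -> int:
    """Rough gross-die estimate: wafer area / die area, minus edge losses."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

big_die = 470     # mm^2, hypothetical large die on the old process
shrunk_die = 300  # mm^2, same design straight-shrunk onto a denser process

print(dies_per_wafer(big_die))     # ~119 candidate dies per 300 mm wafer
print(dies_per_wafer(shrunk_die))  # ~197 candidate dies per 300 mm wafer
```

Same design, noticeably more chips per wafer, which is where the cost and supply upside of the "modest shrink" option comes from.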
A hypothetical 2060/2070/2080/2080 Ti with all transistors put toward traditional architecture would have meant a very large uplift in performance, but something like 40% of the die is tensor/RT hardware. They ARE expensive for what you get, especially considering that the RTX portions are useless in most consumer scenarios, but the die sizes and transistor counts really are huge. Pascal dies were pretty small and efficient at each tier; Turing dies are enormous by comparison, just with almost all of the additional transistor budget spent on stuff that doesn't get used all that much.
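A very rough back-of-envelope on that, using the ~40% share claimed above (the real split is debated, this is not an official breakdown) and the public transistor counts for the Pascal and Turing flagship dies, and pretending performance potential scales with "traditional" transistors, which is obviously a big simplification:

```python
# Back-of-envelope only. Transistor counts are the public figures for the
# flagship dies; the 40% RT/tensor share is the rough estimate from this post.
gp102 = 11.8e9          # Pascal flagship die (1080 Ti class), transistors
tu102 = 18.6e9          # Turing flagship die (2080 Ti class), transistors
rt_tensor_share = 0.40  # assumed share of the Turing die spent on RT/tensor

traditional = tu102 * (1 - rt_tensor_share)

print(f"all-traditional Turing vs Pascal: {tu102 / gp102:.2f}x budget")   # ~1.58x
print(f"traditional share only vs Pascal: {traditional / gp102:.2f}x")    # ~0.95x
```

On those assumptions, essentially all of the transistor growth over Pascal went to the RT/tensor side, which is the gap between the Turing we got and the hypothetical all-traditional one.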
It's actually kind of impressive that it wasn't more of a disaster, but that's the basic reason we saw what we did going from 10xx to 20xx.
The good news is that the next gen should be back to normal increases from new process tech, faster memory, and architecture tuning, assuming a similar percentage of the die goes to the tensor and RT portions.
Alternatively, a hypothetical 30xx series that fully abandoned RT/tensor hardware would see a monumental uplift in performance, but that has roughly a 0% chance of happening.
Personally, I feel it was too early to dedicate die space to tensor/RT stuff. Even my 2080 Ti is only meh in RT applications, so it feels like a bit of a waste. But the gen that started the implementation was always going to be the roughest one, so at least that's out of the way.
At 5 or 6 nm EUV in 2020 and beyond, we should see a substantial improvement in performance across the board. Whether that means ray tracing, or especially DLSS, amounts to anything worthwhile remains to be seen, however.