How is it high bandwidth? Google Assistant tech converts your voice to text, matches it against a list of keywords, filters for what you may be interested in, and sends a concise ad term to Google.
Well, there are really two flavors here: iOS and Android. This is my understanding (sorry for the long post).
iOS does a lot of local processing, but it's not 100%. If the local model fails, it sends the recording to the cloud for processing. It also does this for various chained commands. You can actually see this in real time on an iPhone if you use the microphone feature on the keyboard: certain words that it transcribes will change once you're done dictating. Apple strips all personally identifiable information from each Siri or dictation request because they are not an ad company. They sell high-margin hardware, so they don't give a fuck. It behooves them both in a perception sense (you can trust Apple with your data) and in a legal sense, where they don't want to be on the hook for all sorts of insane legal ramifications of having recordings of everyone with an iOS device.
Google is an ad company, and very, very big on cloud processing as you know, so for them almost everything gets sent off for cloud processing (and I'm assuming Alexa follows this model). So there is necessarily a recording uploaded. The very latest Pixel phones can do offline voice processing, but that's brand new on the latest sets and only in American English, I believe. (Someone correct me if this is wrong.) Which means that on everything else there's no local model beyond the simplified one that knows how to wake up when it hears "OK Google". I'm pretty sure you can't use it in airplane mode, so there's your answer. They very much want that audio data and probably use it to scrape anonymous speech usage data the way they do with Gmail.
But they simply don't need to listen all the damn time, risking the legal entanglements and infrastructure issues, to get what they want. They had already moved past this giant-hammer method before it even became an issue. The algorithms and the connections and the network effects do all the work. So you notice it when it manifests, but it's not because they had to constantly listen and upload. They just don't need that.
Think about pervasive recording... it would rack up people's phone bills like mad while draining their battery faster than usual. How long do you think your phone would last in pure record mode all the time? While uploading? (I can't speak to features like always-on Shazam, that's new to me, but it seems like a bad idea.) And certainly a $25 Echo device is not doing offboard processing beyond the command prompt, at least I don't think so?!
Gods, think about the storage!
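To put rough numbers on the bandwidth and storage point, here's a back-of-the-envelope sketch. The 16 kbps bitrate and the one billion devices are my own illustrative assumptions, not measured figures, so treat the output as an order of magnitude only.

```python
# Rough cost of "always recording and uploading".
# Both constants below are assumptions for illustration, not measurements.
SECONDS_PER_HOUR = 3600
ASSUMED_BITRATE_KBPS = 16          # assumed low-bitrate compressed speech
ASSUMED_DEVICES = 1_000_000_000    # assumed number of phones in the fleet


def continuous_audio_bytes(bitrate_kbps, hours):
    """Bytes produced by recording non-stop at the given bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * hours * SECONDS_PER_HOUR


per_day = continuous_audio_bytes(ASSUMED_BITRATE_KBPS, 24)
per_month = per_day * 30
fleet_per_day = per_day * ASSUMED_DEVICES

print(f"one phone, one day: {per_day / 1e6:.1f} MB")
print(f"one phone, 30 days: {per_month / 1e9:.2f} GB")
print(f"whole fleet, one day: {fleet_per_day / 1e15:.1f} PB")
```

Under those assumptions that's about 173 MB per phone per day and roughly 5 GB a month of upload, which most people would notice on a metered data plan, and on the order of a hundred-plus petabytes of new audio every day on the receiving end.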
There are lots of ways to listen to a user if you just trick them into enabling the microphone. There are also a lot of crappy apps out there that have been busted for trying to pull stuff like this. But that's the key: they were busted. Because ultimately we can see the network traffic, and the packets have destinations, even if they're small. Go ahead and be paranoid, it's warranted, shit is bad. But be smart about where they're getting this info on your personal life before you stuff gum into your mic port (or, like many in here, just throw up your hands and say "fuck it, they're recording, but I guess I must keep using my voice assistant").
(I use Siri for a lot of menial stuff, rarely web lookups, and anecdotally I have noticed a gigantic decrease in the targeted ads I encounter since doing the usual Firefox/DuckDuckGo + scripts shift with my browsing. And I don't use FB.)
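For what "we can see the network traffic" looks like in practice, here's a minimal sketch of tallying upload volume per destination. The records, hostnames, and threshold are all made up for illustration. In real life you'd get this kind of data from a packet capture or a firewall log, and this is not how any particular tool does it.

```python
from collections import Counter

# Hypothetical (destination host, bytes sent) records, e.g. exported from
# a packet capture. The hosts and sizes are invented for this example.
records = [
    ("assistant.example.com", 4_000),
    ("ads.example.net", 900),
    ("uploads.example.org", 6_500_000),
    ("assistant.example.com", 3_500),
    ("uploads.example.org", 7_200_000),
]


def bytes_per_destination(records):
    """Sum the bytes sent to each destination host."""
    totals = Counter()
    for host, size in records:
        totals[host] += size
    return totals


def heavy_uploaders(totals, threshold_bytes):
    """Hosts whose total upload meets or exceeds the threshold."""
    return sorted(h for h, t in totals.items() if t >= threshold_bytes)


totals = bytes_per_destination(records)
# Assumed threshold: 5 MB is far more than a few short voice commands.
print(heavy_uploaders(totals, 5_000_000))
```

The point is just that a steady audio upload stands out next to the few kilobytes a normal request sends, which is how the sketchy apps got caught.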