Oh yeah for sure, I've run Llama 3.2 on my RTX 4080 and it struggles but it's not obnoxiously slow. I think they are betting more software will ship with integrated LLMs that run locally on users PCs instead of relying on cloud compute.
Data centres want the even beefier cards anyhow, but I think nVidia envisions everyone running local LLMs on their PCs because it will be integrated into software instead of relying on cloud compute. My RTX 4080 can struggle through Llama 3.2.