You can buy them new for somewhat reasonable prices. What people should really look at is used 1080 Tis on eBay. They're going for less than $150 and still run plenty of games perfectly fine. It's the budget PC gaming deal of the century.
Not in my country lol. Getting used cards was already the norm before; for a while you could literally only get used ones for a good price on AliExpress.
And now our government imposed a 100% tax on anything from China, so it's really just not affordable.
Meanwhile, I don't think I've played more than 30 minutes on my PS5 this year and it's June, and I have definitely not played any minutes on the 1080 sitting in my PC...
Oh fuck, scratch that, I may have played about 2 hours of Dune: Spice Wars.
I think it's largely the chip manufacturers, but ARM is still making money on licensing fees; Nvidia's new AI chip (with an integrated 72-core ARM CPU), for example.
ARM is in the perfect place where, if a company using their architecture succeeds, they get tons of money, and if the company fails, they lose nothing.
I mean, if LLM/diffusion-type AI is a dead end and the extra investment happening now doesn't lead anywhere beyond that, then yes, the bubble will likely burst.
But this kind of investment could create something else. We'll see. I'm 50/50 on its potential myself; I think it's more likely that a lot of loud-talking con artists will soak up all the investment and deliver nothing.
Bubbles have nothing to do with the technology; the tech is just a tool to build the hype. The bubble will burst regardless of the success of the tech. At most, success will slightly delay the burst, because what's bursting isn't the tech, it's the financial structures around it.
It's looking like a dead end. The content that can be fed into the big LLMs has already been fed in. New content is a combination of actual humans and stuff generated by LLMs, so it runs into an ouroboros problem where it just eats its own output.
Unfortunately you've completely switched your entire economy to shovels for this thing and also all your warehouses are full of shovels for this thing and most of your assets are tied up in shovels for this thing. And now suddenly, this thing stops.
I doubt it. Regardless of the current stage of machine learning, everyone is now tuned in and pushing the tech. Even if LLMs turn out to be mostly a dead end, everyone investing in ML means that the ability to do LOTS of floating-point math very quickly, without the heaviness of CPU operations, isn't going away any time soon. Which means Nvidia is sitting pretty.
See Sun Microsystems after the dot-com bubble burst. They produced a lot of the servers that dot-com companies were using at the time, shriveled up afterwards, and were eventually absorbed by Oracle.
Why did Oracle survive the same period? Because they latched onto a traditional Fortune 500 market and haven't let go to this day.
As far as I understand, the GPUs that LLMs use aren't exactly interchangeable with your regular GPU. Also, no one needs that many GPUs for any traditional use cases.
The worst one is probably Apple. They just announced "Apple Intelligence", which is just ChatGPT, whose maker OpenAI is largely backed by Microsoft. Figure that one out.
Well, most of the requests are handled on-device with their own models. If it's going to ChatGPT for something, it will ask for permission and only then use ChatGPT.
So Apple Intelligence isn't all ChatGPT. I think this deserves to be mentioned, as a lot of the processing will be on-device.
Also, I believe part of the deal is that ChatGPT can save nothing and Apple are anonymising the requests too.
Not true. Most if not all requests are handled by Apple's own models, on device or on their own servers. When it does use OpenAI, you need to give it permission each time.
That's just not true. Most requests are handled on-device. If the system decides a request should go to ChatGPT, the user is prompted to agree, and no data is stored on OpenAI's servers. Plus, all of this is opt-in.
I don't think it's a problem, more like a situation. You are not doing anything wrong or stupid, just interested in something new and promising and have the resources to pursue it. Good for you, may you find gold.
Getting ROCm to work properly is like herding cats.
You need a custom implementation for the specific operating system, the driver version must be locked and compatible (especially with a Workstation / WRX card, where the Pro drivers are especially prone to breaking), you need the specific dependencies compiled for your variant of hipBLAS, or ZLUDA if that doesn't work, then you need ONNX transition graphs, only to find out PyTorch doesn't support ONNX unless it's 1.2.0, which breaks another dependency of X-Transformers, which then breaks because that version of hipBLAS is incompatible with the older version of Python, and...
Inhales
And THEN, MAYBE, it'll work at 85% of the speed of CUDA, if it doesn't crash first with an arbitrary error such as CUDA_UNIMPLEMENTED_FUNCTION_HALF.
You get the picture. On Nvidia it's: click, open, CUDA working? Yes? Done. You don't spend 120 hours fucking around and recompiling for your specific use case.
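For reference, the basic sanity check looks the same from Python on either stack, since PyTorch's ROCm builds expose the GPU through the regular torch.cuda API. A minimal sketch, assuming nothing beyond a working install:

```python
# Minimal PyTorch GPU sanity check; works on both CUDA and ROCm builds,
# because the ROCm build reuses the torch.cuda API.
import torch

print("PyTorch:", torch.__version__)
print("GPU visible:", torch.cuda.is_available())
print("HIP/ROCm build:", torch.version.hip is not None)

if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Tiny matmul to confirm the kernels actually run, not just enumerate.
    x = torch.randn(1024, 1024, device="cuda")
    print("Matmul OK:", float((x @ x).sum()))
```

The difference between the two vendors is everything that has to happen before this script prints True.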
Also, you need a supported card. I have a potato going by the name of RX 5500, which is not on the supported list. That leaves me the choice between three ROCm versions:
1. An age-old prebuilt. It generally works but occasionally crashes the graphics driver, unrecoverably so: Linux tries to re-initialise everything, that fails, and it needs a proper reset. I do need to tell it to pretend I have a different card (see the sketch after this list).
2. A custom build, which I fished out of a Docker image I found on the net because I can't be arsed to build that behemoth myself. It's dog-slow, because it uses all generic code and no specialised kernels.
3. A newer prebuilt, any of them. Works fine for some, or should I say very few, workloads (mostly just BLAS stuff); otherwise it simply hangs, presumably because they updated the kernels and they now use instructions my card doesn't have.
#1 is what I'm actually using. I can deal with a random crash every other day to every other week or so.
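The "pretend I have a different card" bit is the usual HSA_OVERRIDE_GFX_VERSION trick: spoof a supported gfx ISA so ROCm loads kernels built for another card. Rough sketch below; the version string is only a placeholder, you pick whatever supported ISA is closest to your card's, and whether it holds up is luck:

```python
# Hypothetical example of spoofing the gfx version so ROCm treats an
# unsupported card as a supported one. The value "10.3.0" is a placeholder,
# not a recommendation for the RX 5500 specifically.
import os
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"  # must be set before the ROCm runtime loads

import torch  # imported after the override so it takes effect

if torch.cuda.is_available():
    print("ROCm sees:", torch.cuda.get_device_name(0))
else:
    print("No GPU visible; the override didn't help.")
```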
It really would not take much work for them to offer a fourth version: one that's not "supported-supported" but "we're making sure this thing runs": current ROCm code, using kernels written for other cards if they happen to work, and generic code otherwise.
Seriously, ROCm is making me consider Intel cards. Price/performance is decent, there's plenty of VRAM (at least for the class), and apparently their API support is actually great. I don't need CUDA or ROCm after all; what I need is PyTorch.
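To be fair, if PyTorch is the actual requirement, the code can be written to not care which vendor it lands on. A rough sketch, assuming a recent PyTorch where Intel GPUs show up as the xpu backend (older setups needed intel_extension_for_pytorch instead); pick_device is just an illustrative name:

```python
# Backend-agnostic device selection: CUDA/ROCm first, then Intel XPU, then CPU.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():  # covers both CUDA and ROCm builds
        return torch.device("cuda")
    if hasattr(torch, "xpu") and torch.xpu.is_available():  # Intel GPUs
        return torch.device("xpu")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(128, 128).to(device)
x = torch.randn(32, 128, device=device)
print(device, model(x).shape)
```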
I think it's in the pipeline. AMD has bought Xilinx, which builds FPGAs and already had some AI-specific cores in their processors. I believe they're developing that further and integrating it into their GPUs now.