How do I get LLaMA going on a GPU?
Everyone is so thrilled with llama.cpp, but I want to do GPU-accelerated text generation and interactive writing. What's the state of the art here? Will KoboldAI now download LLaMA for me?
There's a bit more setup involved, but I would look into https://github.com/oobabooga/text-generation-webui
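For reference, the basic setup is roughly the following — a rough sketch, not the project's official install script (the repo's README has the authoritative steps, and the exact `server.py` flags and model folder name here are assumptions that may differ between versions):

```shell
# Clone the web UI and install its Python dependencies
# (a CUDA-enabled PyTorch build is assumed to be installed already).
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt

# Place the model weights in the models/ directory, e.g.
# models/llama-7b/ (folder name is illustrative — use whatever
# directory your converted weights live in).

# Launch the server, pointing it at that model folder.
python server.py --model llama-7b
```

Once it's running, it serves a local web page for interactive generation; GPU use depends on having a working CUDA (or ROCm) PyTorch install, which is usually the fiddly part.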