
Install LlamaCPP for AMD HIP/ROCm on Linux

LlamaCPP logo generated with AI. Learn how to install LlamaCPP for your AMD GPU powered by HIP/ROCm software.

ChatGPT is cool: you can chat with it and automate some of your boring work. However, it is far from safe in terms of privacy. All the data you feed to the models can be used by the companies behind them in many ways. The good news is that you can run an LLM (large language model) directly on your computer, especially if you have a strong GPU. There are many ways to run LLMs locally, but the most popular is surely LlamaCPP. In this guide we will see how to install LlamaCPP for your AMD GPU and accelerate models with the power of the HIP/ROCm software stack.


Prerequisite: Installing ROCm on your computer

To actually use LlamaCPP you need ROCm installed on your computer. There are numerous ways to do it; I suggest checking this post, where I explain how to install ROCm. It’s not difficult, give it a try! Note also that this guide is expressly aimed at Linux, not Windows.
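
Once ROCm is installed you can do a quick sanity check. Both rocminfo and rocm-smi ship with ROCm; if either of these commands fails, fix the ROCm installation before continuing:

rocminfo | grep -i "marketing name"
rocm-smi

The first command should list your GPU by name, and the second shows things like temperature and VRAM usage.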


Installing LlamaCPP for AMD HIP/ROCm

Let’s start right away with the installation. If you followed my previous tutorial on installing ROCm, open a terminal and enter the container with the following command:

distrobox enter almalinux-rocm

And activate the GCC 10 toolset:

scl enable gcc-toolset-10 bash

If you have installed ROCm in another way, just keep going with the guide. Let’s clone the repository. The following command will create the directory that stores the program:

git clone https://github.com/ggml-org/llama.cpp.git \
&& cd llama.cpp

After that you have to compile it. The command differs depending on the GPU you use.
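
If you are not sure which gfx target your card corresponds to, you can usually read it straight from rocminfo (if an integrated GPU is also present you may see more than one entry; the discrete card is typically listed first):

rocminfo | grep -o -m 1 'gfx[0-9a-f]*'

Then pick the command below whose -DAMDGPU_TARGETS value matches.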

If you have an RDNA 2 GPU (RX 6000 series):

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DLLAMA_CURL=OFF -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
  && cmake --build build --config Release -- -j$(nproc)

If you have an RDNA 3 GPU (RX 7000 series):

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DLLAMA_CURL=OFF -DGGML_HIP_ROCWMMA_FATTN=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release \
  && cmake --build build --config Release -- -j$(nproc)

If you have an RX 9070:

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DLLAMA_CURL=OFF -DGGML_HIP_ROCWMMA_FATTN=ON -DAMDGPU_TARGETS=gfx1201 -DGGML_HIP_FORCE_ROCWMMA_FATTN_GFX12=ON -DCMAKE_BUILD_TYPE=Release \
  && cmake --build build --config Release -- -j$(nproc)

If you have an RX 9060:

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DLLAMA_CURL=OFF -DGGML_HIP_ROCWMMA_FATTN=ON -DGGML_HIP_FORCE_ROCWMMA_FATTN_GFX12=ON -DAMDGPU_TARGETS=gfx1200 -DCMAKE_BUILD_TYPE=Release \
  && cmake --build build --config Release -- -j$(nproc)
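
Once compilation finishes, the binaries end up in build/bin. A quick way to verify that the HIP build actually uses the GPU is to run llama-cli on any GGUF model you already have (the model path below is just a placeholder):

./build/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"

The -ngl 99 flag offloads all layers to the GPU; in the startup log you should see your card listed as a ROCm/HIP device.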

To update LlamaCPP, run the following commands in a terminal inside the llama.cpp directory. Remember to enter the container first if you are using one.

git pull \
&& rm -r build

And after that you have to repeat the compilation procedure for your GPU.
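
If you update often, it may be handy to wrap the whole pull-and-rebuild sequence in a small script. This is just a convenience sketch: it assumes llama.cpp lives in ~/llama.cpp and uses the RDNA 3 flags from above, so swap in the cmake line that matches your GPU:

#!/usr/bin/env bash
set -e
cd ~/llama.cpp   # adjust if you cloned the repository elsewhere
git pull
rm -rf build
# RDNA 3 example; replace AMDGPU_TARGETS and the extra flags for your card
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DLLAMA_CURL=OFF -DGGML_HIP_ROCWMMA_FATTN=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release \
  && cmake --build build --config Release -- -j"$(nproc)"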

And with that we have finished the installation of LlamaCPP for your AMD GPU!


What to do next

Using the program is beyond the scope of this tutorial; if you need more information on how to use LlamaCPP, visit the official site. I also suggest trying some LLM front-ends, such as Open WebUI, GPT4All and GPT4Free. Be aware, however, that you must install all the Python packages needed by the front-ends in a virtual environment where PyTorch is installed with HIP/ROCm support. You can find out how to correctly install PyTorch for your AMD GPU in this tutorial; don’t worry, it’s super easy.
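
Most of these front-ends can also talk to llama.cpp directly through its built-in server, which exposes an OpenAI-compatible API. A minimal example (again, the model path is a placeholder):

./build/bin/llama-server -m /path/to/model.gguf -ngl 99 --host 127.0.0.1 --port 8080

You can then point a front-end like Open WebUI at http://127.0.0.1:8080/v1 as an OpenAI-compatible endpoint.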


But I want to use KoboldCPP

LlamaCPP is beautiful, but KoboldCPP is better for some use cases. KoboldCPP is based on LlamaCPP, so it too can easily be installed for an AMD GPU. For more information, check out this tutorial:


