From 80830a41a7b38daf2d34db238dd6b53d2bb47b77 Mon Sep 17 00:00:00 2001
From: Javier Martinez
Date: Wed, 7 Aug 2024 11:19:52 +0200
Subject: [PATCH] docs: add numpy issue to troubleshooting

---
 fern/docs/pages/installation/installation.mdx    | 10 ++++++----
 .../docs/pages/installation/troubleshooting.mdx  | 17 ++++++++++++++++-
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fern/docs/pages/installation/installation.mdx b/fern/docs/pages/installation/installation.mdx
index f7457b34b5..ce8decab02 100644
--- a/fern/docs/pages/installation/installation.mdx
+++ b/fern/docs/pages/installation/installation.mdx
@@ -307,11 +307,12 @@ If you have all required dependencies properly configured running the following
 powershell command should succeed.
 
 ```powershell
-$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+$env:CMAKE_ARGS='-DLLAMA_CUBLAS=on'; poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If you encounter any issue, please refer to the
+[troubleshooting](#/installation/getting-started/troubleshooting#guide-for-building-llama-cpp-with-cuda-support) section.
 
 ```console
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
@@ -339,11 +340,12 @@ Some tips:
 After that running the following command in the repository will install llama.cpp with GPU support:
 
 ```bash
-CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
+CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python numpy==1.26.0
 ```
 
 If your installation was correct, you should see a message similar to the following next
-time you start the server `BLAS = 1`.
+time you start the server `BLAS = 1`. If you encounter any issue, please refer to the
+[troubleshooting](#/installation/getting-started/troubleshooting#guide-for-building-llama-cpp-with-cuda-support) section.
 
 ```
 llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
diff --git a/fern/docs/pages/installation/troubleshooting.mdx b/fern/docs/pages/installation/troubleshooting.mdx
index dc99d6cb5b..0b72526d2a 100644
--- a/fern/docs/pages/installation/troubleshooting.mdx
+++ b/fern/docs/pages/installation/troubleshooting.mdx
@@ -46,4 +46,19 @@ huggingface:
   embedding:
     embed_dim: 384
 ```
-
\ No newline at end of file
+
+
+# Building Llama-cpp with NVIDIA GPU support
+
+## Out-of-memory error
+
+If you encounter an out-of-memory error while running `llama-cpp` with CUDA, you can try the following steps to resolve the issue:
+1. **Set the following environment variable:**
+   ```bash
+   TOKENIZERS_PARALLELISM=true
+   ```
+2. **Run PrivateGPT:**
+   ```bash
+   poetry run python -m private_gpt
+   ```
+Thanks to [MarioRossiGithub](https://github.com/MarioRossiGithub) for providing this solution.
\ No newline at end of file
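Note: as a sanity check after applying the updated install command, the snippet below verifies that the numpy pin took effect and that the rebuilt `llama-cpp-python` loads with GPU offload enabled. It is an illustrative sketch, not part of the patch; `llama_supports_gpu_offload` is assumed to be exported by the installed `llama_cpp` version.

```bash
# Confirm the numpy pin from the updated install command took effect (expected: 1.26.0)
poetry run python -c "import numpy; print(numpy.__version__)"

# Confirm the rebuilt llama-cpp-python imports and reports GPU offload support
# (llama_supports_gpu_offload is assumed to exist in the installed llama_cpp build)
poetry run python -c "from llama_cpp import llama_supports_gpu_offload; print(llama_supports_gpu_offload())"
```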
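The out-of-memory workaround added to troubleshooting.mdx can also be exercised as a one-liner. This is a sketch assuming the standard `private_gpt` module entry point, with the environment variable scoped to a single run:

```bash
# Scope TOKENIZERS_PARALLELISM to this invocation only, then start PrivateGPT
TOKENIZERS_PARALLELISM=true poetry run python -m private_gpt
```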