Merge pull request #196 from asierarranz/main

Added ComfyUI + Flux for Jetson AGX Orin
NVIDIA-AI-IOT · Aug 19, 2024 · 4508ba3
2 parents 4baced8 + 1828855

Showing 9 changed files with 190 additions and 32 deletions.
diff --git a/docs/images/flux0.png b/docs/images/flux0.png
diff --git a/docs/images/flux1.png b/docs/images/flux1.png
diff --git a/docs/images/flux2.png b/docs/images/flux2.png
diff --git a/docs/images/flux3.png b/docs/images/flux3.png
diff --git a/docs/images/flux4.png b/docs/images/flux4.png
diff --git a/docs/images/flux5.png b/docs/images/flux5.png
diff --git a/docs/tutorial-intro.md b/docs/tutorial-intro.md
@@ -36,6 +36,24 @@ Give your locally running LLM an access to vision!
 | **[SAM](./vit/tutorial_sam.md)** | Meta's [SAM](https://github.com/facebookresearch/segment-anything), Segment Anything model |
 | **[TAM](./vit/tutorial_tam.md)** | [TAM](https://github.com/gaomingqi/Track-Anything), Track-Anything model, is an interactive tool for video object tracking and segmentation |
 
+### Image Generation
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[Flux + ComfyUI](./tutorial_comfyui_flux.md)** | Set up and run ComfyUI with the Flux model for image generation on Jetson Orin. |
+| **[Stable Diffusion](./tutorial_stable-diffusion.md)** | Run AUTOMATIC1111's [`stable-diffusion-webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui) to generate images from prompts |
+| **[Stable Diffusion XL](./tutorial_stable-diffusion-xl.md)** | A newer ensemble pipeline consisting of a base model and refiner that results in significantly enhanced and detailed image generation capabilities. |
+
+
+### Audio
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[Whisper](./tutorial_whisper.md)** | OpenAI's [Whisper](https://github.com/openai/whisper), pre-trained model for automatic speech recognition (ASR) |
+| **[AudioCraft](./tutorial_audiocraft.md)** | Meta's [AudioCraft](https://github.com/facebookresearch/audiocraft), to produce high-quality audio and music |
+| **[Voicecraft](./tutorial_voicecraft.md)** | Interactive speech editing and zero shot TTS |
+
+
 ### RAG & Vector Database
 
 |      |                     |
@@ -55,21 +73,6 @@ Give your locally running LLM an access to vision!
 | **[Gapi Micro Services](./tutorial_gapi_microservices.md)** | Wrapping models and code to participate in systems |
 | **[Ultralytics YOLOv8](./tutorial_ultralytics.md)** | Run [Ultralytics](https://www.ultralytics.com) YOLOv8 on Jetson with NVIDIA TensorRT. |
 
-### Audio
-
-|      |                     |
-| :---------- | :----------------------------------- |
-| **[Whisper](./tutorial_whisper.md)** | OpenAI's [Whisper](https://github.com/openai/whisper), pre-trained model for automatic speech recognition (ASR) |
-| **[AudioCraft](./tutorial_audiocraft.md)** | Meta's [AudioCraft](https://github.com/facebookresearch/audiocraft), to produce high-quality audio and music |
-| **[Voicecraft](./tutorial_voicecraft.md)** | Interactive speech editing and zero shot TTS |
-
-### Image Generation
-
-|      |                     |
-| :---------- | :----------------------------------- |
-| **[Stable Diffusion](./tutorial_stable-diffusion.md)** | Run AUTOMATIC1111's [`stable-diffusion-webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui) to generate images from prompts |
-| **[Stable Diffusion XL](./tutorial_stable-diffusion-xl.md)** | A newer ensemble pipeline consisting of a base model and refiner that results in significantly enhanced and detailed image generation capabilities.|
-
 ## About NVIDIA Jetson
 
 !!! note

diff --git a/docs/tutorial_comfyui_flux.md b/docs/tutorial_comfyui_flux.md
@@ -0,0 +1,154 @@
+# ComfyUI and Flux on Jetson Orin
+
+Hey there, fellow developer! 👋 I'm excited to show you how to run **Flux**, an open-source image generation model from Black Forest Labs, on Jetson. Here at NVIDIA, we're working to make Flux run well across our platforms, including Jetson Orin devices. We're still tuning things for the Jetson Orin Nano, but Flux already runs smoothly on the Jetson AGX Orin.
+
+In this tutorial, I'll walk you through every step needed to get Flux up and running on your Jetson Orin, even if you've just flashed your system. Follow along, and you should have no trouble getting everything set up. And hey, if something doesn't work out, reach out to me. I'll keep this guide updated so it stays on point.
+
+![Alt text](./images/flux5.png)
+
+So, let's dive in and get Flux running on your Jetson!
+
+## 1. Install Miniconda and Create a Python 3.10 Environment
+
+First things first, you'll need to install Miniconda on your Jetson Orin and create a Python 3.10 environment called `comfyui`. This will ensure all dependencies are handled properly within an isolated environment.
+
+```sh
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh
+chmod +x Miniconda3-latest-Linux-aarch64.sh
+./Miniconda3-latest-Linux-aarch64.sh
+
+conda update conda
+
+conda create -n comfyui python=3.10
+conda activate comfyui
+```
+
+## 2. Install CUDA, cuDNN, and TensorRT
+
+Once your environment is set up, install CUDA 12.4 along with the necessary cuDNN and TensorRT libraries to ensure compatibility and optimal performance on your Jetson Orin.
+
+```sh
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb
+sudo dpkg -i cuda-keyring_1.1-1_all.deb
+sudo apt-get update
+sudo apt-get -y install cuda-toolkit-12-4 cuda-compat-12-4
+sudo apt-get install cudnn python3-libnvinfer python3-libnvinfer-dev tensorrt
+```
+
+## 3. Verify and Configure CUDA
+
+After installing CUDA, you'll want to verify that the correct version (12.4) is being used and make this change permanent in your environment.
+
+```sh
+ls -l /usr/local | grep cuda
+sudo ln -sfn /usr/local/cuda-12.4 /usr/local/cuda
+
+export PATH=/usr/local/cuda/bin:$PATH
+nvcc --version
+
+echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
+echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
+echo 'export CUDA_PATH=/usr/local/cuda' >> ~/.bashrc
+source ~/.bashrc
+```
+
+## 4. Compile and Install `bitsandbytes` with CUDA Support
+
+Now it’s time to compile and install `bitsandbytes` with CUDA support. This involves cloning the repository, configuring the build with CMake, compiling using all available cores, and installing the resulting package.
+
+```sh
+export BNB_CUDA_VERSION=124
+export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH
+
+git clone https://github.com/timdettmers/bitsandbytes.git
+cd bitsandbytes
+
+mkdir -p build
+cd build
+cmake .. -DCOMPUTE_BACKEND=cuda -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.4
+make -j$(nproc)
+
+cd ..
+pip install .
+```
+
+Verify the installation by importing the package in Python:
+
+```sh
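+# Import the package and print its version; an ImportError or a
+# CUDA setup error here means the build did not install correctly
+python -c "import bitsandbytes as bnb; print(bnb.__version__)"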
+```
+
+## 5. Install PyTorch, TorchVision, and TorchAudio
+
+Next up, install the essential libraries `PyTorch`, `torchvision`, and `torchaudio` for Jetson Orin. You can always check for the latest links [here](http://jetson.webredirect.org/jp6/cu124).
+
+```sh
+pip install http://jetson.webredirect.org/jp6/cu124/+f/5fe/ee5f5d1a75229/torch-2.3.0-cp310-cp310-linux_aarch64.whl
+pip install http://jetson.webredirect.org/jp6/cu124/+f/988/cb71323efff87/torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl
+pip install http://jetson.webredirect.org/jp6/cu124/+f/0aa/a066463c02b4a/torchaudio-2.3.0+952ea74-cp310-cp310-linux_aarch64.whl
+```
+
+## 6. Clone the ComfyUI Repository
+
+Clone the ComfyUI repository from GitHub to get the necessary source code.
+
+```sh
+git clone https://github.com/comfyanonymous/ComfyUI.git
+cd ComfyUI
+```
+
+## 7. Update Dependencies
+
+Make sure all the necessary dependencies are installed by installing the packages listed in `requirements.txt`.
+
+```sh
+pip install -r requirements.txt
+```
+
+## 8. Resolve Issues with NumPy
+
+If you encounter issues with NumPy, downgrade to a version below 2.0 to avoid compatibility problems.
+
+```sh
+pip install "numpy<2"
+```
+
+## 9. Run ComfyUI
+
+Finally, run ComfyUI to ensure everything is set up correctly.
+
+```sh
+python main.py
+```
+
+
+
+Great! Now that you’ve got ComfyUI up and running, let's load the workflow to start using the Flux model. 
+
+* Download the workflow file from [this link](./assets/workflow_agx_orin_4steps.json) and load it from the ComfyUI interface.
+* Download the Flux Schnell model `flux1-schnell.safetensors` and the VAE `ae.safetensors` from [Hugging Face](https://huggingface.co/black-forest-labs/FLUX.1-schnell/tree/main), then place the model in the `models/unet` folder and the VAE in the `models/vae` folder within ComfyUI.
+* Download `clip_l.safetensors` and `t5xxl_fp8_e4m3fn.safetensors` from [Stability's Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-3-medium/tree/main/text_encoders) and place them inside the `models/clip` folder.
+
+
+Alright, you're all set to launch your first run! Head over to the URL provided by ComfyUI ([127.0.0.1:8188](http://127.0.0.1:8188)) on your Jetson AGX Orin, and hit that **Queue Prompt** button. The first time might take a little longer as the model loads, but after that, each generation should take around 21 seconds. Plus, you can queue up multiple prompts and let it generate images for hours!!
+
+Happy generating! 🎉
+
+**ASIER** 🚀
+
+*Some examples:* 
+![Alt text](./images/flux2.png)
+
+
+![Alt text](./images/flux1.png)
+![Alt text](./images/flux0.png)
+![Alt text](./images/flux3.png)
+![Alt text](./images/flux4.png)
+
+
+
+
+
+
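
The tutorial added above asks the reader to place four downloaded files into specific ComfyUI folders before loading the workflow. A minimal pre-flight sketch for that step might look like the following. The function name `check_flux_models` is hypothetical and not part of ComfyUI or this PR; the relative paths are the ones named in the tutorial, and the script assumes a POSIX shell.

```sh
#!/bin/sh
# Hypothetical helper (not part of ComfyUI): report which of the Flux
# files named in the tutorial are missing from a ComfyUI checkout.
# Usage: check_flux_models /path/to/ComfyUI
check_flux_models() {
    comfy_dir="${1:-.}"   # default to the current directory
    missing=0
    for rel in \
        models/unet/flux1-schnell.safetensors \
        models/vae/ae.safetensors \
        models/clip/clip_l.safetensors \
        models/clip/t5xxl_fp8_e4m3fn.safetensors
    do
        if [ -f "$comfy_dir/$rel" ]; then
            echo "found:   $rel"
        else
            echo "missing: $rel"
            missing=$((missing + 1))
        fi
    done
    # Return success only when every expected file is present
    [ "$missing" -eq 0 ]
}
```

After sourcing the function, calling `check_flux_models .` from inside the `ComfyUI` directory before `python main.py` would list any file that still needs to be downloaded. It checks only that the files exist, not that the downloads are complete or uncorrupted.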
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -76,47 +76,48 @@ extra_css:
   - css/nvidia-font.css
 
 nav:
-  - Home: index.md
-  - Tutorials:
-    - Introduction: tutorial-intro.md
-    - Hello AI World: hello_ai_world.md
-    - Agent Studio: agent_studio.md
-    - Text (LLM):
+  - 🏠 Home: index.md
+  - 📚 Tutorials:
+    - ✨ Introduction: tutorial-intro.md
+    - 👋 Hello AI World: hello_ai_world.md
+    - 🤖 Agent Studio: agent_studio.md
+    - 📝 Text (LLM):
       - text-generation-webui: tutorial_text-generation.md 
       - ollama: tutorial_ollama.md
       - llamaspeak: tutorial_llamaspeak.md
       - NanoLLM: tutorial_nano-llm.md
       - Small LLM (SLM): tutorial_slm.md
       - API Examples: tutorial_api-examples.md
-    - Text + Vision (VLM):
+    - 👁️‍🗨️ Text + Vision (VLM):
       - LLaVA: tutorial_llava.md
       - Live LLaVA: tutorial_live-llava.md
       - NanoVLM: tutorial_nano-vlm.md
-    - Vision Transformers (ViT): 
+    - 🔍 Vision Transformers (ViT): 
       - vit/index.md
       - EfficientViT: vit/tutorial_efficientvit.md
       - NanoOWL: vit/tutorial_nanoowl.md
       - NanoSAM: vit/tutorial_nanosam.md
       - SAM: vit/tutorial_sam.md
       - TAM: vit/tutorial_tam.md
-    - RAG & Vector Database:
+    - 🎨 Image Generation:
+      - Flux & ComfyUI: tutorial_comfyui_flux.md
+      - Stable Diffusion: tutorial_stable-diffusion.md
+      - Stable Diffusion XL: tutorial_stable-diffusion-xl.md
+    - 🗄️ RAG & Vector Database:
       - NanoDB: tutorial_nanodb.md
       - LlamaIndex: tutorial_llamaindex.md
       - Jetson Copilot: tutorial_jetson-copilot.md
-    - API Integrations 🆕:
+    - 🧩 API Integrations 🆕:
       - ROS2 Nodes: ros.md
       - Holoscan SDK: tutorial_holoscan.md
       - Jetson Platform Services: tutorial_jps.md
       - Gapi Workflows: tutorial_gapi_workflows.md
       - Gapi Micro Services: tutorial_gapi_microservices.md
       - Ultralytics YOLOv8: tutorial_ultralytics.md
-    - Audio:
+    - 🎵 Audio:
       - Whisper: tutorial_whisper.md
       - AudioCraft: tutorial_audiocraft.md
       - VoiceCraft: tutorial_voicecraft.md
-    - Image Generation:
-      - Stable Diffusion: tutorial_stable-diffusion.md
-      - Stable Diffusion XL: tutorial_stable-diffusion-xl.md
     #- Metropolis Microservices:
     #  - First Steps: tutorial_mmj.md
     # - Tools:
@@ -126,9 +127,9 @@ nav:
       - 🔖 SSD + Docker: tips_ssd-docker.md
       - 🔖 Memory optimization: tips_ram-optimization.md
       - 🚅 Initial Setup Guide - Jetson Orin Nano: initial_setup_jon.md
-  - Benchmarks: benchmarks.md
-  - Projects: community_articles.md
-  - Research Group: research.md
+  - 🏁 Benchmarks: benchmarks.md
+  - 🛠️ Projects: community_articles.md
+  - 🧠 Research Group: research.md
   #- Try: try.md
 
 extra: