[Question] GenAI ROCm #3411
Comments
Hi @robertgshaw2-neuralmagic. We are currently working on enabling ROCm for the GenAI variant of the build. To my knowledge, the GenAI section of the codebase depends on code that exists in the … I will work with the team to see if we can work around this limitation; otherwise, we are blocked from making the code build on OSS until some of those changes in the …
I see. Thanks for the clear and quick response! I will check back in a few days.
Hey @q10 - I was wondering if you had any updates.
cc @shajrawi @gshtras @sunway513 for vLLM support on ROCm.
For FP8 GEMM compute performance + Llama, my current recommendation is ROCm 6.3, because the math library (hipBLASLt) has a lot of improvements for GEMM performance, plus PyTorch nightly for scaled_mm and tunable-ops improvements. For the ROCm 6.2 question, I would recommend this Dockerfile, which builds a newer hipBLASLt library + PyTorch: https://github.com/ROCm/vllm/blob/main/Dockerfile.rocm
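For anyone following along, below is a minimal sketch of the PyTorch-side FP8 path mentioned above (scaled_mm). Everything in it is an illustrative assumption rather than something stated in the thread: it presumes an MI300-class GPU, a recent PyTorch nightly built against ROCm, and the fnuz FP8 dtype used on ROCm; on ROCm builds, torch._scaled_mm dispatches to hipBLASLt.

```python
# Minimal sketch: per-tensor-scaled FP8 GEMM via torch._scaled_mm (a private PyTorch API).
# Assumptions: MI300-class GPU, recent PyTorch nightly on ROCm 6.3.
import torch

fp8 = torch.float8_e4m3fnuz  # ROCm/MI300 flavor; use torch.float8_e4m3fn on CUDA
M, K, N = 16, 64, 32         # dims kept multiples of 16, as FP8 GEMM backends require

a = torch.randn(M, K, device="cuda", dtype=torch.bfloat16).to(fp8)       # row-major (M, K)
b = torch.randn(N, K, device="cuda", dtype=torch.bfloat16).to(fp8).t()   # column-major view (K, N)

scale_a = torch.tensor(1.0, device="cuda", dtype=torch.float32)
scale_b = torch.tensor(1.0, device="cuda", dtype=torch.float32)

# Recent nightlies return a single tensor; older releases returned (out, amax).
out = torch._scaled_mm(a, b, scale_a=scale_a, scale_b=scale_b, out_dtype=torch.bfloat16)
print(out.shape, out.dtype)  # torch.Size([16, 32]) torch.bfloat16
```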
@robertgshaw2-neuralmagic
Great - can you point me to an example of this API?
hipBLASLt/clients/samples/15_gemm_scale_a_b_ext/sample_hipblaslt_gemm_with_scale_a_b_ext.cpp
Does this get exposed via PyTorch already, or should I use it directly?
You have to use it directly.
@q10 I managed to get down to compiling the last 3 HIPCC objects.
The error logs are as follows:
Hello!
I work on the vllm-project. I worked with FBGEMM in the past for the Llama-405B launch in vLLM. As part of our 2025 roadmap planning, we are evaluating options for FP8 compute on ROCm. I noticed that several PRs in the v1.0.0 release (https://github.com/pytorch/FBGEMM/releases/tag/v1.0.0) include FP8 GEMM support, and the documentation suggests ROCm is a target for the kernels. However, I also noticed that the ROCm build in the CMakeLists.txt (https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/CMakeLists.txt#L195) skips the GenAI compilation. I also spent a few hours trying to get things to build for ROCm following the build instructions (https://pytorch.org/FBGEMM/fbgemm_gpu-development/BuildInstructions.html) using rocm/rocm-terminal:6.2.0 and rocm/dev-ubuntu-22.04:6.2.2, but was unsuccessful. I have a couple of questions:
Thanks!
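For context, the FP8 GEMM kernels in question are exposed from the fbgemm_gpu GenAI module as torch custom ops. Below is a minimal sketch of what a successful GenAI build would let vLLM call; the op name, module path, and signature are assumed from the v1.0.0 GenAI ops and are not confirmed anywhere in this thread, and the fnuz FP8 dtype is a ROCm/MI300 assumption.

```python
# Sketch of the rowwise FP8 GEMM op the GenAI build is expected to expose.
# Op name/signature assumed from fbgemm_gpu v1.0.0 GenAI ops; not verified on ROCm.
import torch
import fbgemm_gpu.experimental.gen_ai  # noqa: F401  (registers torch.ops.fbgemm.* GenAI ops)

fp8 = torch.float8_e4m3fnuz  # ROCm/MI300 flavor; torch.float8_e4m3fn on CUDA
M, K, N = 16, 128, 64

xq = torch.randn(M, K, device="cuda", dtype=torch.bfloat16).to(fp8)  # activations, row-major
wq = torch.randn(N, K, device="cuda", dtype=torch.bfloat16).to(fp8)  # weights, row-major
x_scale = torch.ones(M, device="cuda", dtype=torch.float32)          # per-row activation scales
w_scale = torch.ones(N, device="cuda", dtype=torch.float32)          # per-row weight scales

out = torch.ops.fbgemm.f8f8bf16_rowwise(xq, wq, x_scale, w_scale)
print(out.shape, out.dtype)  # torch.Size([16, 64]) torch.bfloat16
```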