Commit
Add non-persistent fp8 triton_rowwise kernel (#2484)
Summary:
Pull Request resolved: #2484

X-link: pytorch/FBGEMM#3212

X-link: facebookresearch/FBGEMM#308

triton_rowwise persistent kernel performs poorly on MI300 compared to the non-persistent kernel, when both are run with exhaustive AMD-specific tuning.

Reviewed By: htyu

Differential Revision: D63741099

fbshipit-source-id: c276415ddf8f5d24ffeba70b8ee6493011b393e1