Add multiple ops support for --op argument #2490

Closed
wants to merge 2 commits into from

Conversation

FindHao
Member

@FindHao FindHao commented Oct 3, 2024

Allow users to benchmark multiple ops in a single run. Ops are separated by commas, for example: --op fp8_gemm,addmm

Example output:

% python run_benchmark.py triton --op fp8_gemm,addmm --num-inputs 1
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00,  3.12s/it]
             x_val    torch_fp8_gemm-gbps    torch_fp8_gemm-gbps    torch_fp8_gemm-latency    torch_fp8_gemm-tflops    triton_fp8_gemm-gbps    triton_fp8_gemm-gbps    triton_fp8_gemm-latency    triton_fp8_gemm-tflops
------------------  ---------------------  ---------------------  ------------------------  -----------------------  ----------------------  ----------------------  -------------------------  ------------------------
(1024, 1024, 1024)                462.202                462.202                0.00907462                  236.647                  630.43                  630.43                 0.00665309                    322.78
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.90s/it]
         (M, N, K)    aten_addmm-best_config    aten_addmm-gbps    aten_addmm-tflops                                                                                       triton_addmm-best_config    triton_addmm-gbps    triton_addmm-tflops    pt2_triton_matmul-best_config    pt2_triton_matmul-gbps    pt2_triton_matmul-tflops
------------------  ------------------------  -----------------  -------------------  -------------------------------------------------------------------------------------------------------------  -------------------  ---------------------  -------------------------------  ------------------------  --------------------------
(20120, 512, 1536)                                      818.112              247.544  {'BLOCK_M': 128, 'BLOCK_N': 256, 'BLOCK_K': 64, 'GROUP_M': 8, 'num_warps': 8, 'num_ctas': 1, 'num_stages': 3}              911.569                275.823                                                    889.125                     269.031
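
The comma-separated form of --op described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names parse_ops, build_parser, and run_all are hypothetical, and the real run_benchmark.py may parse and dispatch ops differently. Only the --op and --num-inputs flag names are taken from the example command.

```python
import argparse


def parse_ops(raw):
    """Split a comma-separated --op value into a list of op names.

    Hypothetical helper: surrounding whitespace and empty entries
    (e.g. from a trailing comma) are dropped.
    """
    names = [name.strip() for name in raw.split(",")]
    return [name for name in names if name]


def build_parser():
    """Build a stand-in parser with the two flags used in the example command."""
    parser = argparse.ArgumentParser(description="multi-op benchmark sketch")
    parser.add_argument(
        "--op",
        type=parse_ops,
        required=True,
        help="one op name, or several separated by commas",
    )
    parser.add_argument("--num-inputs", type=int, default=None)
    return parser


def run_all(ops, run_one):
    """Run each op in the order given and collect the results by op name.

    run_one is a caller-supplied callable (an assumption of this sketch)
    that benchmarks a single op.
    """
    return {op: run_one(op) for op in ops}


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Placeholder runner: a real benchmark would be invoked here instead.
    results = run_all(args.op, lambda op: f"ran {op} with num_inputs={args.num_inputs}")
    for op, result in results.items():
        print(op, "->", result)
```

Running each op in sequence and reporting its results separately matches the shape of the example output, where the fp8_gemm table is printed before the addmm table.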

@facebook-github-bot
Contributor

@FindHao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@FindHao merged this pull request in a1f4b2e.
