Add multiple ops support for --op argument (#2490)
Summary:
Allow users to benchmark multiple ops in a single run. Separate the ops with commas, e.g. `--op fp8_gemm,addmm`.

Example output:
```
% python run_benchmark.py triton --op fp8_gemm,addmm --num-inputs 1
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00,  3.12s/it]
             x_val    torch_fp8_gemm-gbps    torch_fp8_gemm-gbps    torch_fp8_gemm-latency    torch_fp8_gemm-tflops    triton_fp8_gemm-gbps    triton_fp8_gemm-gbps    triton_fp8_gemm-latency    triton_fp8_gemm-tflops
------------------  ---------------------  ---------------------  ------------------------  -----------------------  ----------------------  ----------------------  -------------------------  ------------------------
(1024, 1024, 1024)                462.202                462.202                0.00907462                  236.647                  630.43                  630.43                 0.00665309                    322.78
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.90s/it]
         (M, N, K)    aten_addmm-best_config    aten_addmm-gbps    aten_addmm-tflops                                                                                       triton_addmm-best_config    triton_addmm-gbps    triton_addmm-tflops    pt2_triton_matmul-best_config    pt2_triton_matmul-gbps    pt2_triton_matmul-tflops
------------------  ------------------------  -----------------  -------------------  -------------------------------------------------------------------------------------------------------------  -------------------  ---------------------  -------------------------------  ------------------------  --------------------------
(20120, 512, 1536)                                      818.112              247.544  {'BLOCK_M': 128, 'BLOCK_N': 256, 'BLOCK_K': 64, 'GROUP_M': 8, 'num_warps': 8, 'num_ctas': 1, 'num_stages': 3}              911.569                275.823                                                    889.125                     269.031
```

Pull Request resolved: #2490

Reviewed By: xuzhao9

Differential Revision: D63862548

Pulled By: FindHao

fbshipit-source-id: 9d4afa6051d4191bc2e3288f59e2820627647b91
FindHao authored and facebook-github-bot committed Oct 4, 2024
1 parent 12820bc commit a1f4b2e
Showing 1 changed file with 13 additions and 2 deletions.
15 changes: 13 additions & 2 deletions userbenchmark/triton/run.py
@@ -29,7 +29,12 @@
 
 def get_parser(args=None):
     parser = argparse.ArgumentParser(allow_abbrev=False)
-    parser.add_argument("--op", type=str, required=False, help="Operator to benchmark.")
+    parser.add_argument(
+        "--op",
+        type=str,
+        required=False,
+        help="Operators to benchmark. Split with comma if multiple.",
+    )
     parser.add_argument(
         "--mode",
         choices=["fwd", "bwd", "fwd_bwd", "fwd_no_grad"],
@@ -188,5 +193,11 @@ def run(args: List[str] = []):
         run_ci()
         return
 
+    if args.op:
+        ops = args.op.split(",")
+    else:
+        ops = []
     with gpu_lockdown(args.gpu_lockdown):
-        _run(args, extra_args)
+        for op in ops:
+            args.op = op
+            _run(args, extra_args)
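
The change above can be tried in isolation with a minimal sketch. It reproduces only the `--op` argument and the comma split from the diff. `split_ops` and `run_all` are hypothetical names introduced for illustration, and `run_all` records op names instead of calling the real `_run`. The use of `parse_known_args` is an assumption suggested by the `extra_args` variable, not something the diff shows.

```python
import argparse
from typing import List, Optional


def get_parser() -> argparse.ArgumentParser:
    # Mirrors only the --op argument from the diff; the real parser has more options.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument(
        "--op",
        type=str,
        required=False,
        help="Operators to benchmark. Split with comma if multiple.",
    )
    return parser


def split_ops(op_arg: Optional[str]) -> List[str]:
    # Same logic as the diff: a missing or empty --op value gives an empty list.
    return op_arg.split(",") if op_arg else []


def run_all(argv: List[str]) -> List[str]:
    # Hypothetical stand-in for the loop around _run(): it records each op name
    # rather than launching a benchmark.
    # Assumption: unknown flags are collected separately, as extra_args suggests.
    args, extra_args = get_parser().parse_known_args(argv)
    executed = []
    for op in split_ops(args.op):
        args.op = op  # each iteration sees exactly one operator name
        executed.append(args.op)
    return executed


if __name__ == "__main__":
    print(run_all(["--op", "fp8_gemm,addmm", "--num-inputs", "1"]))
```

Two properties of the plain `split(",")` are worth keeping in mind. Whitespace is not stripped, so `--op "fp8_gemm, addmm"` would pass `" addmm"` with a leading space. And when `--op` is omitted, `ops` is empty, so the loop body in the hunk shown does not execute.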
