
export to AOTI using cuda doesn't work using WSL #1293

Open
byjlw opened this issue Oct 10, 2024 · 7 comments
Labels
Compile / AOTI: Issues related to AOT Inductor and torch compile

Comments

@byjlw
Contributor

byjlw commented Oct 10, 2024

🐛 Describe the bug

python3 torchchat.py export llama3.1 --output-dso-path exportedModels/llama3.1.so

Using device=cuda
Setting max_seq_length to 300 for DSO export.
Loading model...
Time to load model: 2.74 seconds
-----------------------------------------------------------
Exporting model using AOT Inductor to /home/warden/source/torchchat/exportedModels/llama3.1.so
W1010 15:36:44.314000 6252 .venv/lib/python3.11/site-packages/torch/_export/__init__.py:225] +============================+
W1010 15:36:44.314000 6252 .venv/lib/python3.11/site-packages/torch/_export/__init__.py:226] |     !!!   WARNING   !!!    |
W1010 15:36:44.314000 6252 .venv/lib/python3.11/site-packages/torch/_export/__init__.py:227] +============================+
W1010 15:36:44.314000 6252 .venv/lib/python3.11/site-packages/torch/_export/__init__.py:228] torch._export.aot_compile() is being deprecated, please switch to directly calling torch._inductor.aoti_compile_and_package(torch.export.export()) instead.
/tmp/tmpie2hawx7/main.c:5:10: fatal error: Python.h: No such file or directory
    5 | #include <Python.h>
      |          ^~~~~~~~~~
compilation terminated.
Traceback (most recent call last):
  File "/home/warden/source/torchchat/torchchat.py", line 97, in <module>
    export_main(args)
  File "/home/warden/source/torchchat/torchchat/export.py", line 422, in main
    export_for_server(
  File "/home/warden/source/torchchat/torchchat/export.py", line 68, in export_for_server
    so = torch._export.aot_compile(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_export/__init__.py", line 303, in aot_compile
    so_path = torch._inductor.aot_compile(gm, args, kwargs, options=options)  # type: ignore[arg-type]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/__init__.py", line 204, in aot_compile
    return compile_fx_aot(
           ^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1150, in compile_fx_aot
    compiled_lib_path = compile_fx(
                        ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1297, in compile_fx
    return compile_fx(
           ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1329, in compile_fx
    return compile_fx(
           ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1551, in compile_fx
    return inference_compiler(unlifted_gm, example_inputs_)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1384, in fw_compiler_base
    return _fw_compiler_base(model, example_inputs, is_inference)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1455, in _fw_compiler_base
    return inner_compile(
           ^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 465, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/repro/after_aot.py", line 85, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 660, in _compile_fx_inner
    compiled_graph = codegen_and_compile(
                     ^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 560, in codegen_and_compile
    compiled_graph = fx_codegen_and_compile(gm, example_inputs, **fx_kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 872, in fx_codegen_and_compile
    compiled_fn = graph.compile_to_fn()
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1933, in compile_to_fn
    code, linemap = self.codegen_with_cpp_wrapper()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1725, in codegen_with_cpp_wrapper
    compiled = self.compile_to_module().call
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1877, in compile_to_module
    return self._compile_to_module()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1883, in _compile_to_module
    self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
                                                             ^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1822, in codegen
    self.scheduler.codegen()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/scheduler.py", line 3423, in codegen
    return self._codegen()
           ^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/scheduler.py", line 3501, in _codegen
    self.get_backend(device).codegen_node(node)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/cuda_combined_scheduling.py", line 80, in codegen_node
    return self._triton_scheduling.codegen_node(node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/simd.py", line 1194, in codegen_node
    return self.codegen_node_schedule(node_schedule, buf_accesses, numel, rnumel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/simd.py", line 1403, in codegen_node_schedule
    src_code = kernel.codegen_kernel()
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/triton.py", line 2760, in codegen_kernel
    **self.inductor_meta_common(),
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/triton.py", line 2626, in inductor_meta_common
    "backend_hash": torch.utils._triton.triton_hash_with_backend(),
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_triton.py", line 65, in triton_hash_with_backend
    backend = triton_backend()
              ^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_triton.py", line 57, in triton_backend
    target = driver.active.get_current_target()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 23, in __getattr__
    self._initialize_obj()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
    self._obj = self._init_fn()
                ^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 9, in _create_driver
    return actives[0]()
           ^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 371, in __init__
    self.utils = CudaUtils()  # TODO: make static
                 ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 80, in __init__
    mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 57, in compile_module_from_src
    so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/build.py", line 48, in _build
    ret = subprocess.check_call(cc_cmd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpie2hawx7/main.c', '-O3', '-shared', '-fPIC', '-o', '/tmp/tmpie2hawx7/cuda_utils.cpython-311-x86_64-linux-gnu.so', '-lcuda', '-L/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/lib', '-L/usr/lib/wsl/lib', '-I/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/include', '-I/tmp/tmpie2hawx7', '-I/usr/include/python3.11']' returned non-zero exit status 1.
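
The immediate failure is gcc not finding Python.h while Triton compiles its cuda_utils helper extension, which usually means the CPython development headers for the venv's base interpreter are missing. A minimal sketch of the usual fix, assuming the venv was created from a distro-packaged Python 3.11 (package names and paths may differ per setup):

# Install the CPython dev headers matching the venv's base interpreter.
# Ubuntu 24.04 ships Python 3.12 by default, so 3.11 typically comes
# from the deadsnakes PPA:
sudo apt install python3.11-dev

# The failing gcc command passed -I/usr/include/python3.11, so the
# header should now be present there:
ls /usr/include/python3.11/Python.h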

Versions

Operating System Information
Linux Furiosa 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

Python Version
Python 3.11.10

PIP Version
pip 24.0 from /home/warden/source/torchchat/.venv/lib/python3.11/site-packages/pip (python 3.11)

Installed Packages
absl-py==2.1.0
accelerate==1.0.0
aiohappyeyeballs==2.4.3
aiohttp==3.10.9
aiosignal==1.3.1
altair==5.4.1
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyio==4.6.0
attrs==24.2.0
blinker==1.8.2
blobfile==3.0.0
cachetools==5.5.0
certifi==2024.8.30
chardet==5.2.0
charset-normalizer==3.4.0
click==8.1.7
cmake==3.30.4
colorama==0.4.6
DataProperty==1.0.1
datasets==3.0.1
dill==0.3.8
distro==1.9.0
evaluate==0.4.3
filelock==3.16.1
Flask==3.0.3
frozenlist==1.4.1
fsspec==2024.6.1
gguf==0.10.0
gitdb==4.0.11
GitPython==3.1.43
h11==0.14.0
httpcore==1.0.6
httpx==0.27.2
huggingface-hub==0.25.2
idna==3.10
itsdangerous==2.2.0
Jinja2==3.1.4
jiter==0.6.1
joblib==1.4.2
jsonlines==4.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
lm_eval==0.4.2
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==3.0.1
mbstrdecoder==1.1.3
mdurl==0.1.2
more-itertools==10.5.0
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.16
narwhals==1.9.2
networkx==3.4
ninja==1.11.1.1
nltk==3.9.1
numexpr==2.10.1
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.6.77
nvidia-nvtx-cu12==12.1.105
omegaconf==2.3.0
openai==1.51.2
packaging==24.1
pandas==2.2.3
pathvalidate==3.2.1
peft==0.13.1
pillow==10.4.0
portalocker==2.10.1
propcache==0.2.0
protobuf==5.28.2
psutil==6.0.0
pyarrow==17.0.0
pybind11==2.13.6
pycryptodomex==3.21.0
pydantic==2.9.2
pydantic_core==2.23.4
pydeck==0.9.1
Pygments==2.18.0
pytablewriter==1.2.0
python-dateutil==2.9.0.post0
pytorch-triton==3.1.0+cf34004b8a
pytz==2024.2
PyYAML==6.0.2
referencing==0.35.1
regex==2024.9.11
requests==2.32.3
rich==13.9.2
rouge-score==0.1.2
rpds-py==0.20.0
sacrebleu==2.4.3
safetensors==0.4.5
scikit-learn==1.5.2
scipy==1.14.1
sentencepiece==0.2.0
six==1.16.0
smmap==5.0.1
snakeviz==2.2.0
sniffio==1.3.1
sqlitedict==2.1.0
streamlit==1.39.0
sympy==1.13.1
tabledata==1.3.3
tabulate==0.9.0
tcolorpy==0.1.6
tenacity==9.0.0
threadpoolctl==3.5.0
tiktoken==0.8.0
tokenizers==0.20.1
toml==0.10.2
torch==2.6.0.dev20241002+cu121
torchao==0.5.0
torchtune==0.3.0.dev20240928+cu121
torchvision==0.20.0.dev20241002+cu121
tornado==6.4.1
tqdm==4.66.5
tqdm-multiprocess==0.0.11
transformers==4.45.2
typepy==1.3.2
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
watchdog==5.0.3
Werkzeug==3.0.4
word2number==1.1
xxhash==3.5.0
yarl==1.14.0
zstandard==0.23.0
zstd==1.5.5.1

PyTorch Version
2.6.0.dev20241002+cu121

@Jack-Khuu added the "Compile / AOTI" label Oct 10, 2024
@Jack-Khuu
Contributor

Jack-Khuu commented Oct 11, 2024

Hmm, I don't seem to be able to repro this. Do you only see it on a WSL machine, or do you see it on a pure Linux machine as well?

i.e., I can export using the same command on a clean install.

@gmagogsfm

@pianpwk

@byjlw
Contributor Author

byjlw commented Oct 11, 2024

I haven't tried pure Linux since finding this issue on WSL.

@byjlw
Contributor Author

byjlw commented Oct 11, 2024

Compile also throws a similar error:


python3 torchchat.py generate llama3.1 --prompt "tell me a story about a bear"
Using device=cuda NVIDIA GeForce RTX 4090
Time to load model: 8.90 seconds
/tmp/tmpcok5wf6p/main.c:5:10: fatal error: Python.h: No such file or directory
    5 | #include <Python.h>
      |          ^~~~~~~~~~
compilation terminated.
Traceback (most recent call last):
  File "/home/warden/source/torchchat/torchchat.py", line 88, in <module>
    generate_main(args)
  File "/home/warden/source/torchchat/torchchat/generate.py", line 1221, in main
    for _ in gen.chat(generator_args):
  File "/home/warden/source/torchchat/torchchat/generate.py", line 1112, in chat
    for token_tensor, metrics in generator_func:
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 57, in generator_context
    response = gen.send(request)
               ^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/torchchat/generate.py", line 697, in generate
    for generated_token, _ in self.decode_n_tokens(
  File "/home/warden/source/torchchat/torchchat/generate.py", line 456, in decode_n_tokens
    next_token, next_prob = self.decode_one_token(
                            ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 465, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 1333, in __call__
    return self._torchdynamo_orig_callable(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 528, in __call__
    return _compile(
           ^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 948, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 679, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_utils_internal.py", line 87, in wrapper_function
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 712, in _compile_inner
    out_code = transform_code_object(code, transform)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1337, in transform_code_object
    transformations(instructions, code_options)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 221, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 641, in transform
    tracer.run()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2766, in run
    super().run()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 973, in run
    while self.step():
          ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 885, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2957, in RETURN_VALUE
    self._return(inst)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2942, in _return
    self.output.compile_subgraph(
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1143, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1370, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1417, in call_user_compiler
    return self._call_user_compiler(gm)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1466, in _call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1447, in _call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 130, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 130, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/__init__.py", line 2241, in __call__
    return compile_fx(model_, inputs_, config_patches=self.config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1297, in compile_fx
    return compile_fx(
           ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1558, in compile_fx
    return aot_autograd(
           ^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/backends/common.py", line 72, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 1077, in aot_module_simplified
    compiled_fn = dispatch_and_compile()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 1062, in dispatch_and_compile
    compiled_fn, _ = create_aot_dispatcher_function(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 523, in create_aot_dispatcher_function
    return _create_aot_dispatcher_function(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 761, in _create_aot_dispatcher_function
    compiled_fn, fw_metadata = compiler_fn(
                               ^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 180, in aot_dispatch_base
    compiled_fw = compiler(fw_module, updated_flat_args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1384, in fw_compiler_base
    return _fw_compiler_base(model, example_inputs, is_inference)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1455, in _fw_compiler_base
    return inner_compile(
           ^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 465, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_dynamo/repro/after_aot.py", line 85, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 650, in _compile_fx_inner
    compiled_graph = FxGraphCache.load(
                     ^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codecache.py", line 1427, in load
    compiled_graph = compile_fx_fn(
                     ^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 560, in codegen_and_compile
    compiled_graph = fx_codegen_and_compile(gm, example_inputs, **fx_kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 872, in fx_codegen_and_compile
    compiled_fn = graph.compile_to_fn()
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1951, in compile_to_fn
    return self.compile_to_module().call
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1877, in compile_to_module
    return self._compile_to_module()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1883, in _compile_to_module
    self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
                                                             ^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/graph.py", line 1822, in codegen
    self.scheduler.codegen()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/scheduler.py", line 3423, in codegen
    return self._codegen()
           ^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/scheduler.py", line 3501, in _codegen
    self.get_backend(device).codegen_node(node)
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/cuda_combined_scheduling.py", line 80, in codegen_node
    return self._triton_scheduling.codegen_node(node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/simd.py", line 1194, in codegen_node
    return self.codegen_node_schedule(node_schedule, buf_accesses, numel, rnumel)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/simd.py", line 1403, in codegen_node_schedule
    src_code = kernel.codegen_kernel()
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/triton.py", line 2760, in codegen_kernel
    **self.inductor_meta_common(),
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/_inductor/codegen/triton.py", line 2626, in inductor_meta_common
    "backend_hash": torch.utils._triton.triton_hash_with_backend(),
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_triton.py", line 65, in triton_hash_with_backend
    backend = triton_backend()
              ^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/torch/utils/_triton.py", line 57, in triton_backend
    target = driver.active.get_current_target()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 23, in __getattr__
    self._initialize_obj()
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
    self._obj = self._init_fn()
                ^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/driver.py", line 9, in _create_driver
    return actives[0]()
           ^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 371, in __init__
    self.utils = CudaUtils()  # TODO: make static
                 ^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 80, in __init__
    mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/driver.py", line 57, in compile_module_from_src
    so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/runtime/build.py", line 48, in _build
    ret = subprocess.check_call(cc_cmd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpcok5wf6p/main.c', '-O3', '-shared', '-fPIC', '-o', '/tmp/tmpcok5wf6p/cuda_utils.cpython-311-x86_64-linux-gnu.so', '-lcuda', '-L/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/lib', '-L/usr/lib/wsl/lib', '-I/home/warden/source/torchchat/.venv/lib/python3.11/site-packages/triton/backends/nvidia/include', '-I/tmp/tmpcok5wf6p', '-I/usr/include/python3.11']' returned non-zero exit status 1.

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

@byjlw changed the title from "export to AOTI using cuda doesn't work" to "export to AOTI using cuda doesn't work using WSL" Oct 11, 2024
@desertfire
Contributor

This looks like a WSL lib path issue. The compilation command is looking for Python.h in -I/usr/include/python3.11, which is probably wrong. cc @xuhancn, who has more experience with the Windows build.

@byjlw
Contributor Author

byjlw commented Oct 15, 2024

> This looks like a WSL lib path issue. The compilation command is looking for Python.h in -I/usr/include/python3.11, which is probably wrong. cc @xuhancn, who has more experience with the Windows build.

Yeah, the path is not right. Any idea why it's looking there?

@desertfire
Contributor

  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)

This suggests you are not using your conda- or pip-installed Python. Can you run which python to double check?
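
For example, a minimal check, assuming the venv is activated (the first two commands should resolve into .venv; the third prints where sysconfig reports the CPython headers, which is roughly where Triton points its -I flag):

which python
python -c "import sys; print(sys.executable)"
python -c "import sysconfig; print(sysconfig.get_paths()['include'])"

If the directory printed by the last command does not contain Python.h, installing the matching pythonX.Y-dev package should fix both the compile and export failures.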
