
Tutorial for AOTI Python runtime #2997

Merged: 26 commits merged on Aug 23, 2024
Conversation

agunapal
Contributor

@agunapal agunapal commented Aug 12, 2024

Description

We already have an AOT Inductor tutorial showing inference with the C++ runtime here.

This tutorial shows how to run AOTI with the Python runtime.

  • Shows support for dynamic_shapes for the batch dimension
  • Shows how to include torch.compile options such as max-autotune mode
  • Reverts the Docker image back to the devel image (needed for AOT compile)

Checklist

  • The issue that is being fixed is referenced in the description (see "Fixes #ISSUE_NUMBER" above)
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request


pytorch-bot bot commented Aug 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2997

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 194388e with merge base 96b9c27 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@agunapal
Contributor Author

Hi @svekars, do I ignore this https://github.com/pytorch/tutorials/actions/runs/10360994812/job/28680461319?pr=2997 or do I need to add some checks in the tutorial?

The failure is because the machine doesn't support Triton.

@svekars
Contributor

svekars commented Aug 13, 2024

@agunapal you need to put it on a different worker, similar to this

Contributor

@svekars svekars left a comment


Just a few editorial nits. Also, it feels a bit short for a full-size intermediate tutorial; we should either add more or move it to recipes. We also need to add entries to either index.rst or recipes_source/recipes_index.rst (depending on whether it's a recipe or a tutorial).

#
# .. note::
#
# This API also supports :func:`torch.compile` options like `mode`
Contributor


Suggested change
# This API also supports :func:`torch.compile` options like `mode`
# This API also supports :func:`torch.compile` options like ``mode`` and others.

@agunapal
Contributor Author

Just a few editorial nits. Also, it feels a bit short for a full-size intermediate tutorial; we should either add more or move it to recipes. We also need to add entries to either index.rst or recipes_source/recipes_index.rst (depending on whether it's a recipe or a tutorial).

Sure, once the content is finalized and looks good, we can move it wherever you think it's appropriate.

# a shared library that can be run in a non-Python environment.
#
#
# In this tutorial, you will learn an end-to-end example of how to use AOTInductor for the Python runtime.
Contributor


It will make the story more complete by explaining the "why" part here, e.g. eliminating recompilation at run time, max-autotune ahead of time, etc.

Contributor Author


Done. I haven't mentioned eliminating recompilation, since the tutorial doesn't show that.

example_inputs = (torch.randn(2, 3, 224, 224, device=device),)

# min=2 is not a bug and is explained in the 0/1 Specialization Problem
batch_dim = torch.export.Dim("batch", min=2, max=32)
Contributor


I believe it is ok to use min=1 here, but we can't feed in an example input with batch size 1.

Contributor Author

@agunapal agunapal Aug 16, 2024


A batch size of 1 is often tried as an example input, hence I set min=2.

agunapal and others added 2 commits August 16, 2024 14:24
Contributor

@svekars svekars left a comment


Please double-check the formatting here. We also need to add it to recipes_index.rst. Otherwise, from the publishing perspective, LGTM.

@agunapal
Contributor Author

Please double-check the formatting here. We also need to add it to recipes_index.rst. Otherwise, from the publishing perspective, LGTM.

@svekars I fixed the indentation of Pre-requisites. It's still not rendering correctly. Any suggestions?


######################################################################
# We see that there is a drastic speedup in first inference time using AOTInductor compared
# to ``torch.compile``
Contributor


Do you have some example numbers to share here? So readers can get some rough idea without actually running the code.

Contributor Author


On the rendered HTML, the tutorial shows 2.92 ms vs. 7000 ms. It might be good to collect this number over a range of models, similar to how we show the perf difference between compile and eager.

@svekars svekars merged commit ea2dfc6 into pytorch:main Aug 23, 2024
18 checks passed
svekars added a commit that referenced this pull request Aug 23, 2024
* Tutorial for AOTI Python runtime
---------

Co-authored-by: Svetlana Karslioglu <[email protected]>
Co-authored-by: Angela Yi <[email protected]>
c-p-i-o pushed a commit that referenced this pull request Sep 6, 2024
6 participants