Add Intel XPU device support to generate and serve #1361

jenniew · 2024-11-09T06:27:45Z

Add XPU device support, excluding distributed mode, workflow, and documentation.

pytorch-bot · 2024-11-09T06:27:48Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1361

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

GLIBC not found in Nova workflows

✅ No Failures

As of commit 4d16351 with merge base 2cf1a17:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-11-09T06:27:50Z

Hi @jenniew!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

mikekgfb · 2024-11-09T18:13:57Z

Is there a way to run at least a few simple tests on an xpu to avoid inadvertent breakage?

jenniew · 2024-11-12T07:26:10Z

> Is there a way to run at least a few simple tests on an xpu to avoid inadvertent breakage?

For generate, run a simple test with: python3 torchchat.py generate llama3.1 --prompt "write me a story about a boy and his bear" --device xpu
For serve, run a simple test with: python3 torchchat.py server llama3.1 --device xpu
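
A rough sketch of how the generate command above might be wrapped into a repeatable smoke test. The helper names (`has_xpu`, `run`, `xpu_smoke_test`) and the `DRY_RUN` variable are hypothetical and not part of this PR, and detecting an XPU through `xpu-smi` is an assumption borrowed from the install script. The server command is left out because it keeps running until stopped.

```shell
# Hypothetical smoke-test helpers; none of these names exist in torchchat.

# Assumption: an XPU host exposes the xpu-smi tool on PATH.
has_xpu() {
  command -v xpu-smi >/dev/null 2>&1
}

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# The generate command quoted in the comment above, targeting the xpu device.
xpu_smoke_test() {
  run python3 torchchat.py generate llama3.1 \
    --prompt "write me a story about a boy and his bear" \
    --device xpu
}

# Dry run: prints the command instead of executing it.
DRY_RUN=1 xpu_smoke_test
```

On an XPU host, `has_xpu && xpu_smoke_test` would run the real command; elsewhere the check fails and nothing is executed.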

Jack-Khuu · 2024-11-12T09:18:30Z

install/install_requirements.sh

-PYTORCH_NIGHTLY_VERSION=dev20241002
+if [[ -x "$(command -v xpu-smi)" ]];
+then
+  PYTORCH_NIGHTLY_VERSION=dev20241001


Why does xpu need an older PYTORCH_NIGHTLY?

jenniew · Nov 13, 2024

When installing torch==2.6.0.dev20241002 together with torchvision==0.20.0.dev20241002+xpu, the install fails with:
ERROR: Cannot install torch==2.6.0.dev20241002 and torchvision==0.20.0.dev20241002+xpu because these package versions have conflicting dependencies.

The conflict is caused by:
The user requested torch==2.6.0.dev20241002
torchvision 0.20.0.dev20241002+xpu depends on torch==2.6.0.dev20241001

So for xpu, I changed the torch nightly version to dev20241001.

Jack-Khuu · Nov 13, 2024

Let me see if I can get you a fresher version on XPU; the torch/vision discrepancy shouldn't be a normal thing.
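
To make the pin selection discussed in this thread concrete, here is a minimal sketch of the branch. The function names `detect_device` and `select_pytorch_nightly` are illustrative and do not appear in install/install_requirements.sh, and the detection uses a plain `command -v` check rather than the `-x` test in the diff.

```shell
# Illustrative only: these functions are not in install_requirements.sh.

# Report "xpu" when the xpu-smi tool is on PATH, "default" otherwise.
detect_device() {
  if command -v xpu-smi >/dev/null 2>&1; then
    echo "xpu"
  else
    echo "default"
  fi
}

# Pick the torch nightly tag for the given device.
select_pytorch_nightly() {
  if [ "$1" = "xpu" ]; then
    # torchvision 0.20.0.dev20241002+xpu depends on torch==2.6.0.dev20241001
    echo "dev20241001"
  else
    echo "dev20241002"
  fi
}

PYTORCH_NIGHTLY_VERSION="$(select_pytorch_nightly "$(detect_device)")"
echo "torch==2.6.0.${PYTORCH_NIGHTLY_VERSION}"
```

Keeping the XPU pin in its own branch means the default pin can move forward without touching the XPU one until a matching torchvision build exists.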

Jack-Khuu · 2024-11-12T09:19:06Z

install/install_requirements.sh

+  REQUIREMENTS_TO_INSTALL=(
+    torch=="2.6.0.${PYTORCH_NIGHTLY_VERSION}"
+    torchvision=="0.20.0.${VISION_NIGHTLY_VERSION}"
+    torchtune=="0.3.1"


Context on the varying tune version?

jenniew · Nov 13, 2024

The xpu nightly URL does not have a nightly version of torchtune, so I just install the 0.3.1 release for the xpu environment.

Jack-Khuu · Nov 13, 2024

Hmm, we should add support for nightly; let me ping some torchtune folks.

cc: @ebsmothers

Jack-Khuu · 2024-11-12T09:26:07Z

Welcome to torchchat and thanks for adding this, @jenniew!!

Super stoked to see that it didn't require much lift to get XPU set up. Added a few questions on the versioning difference.
I'll try to get some folks to test and see if we can get some recurring jobs set up.

What device did you test on btw?

Tagging a few folks I'm trying to help take on a larger role

Jack-Khuu · 2024-11-12T22:18:16Z

@jenniew Also do you mind filling out the CLA? It'll allow you to contribute to Meta repos

jenniew · 2024-11-13T20:07:10Z

> @jenniew Also do you mind filling out the CLA? It'll allow you to contribute to Meta repos

Yes, I just signed the CLA

jenniew · 2024-11-13T23:09:00Z

> Welcome to torchchat and thanks for adding this, @jenniew!!
>
> Super stoked to see that it didn't require much lift to get XPU set up. Added a few questions on the versioning difference. I'll try to get some folks to test and see if we can get some recurring jobs set up.
>
> What device did you test on btw?
>
> Tagging a few folks I'm trying to help take on a larger role

I tested on Intel Data Center GPU Max 1100.

Jack-Khuu · 2024-11-15T20:01:48Z

Just an update:

I'm looking into getting the PT pin bumped (Multi Pin Bumps across PT/AO/tune/ET #1367), which should allow XPU to use a fresher version.
torchtune (https://github.com/pytorch/torchtune) just released 0.4.0; I'm asking someone to bump torchchat to it so XPU can have an updated version there as well.

mikekgfb · 2024-12-10T05:22:15Z

It would be nice to run this as a test as well. It could be as easy as adding and enabling an xpu runner in test-readme-pr.yml, if one is available. Alternatively, it may require a copy of that file, because the test spec has several target-runner-related fields, all of which would need values representative of the xpu environment:

      runner: linux.g5.4xlarge.nvidia.gpu
      gpu-arch-type: cuda
      gpu-arch-version: "12.1"

jenniew added 2 commits November 6, 2024 22:48

add xpu

778efd6

add xpu device

7c4e42b

jenniew added 2 commits November 9, 2024 06:29

update

6ed3cda

Merge branch 'main' of https://github.com/pytorch/torchchat into xpu_…

6e73400

…device

Jack-Khuu requested review from vmpuri, Jack-Khuu and Gasoonjia November 12, 2024 09:19

Jack-Khuu reviewed Nov 12, 2024

Merge branch 'main' into xpu_device

4735bff

facebook-github-bot added the "CLA Signed" label (this label is managed by the Meta Open Source bot) Nov 13, 2024

jenniew added 2 commits November 13, 2024 22:58

merge

6ef7cd5

Merge branch 'xpu_device' of https://github.com/jenniew/torchchat int…

4d16351

…o xpu_device

Jack-Khuu mentioned this pull request Nov 13, 2024

Can we get XPU Nightlies? pytorch/torchtune#2005

Open
