Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DO NOT LAND TEST BUTTERFLY BOT #1301

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

jcaip
Copy link
Contributor

@jcaip jcaip commented Nov 18, 2024

Summary: Created from CodeHub with https://fburl.com/edit-in-codehub

Reviewed By: Dustinpro

Differential Revision: D66103735

Summary: Created from CodeHub with https://fburl.com/edit-in-codehub

Reviewed By: Dustinpro

Differential Revision: D66103735
Copy link

pytorch-bot bot commented Nov 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1301

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

❌ 11 New Failures

As of commit 16b16b8 with merge base d4ca98f (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 18, 2024
@facebook-github-bot
Copy link

This pull request was exported from Phabricator. Differential Revision: D66103735

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
* add pp_dim, distributed, num_gpus, num_nodes as cmd line args

* add tp_dim

* add elastic_launch

* working, can now launch from cli

* Remove numpy < 2.0 pin to align with pytorch (pytorch#1301)

Fix pytorch#1296

Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5

* Update torchtune pin to 0.4.0-dev20241010 (pytorch#1300)

Co-authored-by: vmpuri <[email protected]>

* Unbreak gguf util CI job by fixing numpy version (pytorch#1307)

Setting numpy version to be the range required by gguf: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/pyproject.toml

* Remove apparently-unused import torchvision in model.py (pytorch#1305)

Co-authored-by: vmpuri <[email protected]>

* remove global var for tokenizer type + patch tokenizer to allow list of sequences

* make pp tp visible in interface

* Add llama 3.1 to dist_run.py

* [WIP] Move dist inf into its own generator

* Add initial generator interface to dist inference

* Added generate method and placeholder scheduler

* use prompt parameter for dist generation

* Enforce tp>=2

* Build tokenizer from TokenizerArgs

* Disable torchchat format + constrain possible models for distributed

* disable calling dist_run.py directly for now

* Restore original dist_run.py for now

* disable _maybe_parallelize_model again

* Reenable arg.model_name in dist_run.py

* Use singleton logger instead of print in generate

* Address PR comments; try/expect in launch_dist_inference; added comments

---------

Co-authored-by: lessw2020 <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants