DO NOT LAND TEST BUTTERFLY BOT #1301

jcaip · 2024-11-18T17:46:08Z

Summary: Created from CodeHub with https://fburl.com/edit-in-codehub

Reviewed By: Dustinpro

Differential Revision: D66103735


pytorch-bot · 2024-11-18T17:46:12Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1301

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

❌ 11 New Failures

As of commit 16b16b8 with merge base d4ca98f:

NEW FAILURES - The following jobs have failed:

Build Docs / build_docs (3.11) (gh)
Process completed with exit code 2.
PR Label Check / Check PR Labels (gh)
##[error]This PR requires at least one label starting with 'topic:'. Available topics can be found at: https://github.com/pytorch/ao/labels?q=topic
Run Float8 Tests / test (SM-89, linux.g6.4xlarge.experimental.nvidia.gpu, --pre torch --index-url https://download.p... / linux-job (gh)
RuntimeError: Command docker exec -t 73a38cbb3186ca9823ee44a972ea1452fd1ebecfbc7bebad7761937c2b566f02 /exec failed with exit code 2
Run Regression Tests / test (CPU 2.3, linux.4xlarge, torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
RuntimeError: Command docker exec -t b420efa6fbbcf607d96191f204f6ffd1302153bafc00a9f4860a7067d0bc4a65 /exec failed with exit code 2
Run Regression Tests / test (CPU 2.4, linux.4xlarge, torch==2.4.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
RuntimeError: Command docker exec -t 8f383cd0c65a1d7aba183d4dc54468c04cb4f7f7a197c9641c8e62ac94099492 /exec failed with exit code 2
Run Regression Tests / test (CPU 2.5.1, linux.4xlarge, torch==2.5.1 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
RuntimeError: Command docker exec -t 8896ecd60f270feb9a14037e95d14bade571a50b512ede9f42d34df83f1d60c9 /exec failed with exit code 2
Run Regression Tests / test (CPU Nightly, linux.4xlarge, --pre torch==2.6.0.dev20241101 --index-url https://download.pyt... / linux-job (gh)
RuntimeError: Command docker exec -t 5cd39e9cb2de849e047bf5ab67f40ddc507f9b852f05d0fe12a251772336483e /exec failed with exit code 2
Run Regression Tests / test (CUDA 2.3, linux.g5.12xlarge.nvidia.gpu, torch==2.3.0, cuda, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t 071df103cefafa501c0842254d0165ebf466ca1eff33f88af26c8efdc9ecad43 /exec failed with exit code 2
Run Regression Tests / test (CUDA 2.4, linux.g5.12xlarge.nvidia.gpu, torch==2.4.0, cuda, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t d6a3a63050b6b7831bd464744b069290d0f3297526a5613f46f249e587c9696b /exec failed with exit code 2
Run Regression Tests / test (CUDA 2.5.1, linux.g5.12xlarge.nvidia.gpu, torch==2.5.1 --index-url https://download.pytorch... / linux-job (gh)
RuntimeError: Command docker exec -t dd2acaebe007605532d3ad99156e00e7ad3621eb3b57c530138d6a485e6f4d42 /exec failed with exit code 2
Run Regression Tests / test (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch==2.6.0.dev20241101 --index-url http... / linux-job (gh)
RuntimeError: Command docker exec -t 96a52e2df7ae6b019790f432e1af2ecf46d5d8b97366f84b45dd39eb499ae5da /exec failed with exit code 2

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-11-18T17:46:39Z

This pull request was exported from Phabricator. Differential Revision: D66103735

Fix pytorch#1296 Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5

* add pp_dim, distributed, num_gpus, num_nodes as cmd line args
* add tp_dim
* add elastic_launch
* working, can now launch from cli
* Remove numpy < 2.0 pin to align with pytorch (pytorch#1301)
  Fix pytorch#1296
  Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5
* Update torchtune pin to 0.4.0-dev20241010 (pytorch#1300)
  Co-authored-by: vmpuri <[email protected]>
* Unbreak gguf util CI job by fixing numpy version (pytorch#1307)
  Setting numpy version to be the range required by gguf: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/pyproject.toml
* Remove apparently-unused import torchvision in model.py (pytorch#1305)
  Co-authored-by: vmpuri <[email protected]>
* remove global var for tokenizer type + patch tokenizer to allow list of sequences
* make pp tp visible in interface
* Add llama 3.1 to dist_run.py
* [WIP] Move dist inf into its own generator
* Add initial generator interface to dist inference
* Added generate method and placeholder scheduler
* use prompt parameter for dist generation
* Enforce tp>=2
* Build tokenizer from TokenizerArgs
* Disable torchchat format + constrain possible models for distributed
* disable calling dist_run.py directly for now
* Restore original dist_run.py for now
* disable _maybe_parallelize_model again
* Reenable arg.model_name in dist_run.py
* Use singleton logger instead of print in generate
* Address PR comments; try/expect in launch_dist_inference; added comments
---------
Co-authored-by: lessw2020 <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>

DO NOT LAND TEST BUTTERFLY BOT

16b16b8

Summary: Created from CodeHub with https://fburl.com/edit-in-codehub Reviewed By: Dustinpro Differential Revision: D66103735

facebook-github-bot added the CLA Signed label Nov 18, 2024

facebook-github-bot added the fb-exported label Nov 18, 2024

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Remove numpy < 2.0 pin to align with pytorch (pytorch#1301)

dd9747f

Fix pytorch#1296 Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5
