2.6 backport PR request list #8455

Open · tengyifei opened this issue Dec 5, 2024 · 8 comments

@tengyifei (Collaborator) commented Dec 5, 2024

This is a tracker for backports/cherry-picks into 2.6. For any PR you want to backport to 2.6, please reply with the following:

  • Original PR link (this PR should merge into master)
  • Reason to backport
  • 2.6 backport PR link (a separate PR should be created, and that PR should merge into r2.6)

This process is similar to the backport request thread for the 2.5 release: #7977

Please note the criteria for cherry-picking into 2.6: #7203

@tengyifei (Collaborator, Author) commented Dec 11, 2024

Original PR: @lsy323's libtpu pin update to 0.0.6: #8480
Reason: pick a stable libtpu release for Trillium.
Backport PR link: manually pushed to r2.6, since there is no delta between the two branches.

@yaochengji (Contributor) commented:

Original PR: Fix a DDP graph capture issue #8489
Reason: DDP produces incorrect results without this patch.
Backport PR link: #8500
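
For context, here is a minimal sketch (not the PR's regression test) of the DDP-on-XLA training pattern that graph capture applies to; it assumes a TPU/XLA environment with torch_xla installed:

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_backend  # registers the 'xla' process group backend
import torch_xla.distributed.xla_multiprocessing as xmp

def _mp_fn(index):
    dist.init_process_group('xla', init_method='xla://')
    device = xm.xla_device()
    model = DDP(nn.Linear(8, 8).to(device), gradient_as_bucket_view=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(3):
        optimizer.zero_grad()
        loss = model(torch.randn(4, 8, device=device)).sum()
        loss.backward()
        optimizer.step()
        xm.mark_step()  # cut the lazy graph here; this is where capture/replay happens

if __name__ == '__main__':
    xmp.spawn(_mp_fn)
```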

@mcuiaws (Contributor) commented Dec 18, 2024

Original PR: Compute and hash buffer_donor_indices for step marker #8467
Reason: Fixes a tensor corruption issue caused by buffer donor indices not being included in the step-marker graph hash.
Backport PR: #8503

@mcuiaws (Contributor) commented Dec 18, 2024

Original PR: xm.save() should not set sync_xla_data=True when sync'ing. #8484
Reason: Fixes tensor corruption issues that are easily reproducible by running the Hugging Face tutorials.
Backport PR: #8504
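
For reference, a minimal sketch of the affected call path, assuming a working torch_xla install; xm.save() syncs XLA tensors to materialize them before serialization, and that sync is the code path the fix adjusts:

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
model = torch.nn.Linear(4, 4).to(device)

# xm.save() moves XLA tensors to CPU and serializes them; internally it
# triggers a sync, whose flags this PR changes.
xm.save(model.state_dict(), '/tmp/checkpoint.pt')
```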

@mcuiaws (Contributor) commented Dec 18, 2024

Original PR: Add xm.xla_device_kind() to return XLA device kind string. #8493
Reason: Requested by Neuron customers. The equivalent feature is available in JAX but not in PyTorch/XLA. Should be very low risk.
Backport PR: #8506
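
A usage sketch (the exact signature is defined in #8493; the return values shown are illustrative):

```python
import torch_xla.core.xla_model as xm

# Returns a hardware kind string for the current XLA device,
# analogous to jax.devices()[0].device_kind in JAX.
print(xm.xla_device_kind())  # e.g. 'TPU v4' on TPU, or a Neuron kind string
```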

@jeffhataws (Collaborator) commented:

Original PR: When modifying IR node, make sure to not lose the read_only bit #8505
Reason: Fixes a bug where 0-dimensional tensors result in aliasing errors.
Backport PR: TBD

@mcuiaws (Contributor) commented Dec 20, 2024

> Original PR: When modifying IR node, make sure to not lose the read_only bit #8505
> Reason: Fixes a bug where 0-dimensional tensors result in aliasing errors.
> Backport PR: TBD

Backport PR: #8508
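
For context, a hedged sketch of the kind of pattern involved (illustrative, not the PR's actual reproducer): an in-place update of a 0-dimensional XLA tensor, where the read_only bit factors into input/output buffer aliasing at the step boundary:

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
counter = torch.tensor(0.0, device=device)  # 0-dimensional tensor
counter += 1.0         # in-place update rewrites the underlying IR node
xm.mark_step()         # aliasing decisions are applied at the step boundary
print(counter.item())  # expected: 1.0
```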

@avizon-aws (Collaborator) commented Dec 20, 2024

Cherry-pick PR for softmax autocast:
Original PR: #8509
Reason: softmax was not part of the autocast policy, so it ran in BF16; the resulting precision loss leads to convergence issues. This PR adds softmax to the policy.
Cherry-pick PR: #8511
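
A minimal sketch of the behavior in question, assuming a torch_xla build that includes the fix; the printed dtype is illustrative:

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.randn(8, 128, device=device)

with torch.autocast('xla', dtype=torch.bfloat16):
    y = torch.softmax(x, dim=-1)

# With the fix, softmax is covered by the autocast policy and is expected to
# run at higher precision instead of silently executing in BF16.
print(y.dtype)
```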
