Issues: pytorch/torchtitan
Issues list
[question]can't disable CP for specific (unsupported) SDPA op
context_parallel
enhancement
New feature or request
#757
opened Dec 20, 2024 by
FindDefinition
JobConfig does not support typing
enhancement
New feature or request
#753
opened Dec 18, 2024 by
greeneggsandyaml
using fsdp2 wrapper Flux(text to image) model , gradient is inconsistent with fsdp1
question
Further information is requested
#734
opened Dec 13, 2024 by
yanmj0601
Issue: Loss Discrepancy Between FSDP1 and FSDP2 with AdamW Optimizer
question
Further information is requested
#724
opened Dec 9, 2024 by
Teng-xu
Context parallelism understanding
context_parallel
question
Further information is requested
#723
opened Dec 9, 2024 by
jinsong-mao
First Shard Group Save and Load Checkpoint for HSDP
question
Further information is requested
#709
opened Nov 29, 2024 by
qsh-zh
[rfc] torchtitan release practices
release_blocking
Issues that are blocking the milestone / release completion
torch.compile(sync_float8_amax_and_scale_history) not working with triton latest main
bug
Something isn't working
#681
opened Nov 19, 2024 by
goldhuang
[Parallelism] Implement vocabulary parallelism
enhancement
New feature or request
#680
opened Nov 15, 2024 by
casper-hansen
Any suggestion for Llama-3.1-70b(128k seq len) deploy mesh with torchtitan?
enhancement
New feature or request
question
Further information is requested
#678
opened Nov 15, 2024 by
medivh-xp
Very low wps with H200 Gpus
question
Further information is requested
#676
opened Nov 13, 2024 by
aniltrkkn
Questions about FSDP2 support and memory usage.
question
Further information is requested
#658
opened Oct 29, 2024 by
tangjiasheng
torch.distributed.breakpoint(rank=1) hangs because of --local-ranks-filter 0
documentation
Improvements or additions to documentation
#652
opened Oct 25, 2024 by
weifengpy
[Multimodal] Adding OBELICS DataLoader
enhancement
New feature or request
#650
opened Oct 24, 2024 by
TJ-Solergibert
[Config] Make FSDP reshard_after_forward: bool configurable
enhancement
New feature or request
#644
opened Oct 22, 2024 by
awgu
What is the expected inference steps after I apply torchao in training?
question
Further information is requested
#638
opened Oct 21, 2024 by
goldhuang
add H100 in CI
better_engineering
Repo code quality improvements
integration test
Adding integration tests
#632
opened Oct 18, 2024 by
tianyu-l
create a note on torchtitan official release
documentation
Improvements or additions to documentation
release_blocking
Issues that are blocking the milestone / release completion
Non-DP runs default to float32 precision
enhancement
New feature or request
#630
opened Oct 18, 2024 by
carmocca