🚀 Feature
Adding XLA as a backend for DeepSpeed.
Motivation
Most frameworks for working with TPUs, such as Accelerate, use multiprocessing at most. Even PyTorch/XLA only recently gained FSDP support, which is not equivalent to the pipeline parallelism available in DeepSpeed.
This is frustrating: enabling multi-node SPMD or FSDP with XLA is challenging due to the lack of documentation and examples. Given DeepSpeed's popularity and advanced features such as pipeline parallelism, adding XLA support is essential.
There is strong interest in the DeepSpeed community: multiple users have opened issues and commented asking for TPU support. This integration would fill a significant gap and let TPU users fully leverage DeepSpeed's capabilities.
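For context on the status quo described above, here is a minimal sketch of the PyTorch/XLA FSDP path (the toy `Linear` model and shapes are placeholders): it shards parameters across replicas, but offers nothing like DeepSpeed's pipeline parallelism.

```python
import torch
import torch_xla.core.xla_model as xm
from torch_xla.distributed.fsdp import XlaFullyShardedDataParallel as FSDP

device = xm.xla_device()

# Toy model stands in for a real network; shapes are arbitrary.
model = FSDP(torch.nn.Linear(1024, 1024).to(device))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

inputs = torch.randn(8, 1024, device=device)
loss = model(inputs).sum()
loss.backward()
optimizer.step()  # FSDP reduce-scatters gradients itself; plain step, not xm.optimizer_step()
xm.mark_step()    # flush the lazily traced XLA graph to the device
```

In practice this runs once per replica via torch_xla's multiprocessing launcher (e.g. `xmp.spawn`) or torchrun.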
Pitch
I am willing to spearhead this effort and integrate XLA with DeepSpeed, even without external assistance. A PR will be opened as soon as basic tests pass, and this request will be updated with progress. A rough sketch of one possible integration point follows.
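DeepSpeed exposes an accelerator abstraction (`deepspeed.accelerator.abstract_accelerator.DeepSpeedAccelerator`) that hardware backends implement. The partial class below is an assumption-laden illustration of where an XLA backend could hook in, not a working implementation; a real backend must fill in the full interface (communication, memory stats, op builders, and so on), and the `'xla'` communication backend name is hypothetical.

```python
import torch_xla.core.xla_model as xm
from deepspeed.accelerator.abstract_accelerator import DeepSpeedAccelerator

class XLA_Accelerator(DeepSpeedAccelerator):
    """Hypothetical XLA backend; only a handful of the required methods are sketched."""

    def __init__(self):
        self._name = 'xla'
        # Assumption: an XLA-aware backend name for deepspeed.comm to dispatch on.
        self._communication_backend_name = 'xla'

    def is_available(self):
        try:
            return xm.xla_device() is not None
        except RuntimeError:
            return False

    def device_name(self, device_index=None):
        return 'xla' if device_index is None else f'xla:{device_index}'

    def device(self, device_index=None):
        return xm.xla_device(device_index)

    def current_device(self):
        return xm.xla_device()

    def synchronize(self, device_index=None):
        # XLA executes lazily; flushing the pending graph is the closest
        # analogue to a device synchronize.
        xm.mark_step()
```

Assuming the remaining abstract methods are implemented, registration would then be a call to `deepspeed.accelerator.set_accelerator(XLA_Accelerator())` before `deepspeed.initialize`.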
Alternatives
PyTorch/XLA + Torchrun
HuggingFace Accelerate
These are viable alternatives but lack the advanced parallelism and optimization capabilities offered by DeepSpeed (see the Accelerate sketch below).
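To make that gap concrete, here is a minimal Accelerate sketch (toy model and shapes are placeholders; launch with `accelerate launch` on a TPU VM): it provides plain data parallelism on TPU, with no pipeline or ZeRO-style partitioning.

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # detects the TPU via torch_xla when launched on one
model = torch.nn.Linear(1024, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, optimizer = accelerator.prepare(model, optimizer)

inputs = torch.randn(8, 1024, device=accelerator.device)
loss = model(inputs).sum()
accelerator.backward(loss)
optimizer.step()  # on TPU the prepared optimizer routes through xm.optimizer_step
optimizer.zero_grad()
```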
Additional context
The integration of XLA with DeepSpeed could open up exciting possibilities for TPU users, such as easier multi-node support, pipeline parallelism, and other optimizations that DeepSpeed provides.