
Torchrec sharding plan setting #2642

Open
shan-jiang-faire opened this issue Dec 17, 2024 · 2 comments

@shan-jiang-faire

I'm using TorchRec for model parallelization. Is it possible to manually configure a group of sparse features so they are sharded on the same machine? For example, I have 3 sparse features A, B, and C. They interact frequently in my model, so I'd like them to be on the same GPU to reduce communication between machines. Is there a way to do this in TorchRec?

Thanks very much!

@sarckk
Member

sarckk commented Dec 20, 2024

Hi @shan-jiang-faire, I'm assuming these 3 features are in separate embedding tables? AFAIK, if you're using the TorchRec planner to generate a sharding plan, there isn't an easy way to enforce that all 3 tables end up on the same GPU/rank. However, you can use the construct_module_sharding_plan API [source] to manually define a sharding plan in which the 3 tables are table-wise sharded on the same rank.

Example:

from torchrec.distributed.sharding_plan import (
    construct_module_sharding_plan,
    table_wise,
)

# Place all three tables on rank 0 with table-wise sharding so they share a GPU.
plan = construct_module_sharding_plan(
    ebc,
    {
        "table_1": table_wise(rank=0),
        "table_2": table_wise(rank=0),
        "table_3": table_wise(rank=0),
    },
)
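
For reference, a minimal sketch of applying such a manually constructed plan, assuming `model` is the unsharded model that holds the EmbeddingBagCollection under the attribute name "ebc" and that the process group and CUDA device have already been initialized (the names `model` and "ebc" are placeholders for your own module path):

import torch
from torchrec.distributed.model_parallel import DistributedModelParallel
from torchrec.distributed.types import ShardingPlan

# Wrap the module-level plan in a ShardingPlan keyed by the module's
# fully qualified name inside the model, then pass it to DMP so the
# three tables are co-located on rank 0 as specified above.
sharded_model = DistributedModelParallel(
    module=model,
    plan=ShardingPlan(plan={"ebc": plan}),
    device=torch.device("cuda"),
)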

Would that work for your use case?

@shan-jiang-faire
Author

Thank you very much! Yes, they are in separate tables. I think this is what I need. Let me try it. Thanks again!
