Describe the feature you'd like
I would like more customizability in the SageMaker Training Toolkit through a new parameter, custom_override: a dict that can be used to override different commands within the toolkit.
How would this feature be used? Please describe.
This would give users more control over their training jobs. For instance, you could use a custom launcher such as torch.distributed.launch, horovodrun, or deepspeed. Users could also add overrides for special hyperparameters or other features within custom or prebuilt Docker images.
Describe alternatives you've considered
You could technically create a workaround where you scrape the hyperparameters for a CUSTOM_OVERRIDE flag, remove it from the hyperparameters provided by the SDK, then modify all the commands accordingly. This would definitely be a "hackish" solution and would be harder to communicate to users.
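To make the workaround concrete, here is a rough sketch of what it might look like. It does not use any toolkit API: `split_override`, `build_command`, the `launcher` key, and the JSON encoding of the CUSTOM_OVERRIDE value are all hypothetical names and choices made for illustration, and how the hyperparameters dict is obtained inside the container is left out.

```python
import json
import shlex

# Hypothetical hyperparameter name carrying the overrides (not a toolkit feature).
OVERRIDE_KEY = "CUSTOM_OVERRIDE"


def split_override(hyperparameters):
    """Return (hyperparameters without the override flag, override dict)."""
    remaining = dict(hyperparameters)
    raw = remaining.pop(OVERRIDE_KEY, None)
    if raw is None:
        return remaining, {}
    # Assumption: the override arrives either as a JSON string or as a dict.
    override = json.loads(raw) if isinstance(raw, str) else dict(raw)
    return remaining, override


def build_command(entry_point, hyperparameters, default_launcher=("python",)):
    """Build the training command, swapping in a custom launcher if requested."""
    remaining, override = split_override(hyperparameters)
    if "launcher" in override:
        launcher = shlex.split(override["launcher"])
    else:
        launcher = list(default_launcher)
    args = []
    for key, value in sorted(remaining.items()):
        args += [f"--{key}", str(value)]
    return launcher + [entry_point] + args


cmd = build_command(
    "train.py",
    {"epochs": 3, "CUSTOM_OVERRIDE": '{"launcher": "horovodrun -np 4 python"}'},
)
print(cmd)
```

Even in this small form, the flag has to be stripped before the remaining hyperparameters reach the user script, and every command that gets built needs the same treatment, which is why a first-class custom_override parameter would be cleaner.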