CPUOffloadOptimizer incompatible with learning rate schedulers #959
constant:
other info about the optim config. not sure it is helpful.
Hello @bghira, for the CPU offload optimizer, you have to update the LR manually. Yeah, perhaps we should update the docs to make this clearer. Let me know if you still have problems.
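A minimal sketch of what the manual update might look like, assuming torchao's CPUOffloadOptimizer (imported here from torchao.prototype.low_bit_optim) exposes param_groups like a regular optimizer and a CUDA device is available; the linear-decay schedule is just illustrative:

```python
import torch
from torchao.prototype.low_bit_optim import CPUOffloadOptimizer

model = torch.nn.Linear(128, 128).cuda()  # params stay on GPU; optimizer state is offloaded to CPU
optimizer = CPUOffloadOptimizer(model.parameters(), torch.optim.AdamW, lr=1e-3)

base_lr = 1e-3
num_steps = 1000

for step in range(num_steps):
    # Compute the desired LR for this step (illustrative linear decay).
    lr = base_lr * (1 - step / num_steps)
    # Instead of a torch.optim LR scheduler, write the LR into param_groups directly.
    for group in optimizer.param_groups:
        group["lr"] = lr

    loss = model(torch.randn(4, 128, device="cuda")).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```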
@bghira We can continue the discussion here (instead of at the PR) for better visibility.

I don't think the CPU offload optimizer is mentioned as a drop-in replacement (though that doesn't mean it shouldn't be one). There are already many other caveats that I believe users should be aware of. Regarding the LR schedule issue, as I mentioned in the PR, the problem is that PyTorch's LR scheduler is hard-coded to check that it receives a torch.optim.Optimizer subclass, and I don't want to make CPUOffloadOptimizer subclass torch.optim.Optimizer.

Just curious, from your perspective, would it be too much to ask users to also update the LR schedule code? Since you would already need to modify some code to use the CPU offload optimizer, it doesn't seem like much to also change the LR schedule code.
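For context, a small sketch of how that subclass check surfaces in practice (the exact error message may differ across PyTorch versions):

```python
import torch
from torch.optim.lr_scheduler import LambdaLR
from torchao.prototype.low_bit_optim import CPUOffloadOptimizer

model = torch.nn.Linear(8, 8).cuda()
optimizer = CPUOffloadOptimizer(model.parameters(), torch.optim.AdamW, lr=1e-3)

# CPUOffloadOptimizer wraps per-parameter optimizers rather than subclassing
# torch.optim.Optimizer, so the scheduler's isinstance check rejects it.
print(isinstance(optimizer, torch.optim.Optimizer))  # False

try:
    LambdaLR(optimizer, lr_lambda=lambda step: 1.0)
except TypeError as err:
    print(err)  # e.g. "CPUOffloadOptimizer is not an Optimizer"
```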
It would need to be propagated up to the huggingface library eventually, but I have plenty of local overrides.

In get_polynomial_decay_schedule_with_warmup: I'm not sure why this isn't working / exposed for external calls, as it works without the offload optimizer class.
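Until that lands upstream, one local override is to recompute the same schedule and write it into param_groups by hand. A sketch that mirrors the decay formula used by transformers' get_polynomial_decay_schedule_with_warmup (the helper name polynomial_decay_lr and the hyperparameter values are mine; the formula should be checked against the transformers version you run):

```python
import torch
from torchao.prototype.low_bit_optim import CPUOffloadOptimizer

def polynomial_decay_lr(step, lr_init, num_warmup_steps, num_training_steps,
                        lr_end=1e-7, power=1.0):
    """Absolute LR matching the multiplier-based schedule computed by
    transformers' get_polynomial_decay_schedule_with_warmup."""
    if step < num_warmup_steps:
        return lr_init * step / max(1, num_warmup_steps)
    if step > num_training_steps:
        return lr_end
    lr_range = lr_init - lr_end
    decay_steps = num_training_steps - num_warmup_steps
    pct_remaining = 1 - (step - num_warmup_steps) / decay_steps
    return lr_range * pct_remaining ** power + lr_end

model = torch.nn.Linear(16, 16).cuda()
optimizer = CPUOffloadOptimizer(model.parameters(), torch.optim.AdamW, lr=1e-3)

num_training_steps = 10_000
for step in range(num_training_steps):
    lr = polynomial_decay_lr(step, lr_init=1e-3, num_warmup_steps=500,
                             num_training_steps=num_training_steps)
    for group in optimizer.param_groups:
        group["lr"] = lr
    # ... forward / backward / optimizer.step() / optimizer.zero_grad() ...
```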