
load_in_8bit for model quantization #2178

Open
SUMEETRM opened this issue Dec 18, 2024 · 2 comments
@SUMEETRM

Does torchtune allow load_in_8bit or load_in_4bit before performing SFT or DPO fine-tuning on models? If not, what modifications are required to run training on quantized models?
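For context, the Hugging Face pattern the question refers to looks roughly like this (a minimal sketch using transformers + bitsandbytes; the model id is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the base weights to 4-bit NF4 at load time (the QLoRA recipe).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative model id
    quantization_config=bnb_config,
)
```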

@felipemello1 (Contributor) commented Dec 21, 2024

Hey @SUMEETRM, good question! Can you share a bit more about why you would like to do this? Is it for something like QLoRA, or would you like to train in 8-bit/4-bit?

@ebsmothers , do you have any insights on this one?

@ebsmothers (Contributor)

@SUMEETRM thanks for creating the issue. We currently do not support training in precision lower than 16-bit, as it generally leads to poor performance. However, load_in_4bit (as used in Hugging Face) also covers QLoRA-style NF4 quantization, which we do support. If you can share a bit more about what you're trying to do, I'd be happy to give some more pointers here.
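For reference, a QLoRA setup in torchtune looks roughly like the sketch below. The builder name `qlora_llama3_1_8b` and its keyword arguments are assumptions based on recent torchtune versions; check `torchtune.models` for the builders available in yours.

```python
from torchtune.models.llama3_1 import qlora_llama3_1_8b  # assumed builder name

# Base weights are stored in NF4; only the LoRA adapter
# parameters are trained in higher precision.
model = qlora_llama3_1_8b(
    lora_attn_modules=["q_proj", "v_proj"],  # attention projections that get adapters
    lora_rank=8,
    lora_alpha=16,
)
```

In practice the bundled QLoRA recipes wire this up for you, e.g. `tune run lora_finetune_single_device --config llama3_1/8B_qlora_single_device` (config name may vary by version).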
