
load_in_8bit for model quantization #2178

Open
SUMEETRM opened this issue Dec 18, 2024 · 2 comments
@SUMEETRM

Does torchtune allow load_in_8bit or load_in_4bit before performing SFT or DPO fine-tuning on models? If not, what modifications are required to run training on quantized models?
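For context, the Hugging Face pattern the question refers to looks roughly like this (a minimal sketch using transformers + bitsandbytes; the model id is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the base weights to 4-bit NF4 at load time (the QLoRA recipe).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative model id
    quantization_config=bnb_config,
)
```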

@felipemello1 (Contributor) commented Dec 21, 2024

Hey @SUMEETRM, good question! Can you share a bit more about why you would like to do this? Is it for something like QLoRA, or would you like to train in 8-bit/4-bit?

@ebsmothers , do you have any insights on this one?

@ebsmothers (Contributor)

@SUMEETRM thanks for creating the issue. We currently do not support training in precision lower than 16-bit, as it generally leads to poor performance. However, load_in_4bit (as used in Hugging Face) also covers QLoRA-style NF4 quantization, which we do support. If you can share a bit more about what you're trying to do, I'd be happy to give some more pointers here.
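For reference, a QLoRA setup in torchtune looks roughly like the sketch below. The builder name `qlora_llama3_1_8b` and its keyword arguments are assumptions based on recent torchtune versions; check `torchtune.models` for the builders available in yours.

```python
from torchtune.models.llama3_1 import qlora_llama3_1_8b  # assumed builder name

# Base weights are stored in NF4; only the LoRA adapter
# parameters are trained in higher precision.
model = qlora_llama3_1_8b(
    lora_attn_modules=["q_proj", "v_proj"],  # attention projections that get adapters
    lora_rank=8,
    lora_alpha=16,
)
```

In practice the bundled QLoRA recipes wire this up for you, e.g. `tune run lora_finetune_single_device --config llama3_1/8B_qlora_single_device` (config name may vary by version).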
