Does the Huawei Ascend NPU support QLoRA training? #6452

Open
1 task done
sunxiaoyu12 opened this issue Dec 26, 2024 · 3 comments
Labels
npu This problem is related to NPU devices pending This problem is yet to be addressed

Comments

@sunxiaoyu12

sunxiaoyu12 commented Dec 26, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

llamafactory version: 0.9.2.dev0

  • Platform: Linux-5.15.0-25-generic-aarch64-with-glibc2.34
  • Python version: 3.8.20
  • PyTorch version: 2.1.0 (NPU)
  • Transformers version: 4.46.1
  • Datasets version: 3.1.0
  • Accelerate version: 1.0.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • NPU type: Ascend910B3
  • CANN version: 8.0.RC1
  • DeepSpeed version: 0.15.4

Reproduction

```
llamafactory-cli train \
  --stage sft \
  --do_train True \
  --model_name_or_path /LLaMA-Factory-main/model/Qwen2.5-7B-Instruct \
  --preprocessing_num_workers 2 \
  --finetuning_type lora \
  --template qwen \
  --flash_attn fa2 \
  --dataset_dir /LLaMA-Factory-main/data \
  --dataset alpaca_zh_demo \
  --cutoff_len 1024 \
  --learning_rate 0.000001 \
  --num_train_epochs 5 \
  --max_samples 1000 \
  --per_device_train_batch_size 4 \
  --gradient_accumulation_steps 2 \
  --lr_scheduler_type cosine \
  --logging_steps 5 \
  --save_steps 100 \
  --output_dir saves/Qwen2.5-7B-Instruct/qlora/sft/train_2024-12-26-01 \
  --quantization_bit 4 \
  --quantization_method bitsandbytes \
  --deepspeed cache/ds_z3_config.json
```

Exception stack trace:
[image: e6f46844a54300811f266aac02f1ca8a]

Expected behavior

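I installed the NPU build of bitsandbytes by following the official installation guide (https://huggingface.co/docs/bitsandbytes/v0.45.0/en/installation?backend=Ascend+NPU#multi-backend-compile), but QLoRA training still fails with an error.
Is QLoRA training on NPU currently supported? If so, which dependencies need to be installed?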
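
When narrowing down a dependency question like this one, it can help to first confirm which of the packages listed under System Info are actually importable in the training environment. The following is only a rough sketch: the package list is taken from this report, `torch_npu` is assumed to be the import name of the Ascend PyTorch adapter, and `find_missing` is a hypothetical helper. A package being importable does not by itself show that its NPU backend works.

```python
import importlib.util

# Import names of the packages mentioned in this report. `torch_npu` is an
# assumption about how the Ascend PyTorch adapter is imported.
PACKAGES = ["torch", "torch_npu", "transformers", "peft", "bitsandbytes", "deepspeed"]


def find_missing(packages):
    """Return the import names that cannot be located in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(PACKAGES)
    print("missing:", ", ".join(missing) if missing else "none")
```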

Others

No response

@github-actions github-actions bot added pending This problem is yet to be addressed npu This problem is related to NPU devices labels Dec 26, 2024
@sunxiaoyu12
Author

Ascend does not currently support QLoRA.

@hiyouga
Owner

hiyouga commented Dec 26, 2024

@statelesshz Could you help with this problem?

@statelesshz
Contributor

statelesshz commented Dec 26, 2024

@sunxiaoyu12 Please install transformers from source, using the main branch.
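
A source install of the main branch is commonly done with `pip install git+https://github.com/huggingface/transformers`. One hedged way to check afterwards that the environment picked up a development build rather than the 4.46.1 release listed above is to look at the installed version string. The sketch below assumes that development builds carry a `.dev` marker, in the same way as the `0.9.2.dev0` llamafactory version in this report; both helper names are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def looks_like_dev_build(version):
    """Assumption: development builds carry a '.dev' marker, e.g. '0.9.2.dev0'."""
    return ".dev" in version


if __name__ == "__main__":
    version = installed_version("transformers")
    if version is None:
        print("transformers is not installed")
    else:
        print(version, "dev build" if looks_like_dev_build(version) else "release build")
```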


3 participants