# Alpaca Lora 4bit
Made some adjustments to the code in peft and GPTQ-for-LLaMa to make LoRA finetuning possible with a 4-bit base model. The same adjustments can be made for 2, 3, and 8 bits.
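A minimal sketch of the core idea (the class and argument names here are hypothetical, not this repo's actual API): the quantized base layer stays frozen, and only a small low-rank fp16 adapter trained on top of it receives gradients.

```python
import torch
import torch.nn as nn

class Lora4bitLinear(nn.Module):
    """Hypothetical illustration: frozen quantized layer + trainable LoRA adapter."""
    def __init__(self, quant_linear, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.quant_linear = quant_linear        # frozen 4-bit base layer
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)      # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        # Output of the quantized weights plus the trainable low-rank update
        return self.quant_linear(x) + self.lora_B(self.lora_A(x)) * self.scaling
```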
<br>
* Install manual by s4rduk4r: https://github.com/s4rduk4r/alpaca_lora_4bit_readme/blob/main/README.md (**NOTE:** don't use the install script; use requirements.txt instead.)
* Also remember to create a venv if you do not want your existing packages to be overwritten; a typical setup is sketched below.
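One common way to do that, assuming a POSIX shell and the requirements.txt mentioned above:

```sh
# Keep the pinned packages isolated from your system-wide installs
python -m venv venv
source venv/bin/activate    # on Windows: venv\Scripts\activate
pip install -r requirements.txt
```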
<br>
# Update Logs
* Resolved a numerical-instability issue.
<br>
* Reconstructing the fp16 matrix from the 4-bit data and calling torch.matmul greatly increased inference speed; see the sketch below.
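A rough sketch of that trick, assuming a GPTQ-style layout with eight 4-bit values packed into each int32 and per-column scales/zeros (the names and packing here are assumptions, not this repo's actual kernel):

```python
import torch

def dequant_matmul(x, qweight, scales, zeros):
    # qweight: (in_features // 8, out_features) int32, eight 4-bit values each
    # scales, zeros: (out_features,) fp16 quantization parameters
    shifts = torch.arange(0, 32, 4, dtype=torch.int32, device=qweight.device)
    # Unpack to (in_features, out_features) integers in [0, 15]
    w = (qweight.unsqueeze(1) >> shifts.view(1, -1, 1)) & 0xF
    w = w.reshape(-1, qweight.shape[1])
    # Reconstruct the fp16 matrix, then do one dense matmul
    w = (w.to(torch.float16) - zeros) * scales
    return torch.matmul(x, w)
```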
<br>
* Added an install script for Windows and Linux.
<br>
* Added gradient checkpointing (finetune.py updated). With it enabled, a 30B 4-bit model can now be finetuned on a single GPU with 24 GB of VRAM. It reduces training speed, so skip it if you have enough VRAM; a minimal illustration follows.
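A minimal illustration of the technique using torch.utils.checkpoint (not the actual wiring in finetune.py):

```python
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.ff = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.ff(x))

block = Block(1024)
x = torch.randn(2, 1024, requires_grad=True)
# Recompute this block's activations in the backward pass instead of
# storing them: lower peak VRAM, at the cost of the slowdown noted above.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```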
<br>
* Added install manual by s4rduk4r.
<br>
* Added pip install support by sterlind, in preparation for merging changes upstream.
<br>
* Added V2 model support (with groupsize; both inference and finetuning). A sketch of what groupsize means follows.
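A sketch over already-unpacked integer weights (the layout and names are assumptions, not the exact V2 checkpoint format): instead of one scale/zero pair per output column, every group of input rows gets its own pair, which improves quantization accuracy.

```python
import torch

def dequant_grouped(w_int, scales, zeros, groupsize=128):
    # w_int: (in_features, out_features) integers in [0, 15]
    # scales, zeros: (in_features // groupsize, out_features), one row per group
    g = torch.arange(w_int.shape[0], device=w_int.device) // groupsize
    # Each group of `groupsize` input rows shares one (scale, zero) per column
    return (w_int.to(torch.float16) - zeros[g]) * scales[g]
```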
# Requirements
gptq-for-llama: https://github.com/qwopqwop200/GPTQ-for-LLaMa<br>