Update README.md

John Smith 2023-04-22 16:09:38 +08:00 committed by GitHub
parent 33a76b00ca
commit 51bf103269
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 6 additions and 0 deletions

@@ -43,6 +43,12 @@ It's fast on a 3070 Ti mobile. Uses 5-6 GB of GPU RAM.
* Removed bitsandbytes from requirements
* Added pip installable branch based on winglian's PR
* Added cuda backend quant attention and fused mlp from GPTQ_For_Llama.
* Added lora patch for GPTQ_For_Llama triton backend.
```
from monkeypatch.gptq_for_llala_lora_monkey_patch import inject_lora_layers
inject_lora_layers(model, lora_path, device, dtype)
```
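A minimal sketch of how the call above might be wrapped, not part of the repository. `load_lora_patch` is a hypothetical helper name. It assumes the LoRA folder is PEFT-style and contains `adapter_config.json`, and that `inject_lora_layers` patches the model in place, since the snippet above does not use a return value.
```python
import importlib
from pathlib import Path


def load_lora_patch(model, lora_path, device, dtype):
    """Hypothetical wrapper: check the LoRA folder, then call inject_lora_layers."""
    lora_dir = Path(lora_path)
    # Assumption: the adapter folder is PEFT-style and holds adapter_config.json.
    if not (lora_dir / "adapter_config.json").is_file():
        raise FileNotFoundError(f"no adapter_config.json in {lora_dir}")
    # Imported lazily so the folder check runs even if the repo is not installed.
    patch = importlib.import_module("monkeypatch.gptq_for_llala_lora_monkey_patch")
    # Same positional arguments as the README call above.
    patch.inject_lora_layers(model, str(lora_dir), device, dtype)
```
With a model already loaded, a call could look like `load_lora_patch(model, "./my_lora", "cuda", torch.float16)`, where `./my_lora` is a placeholder path.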
# Requirements
gptq-for-llama <br>