Update README.md

2023-04-22 16:09:38 +08:00
parent 33a76b00ca
commit 51bf103269
1 changed file with 6 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -43,6 +43,12 @@ It's fast on a 3070 Ti mobile.  Uses 5-6 GB of GPU RAM.
 * Removed bitsandbytes from requirements
 * Added pip installable branch based on winglian's PR
 * Added cuda backend quant attention and fused mlp from GPTQ_For_Llama.
 * Added lora patch for GPTQ_For_Llama triton backend.
 ```
 from monkeypatch.gptq_for_llala_lora_monkey_patch import inject_lora_layers
 inject_lora_layers(model, lora_path, device, dtype)
 ```
 # Requirements
 gptq-for-llama <br>