John Smith
2f704b93c9
add test result
2023-04-26 18:02:43 +08:00
John Smith
73f51188bf
Update readme
2023-04-26 17:53:26 +08:00
John Smith
97804534b9
fix reference
2023-04-26 17:29:29 +08:00
John Smith
8e5cf08479
fix dependency
2023-04-26 17:17:59 +08:00
John Smith
42ef3484a9
fix _SentinelTokenStoppingCriteria
2023-04-26 17:13:56 +08:00
John Smith
d6791790ed
fix bug
2023-04-26 17:13:45 +08:00
John Smith
1abdc99675
add server
2023-04-26 17:13:00 +08:00
John Smith
633c28fd25
add quant attn v1 support
2023-04-25 12:30:03 +08:00
John Smith
f9c94f27cc
fix bug
2023-04-25 09:21:15 +08:00
John Smith
b5af5c00e1
optimize lora compute
2023-04-25 09:18:51 +08:00
John Smith
82bbea2729
optimized matmul for v2 model
2023-04-25 09:18:50 +08:00
John Smith
9fe5ab3642
fix bug
2023-04-22 17:24:07 +08:00
John Smith
4e42965c0d
Merge branch 'main' of github.com:johnsmith0031/alpaca_lora_4bit
2023-04-22 16:35:37 +08:00
John Smith
eb442494d1
optimize mem usage
2023-04-22 16:35:18 +08:00
John Smith
51bf103269
Update README.md
2023-04-22 16:09:38 +08:00
John Smith
33a76b00ca
Update README.md
2023-04-22 15:58:06 +08:00
John Smith
de3c91834e
optimized attention and mlp for performance, add lora monkey patch for models here and GPTQ_For_Llama models using optimization
2023-04-22 15:36:56 +08:00
John Smith
35caccd376
add assert
2023-04-21 10:24:58 +08:00
John Smith
1a0c63edaf
Update README.md
2023-04-20 10:04:13 +08:00
John Smith
a0a0962de7
Update README.md
2023-04-20 09:57:18 +08:00
John Smith
3b18aa1cc6
fix bug and remove bnb
2023-04-20 09:51:57 +08:00
John Smith
90e628121a
fix continue training for this version
2023-04-17 14:16:05 +08:00
John Smith
e64ff9facd
fix bug
2023-04-17 13:42:50 +08:00
John Smith
7a71b0dd12
fix bug when loading old lora model
2023-04-17 12:16:21 +08:00
John Smith
6739f529f5
Merge pull request #79 from wesleysanjose/main
Fix Dockerfile for No module named 'monkeypatch'
2023-04-15 00:53:59 +08:00
wesleysanjose
b8e2588fbf
Fix Dockerfile for No module named 'monkeypatch'
Traceback (most recent call last):
  File "/alpaca_lora_4bit/text-generation-webui/server.py", line 1, in <module>
    import custom_monkey_patch # apply monkey patch
  File "/alpaca_lora_4bit/text-generation-webui/custom_monkey_patch.py", line 6, in <module>
    from monkeypatch.peft_tuners_lora_monkey_patch import replace_peft_model_with_gptq_lora_model, Linear4bitLt
ModuleNotFoundError: No module named 'monkeypatch'
2023-04-14 01:27:44 -07:00
John Smith
fb7665726e
Update requirements.txt
Pinned commit hash
2023-04-13 14:44:59 +08:00
John Smith
9c3058c1de
fix bug
2023-04-13 11:34:53 +08:00
John Smith
76d7963dff
fix bug
2023-04-13 10:36:57 +08:00
John Smith
6aab31bd73
update reference
2023-04-13 10:35:10 +08:00
John Smith
5ff11b5bf2
Merge pull request #77 from winglian/upstream-peft
use monkey patch instead of forked peft
2023-04-13 10:25:05 +08:00
Wing Lian
f4b1dc19ab
additional fix
2023-04-12 06:54:23 -04:00
John Smith
17e6a1585f
Update README.md
2023-04-12 13:09:48 +08:00
John Smith
e946f830d4
minor fix
2023-04-12 13:06:30 +08:00
John Smith
4261bd8070
add xformers support
2023-04-12 12:59:44 +08:00
John Smith
7871baf311
fix bug on v1 finetune
2023-04-11 19:15:56 +08:00
John Smith
7762459f1f
Merge pull request #74 from andybarry/readme_fix
Fix readme typo
2023-04-10 21:38:06 +08:00
John Smith
68e1b35660
Merge pull request #73 from dnouri/fix-monkeypatch-v1
Bugfix in custom_monkey_patch for v1 models
2023-04-10 21:37:27 +08:00
Andy Barry
e590407c5f
Fix readme typo.
2023-04-10 08:56:05 -04:00
Daniel Nouri
ee7d94a1f3
Bugfix in custom_monkey_patch for v1 models
Previously generation would fail with:
  File "/alpaca_lora_4bit/text-generation-webui/matmul_utils_4bit.py", line 79, in _matmul4bit_v1_recons
    quant_cuda.vecquant4recons_v1(qweight, buffer, scales, zeros)
RuntimeError: expected scalar type Half but found Float
See #71
2023-04-10 12:41:16 +02:00
John Smith
5d3267d80d
add v1 model as default in custom monkey patch
2023-04-10 09:33:41 +08:00
Wing Lian
c2b33bacc9
use monkey patch instead of forked peft
2023-04-09 11:40:58 -04:00
John Smith
f185b90c3e
Update README.md
2023-04-09 12:50:49 +08:00
John Smith
4c18a56fc0
fix bug
2023-04-09 12:44:50 +08:00
John Smith
8cf3bd4086
add g_idx support on cuda backend
2023-04-09 12:26:22 +08:00
John Smith
b73f4e5e64
Merge pull request #64 from andybarry/readme_fix
Fix URL in readme
2023-04-09 11:15:23 +08:00
Andy Barry
b5d49cb9b1
Fix URL in readme.
2023-04-08 12:38:45 -04:00
John Smith
132c67be0d
Fix bug
2023-04-08 23:58:30 +08:00
John Smith
56e5bf2854
Merge pull request #63 from andybarry/dockerfile
Add a Dockerfile and readme changes for quick start
2023-04-08 15:48:27 +08:00
Andy Barry
a93cf1264a
Add timing on readme, remove useless line in dockerfile.
2023-04-08 01:54:29 -04:00