Commit Graph

158 Commits

Author SHA1 Message Date
John Smith 82bbea2729 optimized matmul for v2 model 2023-04-25 09:18:50 +08:00
John Smith 9fe5ab3642 fix bug 2023-04-22 17:24:07 +08:00
John Smith 4e42965c0d Merge branch 'main' of github.com:johnsmith0031/alpaca_lora_4bit 2023-04-22 16:35:37 +08:00
John Smith eb442494d1 optimize mem usage 2023-04-22 16:35:18 +08:00
John Smith 51bf103269 Update README.md 2023-04-22 16:09:38 +08:00
John Smith 33a76b00ca Update README.md 2023-04-22 15:58:06 +08:00
John Smith de3c91834e optimized attention and mlp for performance, add lora monkey patch for models here and GPTQ_For_Llama models using optimization 2023-04-22 15:36:56 +08:00
John Smith 35caccd376 add assert 2023-04-21 10:24:58 +08:00
John Smith 1a0c63edaf Update README.md 2023-04-20 10:04:13 +08:00
John Smith a0a0962de7 Update README.md 2023-04-20 09:57:18 +08:00
John Smith 3b18aa1cc6 fix bug and remove bnb 2023-04-20 09:51:57 +08:00
John Smith 90e628121a fix continue training for this version 2023-04-17 14:16:05 +08:00
John Smith e64ff9facd fix bug 2023-04-17 13:42:50 +08:00
John Smith 7a71b0dd12 fix bug when loading old lora model 2023-04-17 12:16:21 +08:00
John Smith 6739f529f5 Merge pull request #79 from wesleysanjose/main
Fix Dockerfile for No module named 'monkeypatch'
2023-04-15 00:53:59 +08:00
wesleysanjose b8e2588fbf Fix Dockerfile for No module named 'monkeypatch'
Traceback (most recent call last):
  File "/alpaca_lora_4bit/text-generation-webui/server.py", line 1, in <module>
    import custom_monkey_patch # apply monkey patch
  File "/alpaca_lora_4bit/text-generation-webui/custom_monkey_patch.py", line 6, in <module>
    from monkeypatch.peft_tuners_lora_monkey_patch import replace_peft_model_with_gptq_lora_model, Linear4bitLt
ModuleNotFoundError: No module named 'monkeypatch'
2023-04-14 01:27:44 -07:00
John Smith fb7665726e Update requirements.txt
Pinned commit hash
2023-04-13 14:44:59 +08:00
John Smith 9c3058c1de fix bug 2023-04-13 11:34:53 +08:00
John Smith 76d7963dff fix bug 2023-04-13 10:36:57 +08:00
John Smith 6aab31bd73 update reference 2023-04-13 10:35:10 +08:00
John Smith 5ff11b5bf2 Merge pull request #77 from winglian/upstream-peft
use monkey patch instead of forked peft
2023-04-13 10:25:05 +08:00
Wing Lian f4b1dc19ab addtional fix 2023-04-12 06:54:23 -04:00
John Smith 17e6a1585f Update README.md 2023-04-12 13:09:48 +08:00
John Smith e946f830d4 minor fix 2023-04-12 13:06:30 +08:00
John Smith 4261bd8070 add xformers support 2023-04-12 12:59:44 +08:00
John Smith 7871baf311 fix bug on v1 finetune 2023-04-11 19:15:56 +08:00
John Smith 7762459f1f Merge pull request #74 from andybarry/readme_fix
Fix readme typo
2023-04-10 21:38:06 +08:00
John Smith 68e1b35660 Merge pull request #73 from dnouri/fix-monkeypatch-v1
Bugfix in custom_monkey_patch for v1 models
2023-04-10 21:37:27 +08:00
Andy Barry e590407c5f Fix readme typo. 2023-04-10 08:56:05 -04:00
Daniel Nouri ee7d94a1f3 Bugfix in custom_monkey_patch for v1 models
Previously generation would fail with:

    File "/alpaca_lora_4bit/text-generation-webui/matmul_utils_4bit.py", line 79, in _matmul4bit_v1_recons
      quant_cuda.vecquant4recons_v1(qweight, buffer, scales, zeros)
  RuntimeError: expected scalar type Half but found Float

See #71
2023-04-10 12:41:16 +02:00
John Smith 5d3267d80d add v1 model as default in custom monkey patch 2023-04-10 09:33:41 +08:00
Wing Lian c2b33bacc9 use monkey patch instead of forked peft 2023-04-09 11:40:58 -04:00
John Smith f185b90c3e Update README.md 2023-04-09 12:50:49 +08:00
John Smith 4c18a56fc0 fix bug 2023-04-09 12:44:50 +08:00
John Smith 8cf3bd4086 add g_idx support on cuda backend 2023-04-09 12:26:22 +08:00
John Smith b73f4e5e64 Merge pull request #64 from andybarry/readme_fix
Fix URL in readme
2023-04-09 11:15:23 +08:00
Andy Barry b5d49cb9b1 Fix URL in readme. 2023-04-08 12:38:45 -04:00
John Smith 132c67be0d Fix bug 2023-04-08 23:58:30 +08:00
John Smith 56e5bf2854 Merge pull request #63 from andybarry/dockerfile
Add a Dockerfile and readme changes for quick start
2023-04-08 15:48:27 +08:00
Andy Barry a93cf1264a Add timing on readme, remove useless line in dockerfile. 2023-04-08 01:54:29 -04:00
Andy Barry 191d92c940 Clean up diff 2023-04-08 01:27:56 -04:00
Andy Barry 31614fc2c4 Move 7bn changes into dockerfile. 2023-04-08 01:21:17 -04:00
Andy Barry 2e5aaf6dd6 Merge readmes. 2023-04-08 01:14:54 -04:00
Andy Barry e854f5d111 Fix after merge. 2023-04-08 00:53:28 -04:00
Andy Barry 8435b2c7f2 Merge branch 'main' of https://github.com/johnsmith0031/alpaca_lora_4bit 2023-04-07 22:02:54 -04:00
John Smith f91d4cbb59 Update README.md 2023-04-07 16:10:36 +08:00
John Smith b01b10eb4d Colorized output 2023-04-07 15:58:38 +08:00
John Smith 32904da1ff fix bug on triton matmul 2023-04-07 15:50:55 +08:00
John Smith dba3773b30 add triton backend support for v2 model 2023-04-07 15:34:06 +08:00
John Smith 9351f49542 merge pull request in new branch 2023-04-07 10:40:24 +08:00