John Smith
35caccd376
add assert
2023-04-21 10:24:58 +08:00
John Smith
90e628121a
fix continuing training for this version
2023-04-17 14:16:05 +08:00
John Smith
e64ff9facd
fix bug
2023-04-17 13:42:50 +08:00
John Smith
7a71b0dd12
fix bug when loading old lora model
2023-04-17 12:16:21 +08:00
John Smith
5ff11b5bf2
Merge pull request #77 from winglian/upstream-peft
...
use monkey patch instead of forked peft
2023-04-13 10:25:05 +08:00
Wing Lian
f4b1dc19ab
additional fix
2023-04-12 06:54:23 -04:00
John Smith
4261bd8070
add xformers support
2023-04-12 12:59:44 +08:00
John Smith
7871baf311
fix bug on v1 finetune
2023-04-11 19:15:56 +08:00
Wing Lian
c2b33bacc9
use monkey patch instead of forked peft
2023-04-09 11:40:58 -04:00
John Smith
8cf3bd4086
add g_idx support on cuda backend
2023-04-09 12:26:22 +08:00
John Smith
dba3773b30
add triton backend support for v2 model
2023-04-07 15:34:06 +08:00
John Smith
9351f49542
merge pull request into a new branch
2023-04-07 10:40:24 +08:00
yamashi
c5aa7fb695
Update finetune.py
2023-04-07 00:43:36 +02:00
yamashi
3ea18575c7
Use flash attention monkeypatch
2023-04-06 13:49:12 +02:00
Andrey Glushenkov
f20570343f
GPTQv2 support
...
GPTQv2 support.
1. Adds a dependency on `triton`
2. Refactors autograd_4bit to include both GPTQv1 and GPTQv2
3. Introduces a new environment variable, GPTQ_VERSION, to select the autograd_4bit version (see the sketch after this entry)
4. Fixes triton kernels
5. Performs matrix multiplications in fp16
2023-04-06 02:29:36 +03:00
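A minimal sketch of the version-selection mechanism this commit describes, assuming the variable is read when the 4-bit backend is set up; only GPTQ_VERSION and the autograd_4bit module are named in the commit, and the helper below is a hypothetical illustration, not the repository's actual code.

```python
# Illustrative sketch only: GPTQ_VERSION and autograd_4bit are named in the
# commit; this helper and its name are assumptions, not the actual API.
import os

def selected_gptq_version() -> int:
    """Return 1 or 2 depending on the GPTQ_VERSION environment variable."""
    version = os.environ.get("GPTQ_VERSION", "1").strip()
    if version not in ("1", "2"):
        raise ValueError(f"Unsupported GPTQ_VERSION={version!r}, expected '1' or '2'")
    return int(version)

# Hypothetical usage:  GPTQ_VERSION=2 python finetune.py ...
if selected_gptq_version() == 2:
    pass  # take the GPTQv2 (triton-backed) code path in autograd_4bit
else:
    pass  # fall back to the GPTQv1 path
```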
John Smith
86387a0a35
update multi-GPU support in finetune.py
2023-04-03 23:55:58 +08:00
John Smith
f3a25342e1
fix device_map bug when using lora_apply_dir
2023-03-31 19:44:36 +08:00
Wing Lian
8791eaee9a
fix gpt4all training to more closely match the released logic; other small fixes and optimizations
2023-03-30 22:40:40 -04:00
Wing Lian
b7361da58a
better multi-GPU support; support gpt4all training data
2023-03-29 11:21:47 -04:00
John Smith
1c02d4262d
add resume checkpoint to continue a training run
2023-03-29 14:35:39 +08:00
John Smith
2a1cb42966
add padding support as an option
2023-03-29 11:20:16 +08:00
John Smith
0768d0fdff
update finetune data format
2023-03-28 21:45:33 +08:00
John Smith
211af574b6
fix bug
2023-03-28 21:12:51 +08:00
John Smith
bff039de95
add v2 model support
2023-03-28 20:33:55 +08:00
Wing Lian
62e54ac1c7
backward compatibility for pre-py3.10, add datasets requirement used in training
2023-03-27 16:08:20 -04:00
Star Dorminey
399c3d124e
Tested and should be ready!
2023-03-25 20:52:38 -07:00
kooshi
8e471516b8
distributed data parallelism with torchrun
2023-03-24 23:56:06 -05:00
kooshi
2bc64597aa
model parallelism
2023-03-24 23:03:43 -05:00
John Smith
0879580006
Merge branch 'main' into finetune-refactor
2023-03-25 10:29:02 +08:00
Andrey Glushenkov
397f5041c3
Reflect last changes in main
...
Reflect commits:
4906961bf1
60b227d0ba
2023-03-24 15:46:03 +03:00
Andrey Glushenkov
50dbb101e9
Refactor finetune.py
...
1. Add command-line argument support (see the sketch after this entry)
2. Add Stanford Alpaca-like dataset support, using code from https://github.com/tloen/alpaca-lora
3. Fix LoRA pre-train application
2023-03-24 14:15:07 +03:00
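A minimal sketch of what the command-line interface added in this refactor might look like; lora_apply_dir appears elsewhere in this log, while the other flag names and defaults are assumptions, not the script's actual interface.

```python
# Illustrative sketch only: flag names other than lora_apply_dir (which appears
# elsewhere in this log) are assumptions, not finetune.py's actual interface.
import argparse

def build_arg_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="4-bit LoRA finetuning")
    parser.add_argument("dataset", help="Path to the training data")
    parser.add_argument("--ds_type", default="alpaca",
                        help="Dataset format, e.g. a Stanford Alpaca-style instruction set")
    parser.add_argument("--lora_apply_dir", default=None,
                        help="Directory of a pre-trained LoRA to apply before training")
    return parser

if __name__ == "__main__":
    args = build_arg_parser().parse_args()
    print(args)
```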
John Smith
60b227d0ba
fix minor bug
2023-03-23 08:43:18 +00:00
John Smith
44978669cf
Add gradient checkpointing
2023-03-23 08:25:29 +00:00
John Smith
dc036373b2
add more scripts and adjust code for transformer branch
2023-03-22 04:09:04 +00:00