yamashi
|
778035152d
|
Update arg_parser.py
|
2023-04-07 00:42:34 +02:00 |
yamashi
|
2bf5d42f28
|
Add position_ids to flash attention
|
2023-04-06 17:46:15 +02:00 |
yamashi
|
7770e76c9c
|
Fix args of flash attention
|
2023-04-06 17:32:01 +02:00 |
yamashi
|
30bf938d03
|
Update requirements.txt
|
2023-04-06 13:50:25 +02:00 |
yamashi
|
7b18b39dd8
|
Create llama_flash_attn_monkey_patch.py
|
2023-04-06 13:49:36 +02:00 |
yamashi
|
3ea18575c7
|
Use flash attention monkeypatch
|
2023-04-06 13:49:12 +02:00 |
John Smith
|
8020b3ec3b
|
Update README.md
|
2023-04-06 13:57:32 +08:00 |
John Smith
|
9a02a88fb8
|
add patch for encode function to remove eos token at the beginning of left side
|
2023-04-06 12:56:27 +08:00 |
John Smith
|
085d9556f9
|
fix bug
|
2023-04-06 10:46:42 +08:00 |
John Smith
|
86387a0a35
|
update multi gpu support in finetune.py
|
2023-04-03 23:55:58 +08:00 |
John Smith
|
5655f218ed
|
add g_idx buffer.
add triton matmul utils for future support.
|
2023-04-02 21:29:06 +08:00 |
John Smith
|
f3a25342e1
|
fix device_map bug when using lora_apply_dir
|
2023-03-31 19:44:36 +08:00 |
John Smith
|
00bf0a1e1b
|
Update README.md
|
2023-03-31 14:17:35 +08:00 |
John Smith
|
dd0efc721f
|
Merge pull request #47 from winglian/better-gpt4all
fix gpt4all training to more closely match the released logic, other small fixes and optimizations
|
2023-03-31 11:20:03 +08:00 |
Wing Lian
|
8791eaee9a
|
fix gpt4all training to more closely match the released logic, other small fixes and optimizations
|
2023-03-30 22:40:40 -04:00 |
John Smith
|
878eada8dd
|
add amp_wrapper for autocast support.
|
2023-03-30 19:57:19 +08:00 |
John Smith
|
b3c91a5af5
|
Merge pull request #45 from winglian/fix-missing-bracket
fix missing paren
|
2023-03-30 13:53:55 +08:00 |
Wing Lian
|
e744aec8bf
|
fix missing paren
|
2023-03-29 23:40:30 -04:00 |
John Smith
|
8db4633d84
|
Update README.md
|
2023-03-30 11:24:25 +08:00 |
John Smith
|
8a62560e6c
|
add offload support
|
2023-03-30 11:21:21 +08:00 |
John Smith
|
32976f91c4
|
Merge pull request #42 from winglian/multigpu-fix
better multi-gpu support, support gpt4all training data
|
2023-03-30 00:03:27 +08:00 |
Wing Lian
|
b7361da58a
|
better multi-gpu support, support gpt4all training data
|
2023-03-29 11:21:47 -04:00 |
John Smith
|
0fdae9224c
|
optimized groupsize backward for performance
|
2023-03-29 17:44:51 +08:00 |
John Smith
|
5986649b37
|
Update README.md
|
2023-03-29 14:46:28 +08:00 |
John Smith
|
1c02d4262d
|
add resume checkpoint to continue a training
|
2023-03-29 14:35:39 +08:00 |
John Smith
|
2a1cb42966
|
add padding support as an option
|
2023-03-29 11:20:16 +08:00 |
John Smith
|
cff57ebfa4
|
Merge pull request #39 from winglian/fix-prompt-eos-token
properly include the eos token so inference doesn't blabber on
|
2023-03-29 10:35:46 +08:00 |
Wing Lian
|
daad59f8ef
|
properly include the eos token so inference doesn't blabber on
|
2023-03-28 20:53:16 -04:00 |
John Smith
|
1719bd0ce3
|
fix bug
|
2023-03-29 08:09:40 +08:00 |
John Smith
|
1043ded7d9
|
Merge branch 'main' of github.com:johnsmith0031/alpaca_lora_4bit
|
2023-03-29 01:26:20 +08:00 |
John Smith
|
d28ee06202
|
fix bug
|
2023-03-29 01:25:37 +08:00 |
John Smith
|
b5e3dae573
|
Merge pull request #34 from winglian/v2-fixes
fixes for most recent update
|
2023-03-28 23:49:56 +08:00 |
Wing Lian
|
b47da33084
|
fixes for most recent update
|
2023-03-28 10:56:35 -04:00 |
John Smith
|
234004ceb5
|
fix bug
|
2023-03-28 22:05:18 +08:00 |
John Smith
|
f26615fc0c
|
fix bug
|
2023-03-28 21:47:22 +08:00 |
John Smith
|
0768d0fdff
|
update finetune data format
|
2023-03-28 21:45:33 +08:00 |
John Smith
|
8a6c8661df
|
Merge branch 'main' of github.com:johnsmith0031/alpaca_lora_4bit
merged
|
2023-03-28 21:14:35 +08:00 |
John Smith
|
211af574b6
|
fix bug
|
2023-03-28 21:12:51 +08:00 |
John Smith
|
ac07457473
|
Update README.md
|
2023-03-28 20:44:02 +08:00 |
John Smith
|
bff039de95
|
add v2 model support
|
2023-03-28 20:33:55 +08:00 |
John Smith
|
667e43cb5b
|
Merge pull request #30 from winglian/features/python-fixes
backwards support for pre-py3.10, add datasets requirement used in train
|
2023-03-28 09:34:50 +08:00 |
Wing Lian
|
101d314bd9
|
add missing dependency to train with LlamaTokenizer
|
2023-03-27 16:13:46 -04:00 |
Wing Lian
|
62e54ac1c7
|
backwards support for pre-py3.10, add datasets requirement used in train
|
2023-03-27 16:08:20 -04:00 |
John Smith
|
6c8c07e7ad
|
Update README.md
|
2023-03-27 18:03:28 +08:00 |
John Smith
|
cf94d7af68
|
Update README.md
|
2023-03-27 17:52:35 +08:00 |
John Smith
|
1ca9b8abf8
|
Update README.md
|
2023-03-27 17:51:04 +08:00 |
John Smith
|
0b5b376de1
|
Merge pull request #23 from sterlind/star/repos
Get dependencies straight from pip!
|
2023-03-27 17:47:39 +08:00 |
Star Dorminey
|
399c3d124e
|
Tested and should be ready!
|
2023-03-25 20:52:38 -07:00 |
Star Dorminey
|
a2a4c1d117
|
Remove gitmodules.
|
2023-03-25 20:23:46 -07:00 |
Star Dorminey
|
96440c8717
|
Removing submodules actually.
|
2023-03-25 20:20:38 -07:00 |