| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| John Smith | f9c94f27cc | fix bug | 2023-04-25 09:21:15 +08:00 |
| John Smith | 9fe5ab3642 | fix bug | 2023-04-22 17:24:07 +08:00 |
| John Smith | de3c91834e | optimized attention and mlp for performance, add lora monkey patch for models here and GPTQ_For_Llama models using optimization | 2023-04-22 15:36:56 +08:00 |
| John Smith | 3b18aa1cc6 | fix bug and remove bnb | 2023-04-20 09:51:57 +08:00 |
| John Smith | 9c3058c1de | fix bug | 2023-04-13 11:34:53 +08:00 |
| John Smith | 5ff11b5bf2 | Merge pull request #77 from winglian/upstream-peft: use monkey patch instead of forked peft | 2023-04-13 10:25:05 +08:00 |
| John Smith | 4261bd8070 | add xformers support | 2023-04-12 12:59:44 +08:00 |
| Wing Lian | c2b33bacc9 | use monkey patch instead of forked peft | 2023-04-09 11:40:58 -04:00 |
| yamashi | 2bf5d42f28 | Add position_ids to flash attention | 2023-04-06 17:46:15 +02:00 |
| yamashi | 7770e76c9c | Fix args of flash attention | 2023-04-06 17:32:01 +02:00 |
| yamashi | 7b18b39dd8 | Create llama_flash_attn_monkey_patch.py | 2023-04-06 13:49:36 +02:00 |