| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| John Smith | f9c94f27cc | fix bug | 2023-04-25 09:21:15 +08:00 |
| John Smith | 9fe5ab3642 | fix bug | 2023-04-22 17:24:07 +08:00 |
| John Smith | de3c91834e | optimized attention and mlp for performance, add lora monkey patch for models here and GPTQ_For_Llama models using optimization | 2023-04-22 15:36:56 +08:00 |
| John Smith | 3b18aa1cc6 | fix bug and remove bnb | 2023-04-20 09:51:57 +08:00 |
| John Smith | 9c3058c1de | fix bug | 2023-04-13 11:34:53 +08:00 |
| John Smith | 5ff11b5bf2 | Merge pull request #77 from winglian/upstream-peft: use monkey patch instead of forked peft | 2023-04-13 10:25:05 +08:00 |
| John Smith | 4261bd8070 | add xformers support | 2023-04-12 12:59:44 +08:00 |
| Wing Lian | c2b33bacc9 | use monkey patch instead of forked peft | 2023-04-09 11:40:58 -04:00 |
| yamashi | 2bf5d42f28 | Add position_ids to flash attention | 2023-04-06 17:46:15 +02:00 |
| yamashi | 7770e76c9c | Fix args of flash attention | 2023-04-06 17:32:01 +02:00 |
| yamashi | 7b18b39dd8 | Create llama_flash_attn_monkey_patch.py | 2023-04-06 13:49:36 +02:00 |