alpaca_lora_4bit

Forkoz 58998acc9f Fix cuda kernel for Pascal & Cuda 6/6.1. When I left the other functions to use normal atomic add it seemed like a small speedup: 4.79 it/s vs 5.23 it/s		2023-03-23 07:33:57 -05:00
autograd_4bit.py	add more scripts and adjust code for transformer branch	2023-03-22 04:09:04 +00:00
gradient_checkpointing.py	Add gradient checkpointing	2023-03-23 08:25:29 +00:00
quant_cuda.cpp	add fast_4bit_matmul and auto switch 2 methods according to bottleneck	2023-03-21 08:43:07 +00:00
quant_cuda_kernel.cu	Fix cuda kernel for Pascal & Cuda 6/6.1	2023-03-23 07:33:57 -05:00