alpaca_lora_4bit/GPTQ-for-LLaMa
Latest commit: kooshi 8e471516b8, "distributed data parallelism with torchrun", 2023-03-24 23:56:06 -05:00
autograd_4bit.py           distributed data parallelism with torchrun                              2023-03-24 23:56:06 -05:00
gradient_checkpointing.py  Add gradient checkpointing                                              2023-03-23 08:25:29 +00:00
quant_cuda.cpp             add fast_4bit_matmul and auto switch 2 methods according to bottleneck  2023-03-21 08:43:07 +00:00
quant_cuda_kernel.cu       fix bug                                                                 2023-03-23 23:37:39 +08:00