Commit Graph

12 Commits

Author SHA1 Message Date
Forkoz 58998acc9f Fix cuda kernel for Pascal & Cuda 6/6.1 2023-03-23 07:33:57 -05:00
    When I left the other functions to use normal atomic add it seemed like a small speedup. 4.79 it/s vs 5.23 it/s
John Smith 44978669cf Add gradient checkpointing 2023-03-23 08:25:29 +00:00
John Smith dc036373b2 add more scripts and adjust code for transformer branch 2023-03-22 04:09:04 +00:00
John Smith a955a1c2a5 fix bug 2023-03-22 00:18:24 +08:00
John Smith ef0a326cec update autograd 2023-03-21 09:41:18 +00:00
John Smith 3471be4e56 add fast_4bit_matmul and auto switch 2 methods according to bottleneck 2023-03-21 08:43:07 +00:00
John Smith dd0d5a31f7 add half support 2023-03-20 09:37:51 +00:00
John Smith 5b64833390 add half support on cuda kernel 2023-03-20 09:19:05 +00:00
John Smith 04f5575a23 reduced memory usage by a little 2023-03-20 00:51:52 +08:00
John Smith 2b84b32fbe Update autograd_4bit.py 2023-03-18 22:13:11 +08:00
John Smith 6f4bbb40a9 Update autograd_4bit.py 2023-03-18 18:49:26 +08:00
John Smith 551f62a0e8 add patch for gptq and peft 2023-03-18 13:31:48 +08:00