From f185b90c3eda93a35dbac954dabea8bffdc08917 Mon Sep 17 00:00:00 2001
From: John Smith
Date: Sun, 9 Apr 2023 12:50:49 +0800
Subject: [PATCH] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 67fe70d..64e6f01 100644
--- a/README.md
+++ b/README.md
@@ -33,6 +33,7 @@ It's fast on a 3070 Ti mobile. Uses 5-6 GB of GPU RAM.
 * Added monkey patch for text generation webui for fixing initial eos token issue.
 * Added Flash attention support. (Use --flash-attention)
 * Added Triton backend to support model using groupsize and act-order. (Use --backend=triton)
+* Added g_idx support in cuda backend (need recompile cuda kernel)
 
 # Requirements
 gptq-for-llama