From 3be75bb3db8eee8c0b24d47c0b41a44bd7f457e0 Mon Sep 17 00:00:00 2001
From: John Smith
Date: Tue, 21 Mar 2023 16:49:08 +0800
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index bf8eac8..8174a4f 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@ Made some adjust for the code in peft and gptq for llama, and make it possible f
 ~Still numerically unstable.~ Resolved.
-Reconstruct fp16 matrix from 4bit data and call torch.matmul drastically increased the inference speed.
+Reconstruct fp16 matrix from 4bit data and call torch.matmul largely increased the inference speed.
 # Requirements
 gptq-for-llama: https://github.com/qwopqwop200/GPTQ-for-LLaMa
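The README line touched by this patch describes the speedup technique: reconstruct an fp16 weight matrix from packed 4-bit data once, then run an ordinary dense matmul on it instead of operating on the packed representation. A minimal numpy sketch of that idea follows; the packing layout (two values per byte, low nibble first) and the per-tensor `scale`/`zero` names are illustrative assumptions, not the actual GPTQ-for-LLaMa storage format.

```python
import numpy as np

def unpack_4bit(packed: np.ndarray) -> np.ndarray:
    """Unpack a uint8 array holding two 4-bit values per byte (low nibble first).

    Assumed layout for illustration only; the real GPTQ format differs.
    """
    low = packed & 0x0F          # lower nibble of each byte
    high = packed >> 4           # upper nibble of each byte
    # Interleave low/high so each byte expands into two consecutive values.
    return np.stack([low, high], axis=-1).reshape(*packed.shape[:-1], -1)

def dequantize(packed: np.ndarray, scale: np.float16, zero: np.float16) -> np.ndarray:
    """Reconstruct fp16 weights via the affine map w = scale * (q - zero)."""
    q = unpack_4bit(packed).astype(np.float16)
    return (scale * (q - zero)).astype(np.float16)

rng = np.random.default_rng(0)
packed = rng.integers(0, 256, size=(8, 4), dtype=np.uint8)  # 8x8 weights, packed
scale, zero = np.float16(0.1), np.float16(8.0)              # hypothetical quant params

w = dequantize(packed, scale, zero)                 # (8, 8) fp16 weight matrix
x = rng.standard_normal((2, 8)).astype(np.float16)  # a small fp16 activation batch
y = x @ w.T                                         # dense matmul on reconstructed weights
```

In the actual change the dense product would be `torch.matmul` on GPU; the point is that one up-front dequantization trades memory for a much faster inner loop than repeatedly decoding 4-bit values inside the matmul.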