Update README.md

parent 8d198e0171
commit 3be75bb3db
```diff
@@ -3,7 +3,7 @@ Made some adjust for the code in peft and gptq for llama, and make it possible f
 <br>
 ~Still numerically unstable.~ Resolved.
 <br>
-Reconstructing the fp16 matrix from 4bit data and calling torch.matmul drastically increased the inference speed.
+Reconstructing the fp16 matrix from 4bit data and calling torch.matmul largely increased the inference speed.
 <br>
 # Requirements
 gptq-for-llama: https://github.com/qwopqwop200/GPTQ-for-LLaMa<br>
```
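The speedup mentioned in the diff comes from dequantizing the whole 4-bit weight matrix back to fp16 once and doing a single dense matmul, instead of looking up quantized values element by element. Below is a minimal NumPy sketch of that idea; it is an illustration only, not the repository's actual kernel (GPTQ-for-LLaMa uses torch and its own packing layout), and the names `packed`, `scales`, and `zeros` are hypothetical:

```python
import numpy as np

def unpack_4bit_to_fp16(packed, scales, zeros):
    """Reconstruct an fp16 weight matrix from packed 4-bit data.

    packed : (rows, cols // 2) uint8, two 4-bit values per byte, low nibble first
    scales, zeros : quantization parameters (scalar here for simplicity)
    Dequantization rule assumed: w = scale * (q - zero).
    """
    low = packed & 0x0F          # even columns
    high = packed >> 4           # odd columns
    q = np.empty((packed.shape[0], packed.shape[1] * 2), dtype=np.uint8)
    q[:, 0::2] = low
    q[:, 1::2] = high
    return (scales * (q.astype(np.float16) - zeros)).astype(np.float16)

# Round-trip demo: pack random 4-bit codes, reconstruct, then one dense matmul.
rng = np.random.default_rng(0)
q = rng.integers(0, 16, size=(4, 8), dtype=np.uint8)
packed = (q[:, 0::2] | (q[:, 1::2] << 4)).astype(np.uint8)
w = unpack_4bit_to_fp16(packed, np.float16(0.1), np.float16(8))
x = rng.standard_normal((2, 4)).astype(np.float16)
y = x @ w  # one fp16 matmul over the reconstructed matrix
```

The one-time unpacking cost is amortized over the matmul, which runs on optimized dense fp16 kernels; this trades extra memory traffic for much better arithmetic throughput.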