Update README.md
parent 4c18a56fc0
commit f185b90c3e
@@ -33,6 +33,7 @@ It's fast on a 3070 Ti mobile. Uses 5-6 GB of GPU RAM.
 * Added monkey patch for text generation webui for fixing initial eos token issue.
 * Added Flash attention support. (Use --flash-attention)
 * Added Triton backend to support model using groupsize and act-order. (Use --backend=triton)
+* Added g_idx support in cuda backend (need recompile cuda kernel)
 
 # Requirements
 gptq-for-llama <br>