-
I'm quite new to text generation and somewhat struggling with model conversions. I could convert the original Meta Llama models, but how does one convert the Alpaca models?
-
I am also struggling with that. I managed to run Llama-7b in 4-bit mode, but Alpaca-7b is not working. I would be glad to have some instructions.
-
There is no way to convert the 4-bit ggml models without loss because they use a different quantization method. You'll have to merge the LoRA using the alpaca-lora repo, then quantize the merged model to 4-bit using the GPTQ-for-llama repo.
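For the merge step, here's a minimal sketch of folding a LoRA adapter into the base fp16 weights with the peft library (the alpaca-lora repo ships an export script that does essentially this). The hub IDs `decapoda-research/llama-7b-hf` and `tloen/alpaca-lora-7b` and the output directory are just example assumptions; substitute your own paths.

```python
# Sketch: merge a LoRA adapter into base Llama weights, assuming
# recent transformers + peft. Paths below are placeholders.
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

# Load the base fp16 Llama checkpoint (assumed HF hub path)
base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
)

# Attach the Alpaca LoRA adapter (assumed adapter path)
model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b")

# Fold the LoRA deltas into the base weights and drop the adapter wrappers
model = model.merge_and_unload()

# Save a plain fp16 HF checkpoint that GPTQ tooling can consume
model.save_pretrained("./alpaca-7b-merged")
```

The resulting `./alpaca-7b-merged` directory is then an ordinary fp16 checkpoint, which you can quantize to 4-bit with GPTQ-for-llama's `llama.py` script; see that repo's README for the exact flags.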