Did anyone succeed in getting OPT-13B-Erebus to work on a 3080 Ti? #94
-
The memory usage of OPT-13B-Erebus is too high for the model to fit entirely in 12 GB of VRAM. I am not sure if it is possible to mix offloading with 8-bit (I think not), so the best currently available options would be to use DeepSpeed (https://github.com/oobabooga/text-generation-webui/wiki/DeepSpeed) or simply --auto-devices with --gpu-memory 10 or similar.
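For concreteness, the --auto-devices suggestion would look something like this (a sketch; 10 is a GiB cap chosen to leave headroom on a 12 GB card and may need tuning):

```
# Cap GPU allocation at ~10 GiB; layers that don't fit are offloaded
# to system RAM (slower, but lets a 13B model run next to 12 GB of VRAM)
python server.py --auto-devices --gpu-memory 10 --cai-chat
```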
-
For anyone trying this: I got it working on a system with a 3080 Ti and 64 GB of RAM using the following command:
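A minimal sketch of what such a command could look like, assuming the DeepSpeed approach suggested in the first reply (this is a hypothetical reconstruction, not the poster's actual command):

```
# Hypothetical sketch; the poster's actual flags may have differed.
# DeepSpeed offloads model weights to system RAM, which is how a 13B
# model can run with only 12 GB of VRAM and 64 GB of system RAM.
deepspeed --num_gpus=1 server.py --deepspeed --cai-chat
```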
-
I'm using:
```
python server.py --auto-devices --cai-chat --load-in-8bit --listen --listen-port=8888
```
But I'm getting an error when the model loads.
I have 12 GB of VRAM and 64 GB of RAM, and I think that should be enough to load the model correctly with --auto-devices.
Is there something wrong with what I'm doing, or is it impossible to get it to run on this hardware?