CUDA Setup failed despite GPU being available. #1449

Open
Du-Zhai opened this issue Dec 12, 2024 · 3 comments
Labels
Core:setup A bug with respect to a specific setup CUDA Setup waiting for info

Comments

Du-Zhai commented Dec 12, 2024

System Info

Windows 10, Python 3.9.10, torch 2.4.0, CUDA 12.4, RTX 4060

Reproduction

RuntimeError Traceback (most recent call last)
File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\transformers\utils\import_utils.py:1793, in _LazyModule._get_module(self, module_name)
1792 try:
-> 1793 return importlib.import_module("." + module_name, self.__name__)
1794 except Exception as e:

File ~\anaconda3\envs\skingpt4_llama2\lib\importlib\__init__.py:127, in import_module(name, package)
126 level += 1
--> 127 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1030, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1007, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:986, in _find_and_load_unlocked(name, import_)

File <frozen importlib._bootstrap>:680, in _load_unlocked(spec)

File <frozen importlib._bootstrap_external>:850, in exec_module(self, module)

File <frozen importlib._bootstrap>:228, in _call_with_frames_removed(f, *args, **kwds)

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\transformers\trainer.py:226
225 if is_peft_available():
--> 226 from peft import PeftModel
229 if is_accelerate_available():

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\__init__.py:22
20 __version__ = "0.5.0"
---> 22 from .auto import (
23 AutoPeftModel,
24 AutoPeftModelForCausalLM,
25 AutoPeftModelForSequenceClassification,
26 AutoPeftModelForSeq2SeqLM,
27 AutoPeftModelForTokenClassification,
28 AutoPeftModelForQuestionAnswering,
29 AutoPeftModelForFeatureExtraction,
30 )
31 from .mapping import (
32 MODEL_TYPE_TO_PEFT_MODEL_MAPPING,
33 PEFT_TYPE_TO_CONFIG_MAPPING,
(...)
36 inject_adapter_in_model,
37 )

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\auto.py:31
30 from .config import PeftConfig
---> 31 from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING
32 from .peft_model import (
33 PeftModel,
34 PeftModelForCausalLM,
(...)
39 PeftModelForTokenClassification,
40 )

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\mapping.py:23
22 from .config import PeftConfig
---> 23 from .peft_model import (
24 PeftModel,
25 PeftModelForCausalLM,
26 PeftModelForFeatureExtraction,
27 PeftModelForQuestionAnswering,
28 PeftModelForSeq2SeqLM,
29 PeftModelForSequenceClassification,
30 PeftModelForTokenClassification,
31 )
32 from .tuners import (
33 AdaLoraConfig,
34 AdaLoraModel,
(...)
42 PromptTuningConfig,
43 )

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\peft_model.py:38
37 from .config import PeftConfig
---> 38 from .tuners import (
39 AdaLoraModel,
40 AdaptionPromptModel,
41 IA3Model,
42 LoraModel,
43 PrefixEncoder,
44 PromptEmbedding,
45 PromptEncoder,
46 )
47 from .utils import (
48 SAFETENSORS_WEIGHTS_NAME,
49 TRANSFORMERS_MODELS_TO_PREFIX_TUNING_POSTPROCESS_MAPPING,
(...)
62 shift_tokens_right,
63 )

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\tuners\__init__.py:21
20 from .adaption_prompt import AdaptionPromptConfig, AdaptionPromptModel
---> 21 from .lora import LoraConfig, LoraModel
22 from .ia3 import IA3Config, IA3Model

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\peft\tuners\lora.py:45
44 if is_bnb_available():
---> 45 import bitsandbytes as bnb
48 @dataclass
49 class LoraConfig(PeftConfig):

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\bitsandbytes\__init__.py:7
6 from . import cuda_setup, utils
----> 7 from .autograd._functions import (
8 MatmulLtState,
9 bmm_cublas,
10 matmul,
11 matmul_cublas,
12 mm_cublas,
13 )
14 from .cextension import COMPILED_WITH_CUDA

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\bitsandbytes\autograd\__init__.py:1
----> 1 from ._functions import undo_layout, get_inverse_transform_indices

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\bitsandbytes\autograd\_functions.py:9
7 import torch
----> 9 import bitsandbytes.functional as F
12 # math.prod not compatible with python < 3.8

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\bitsandbytes\functional.py:17
15 from torch import Tensor
---> 17 from .cextension import COMPILED_WITH_CUDA, lib
20 # math.prod not compatible with python < 3.8

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\bitsandbytes\cextension.py:22
21 CUDASetup.get_instance().print_log_stack()
---> 22 raise RuntimeError('''
23 CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
24 If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
25 https://github.com/TimDettmers/bitsandbytes/issues''')
26 lib.cadam32bit_g32

RuntimeError:
CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
Cell In[2], line 7
5 from transformers.image_utils import load_image
6 DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
----> 7 from transformers import DataCollatorForSeq2Seq,TrainingArguments,Trainer

File <frozen importlib._bootstrap>:1055, in _handle_fromlist(module, fromlist, import_, recursive)

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\transformers\utils\import_utils.py:1781, in _LazyModule.__getattr__(self, name)
1779 value = Placeholder
1780 elif name in self._class_to_module.keys():
-> 1781 module = self._get_module(self._class_to_module[name])
1782 value = getattr(module, name)
1783 elif name in self._modules:

File ~\anaconda3\envs\skingpt4_llama2\lib\site-packages\transformers\utils\import_utils.py:1795, in _LazyModule._get_module(self, module_name)
1793 return importlib.import_module("." + module_name, self.__name__)
1794 except Exception as e:
-> 1795 raise RuntimeError(
1796 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
1798 ) from e

RuntimeError: Failed to import transformers.trainer because of the following error (look up to see its traceback):

    CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
    If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
    https://github.com/TimDettmers/bitsandbytes/issues

Expected behavior

I just want the import to run without errors.
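
The traceback above shows a chain: transformers.trainer imports peft, which imports bitsandbytes, and the actual failure is raised inside bitsandbytes. One way to confirm this outside the notebook is to import each module on its own, innermost first. The snippet below is only a minimal sketch of that idea; the helper name first_failing_import and the IMPORT_CHAIN list are made up for illustration and are not part of any of these libraries.

```python
import importlib

# Innermost dependency first, following the traceback:
# transformers.trainer imports peft, which imports bitsandbytes.
IMPORT_CHAIN = ["bitsandbytes", "peft", "transformers.trainer"]


def first_failing_import(module_names):
    """Return (name, exception) for the first module that fails to import, else None.

    Hypothetical helper for illustration only.
    """
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # Only Python-level errors land here; a hard crash such as a
            # segmentation fault terminates the interpreter instead.
            return name, exc
    return None


if __name__ == "__main__":
    failure = first_failing_import(IMPORT_CHAIN)
    if failure is None:
        print("All modules imported successfully.")
    else:
        name, exc = failure
        print(f"First failing import: {name} ({type(exc).__name__}: {exc})")
```

If bitsandbytes is already the first module to fail, the problem is most likely in the bitsandbytes installation itself rather than in transformers or peft.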

@matthewdouglas
Member

Hi @Du-Zhai,
What version of bitsandbytes are you using? I would recommend upgrading to the newest if possible. If that does not solve your issue, please share the output of python -m bitsandbytes.
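
Since importing bitsandbytes is the step that fails here, the installed version can be read from package metadata without importing the package. The following is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the function name installed_versions and the list of packages are assumptions chosen for this thread, not something taken from the bitsandbytes docs.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper; it reads metadata only and never imports the package.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages that appear in the traceback, plus torch from the system info.
    wanted = ["bitsandbytes", "peft", "transformers", "torch"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Running pip show bitsandbytes in the same environment should report the same version.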

matthewdouglas added waiting for info Core:setup A bug with respect to a specific setup CUDA Setup labels Dec 13, 2024
@daihuaiii

I have a similar problem on Linux with:

bitsandbytes==0.44.1
transformers==4.45.2

The output of python -m bitsandbytes is:

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
CUDA specs: CUDASpecs(highest_compute_capability=(8, 6), cuda_version_string='121', cuda_version_tuple=(12, 1))
PyTorch settings found: CUDA_VERSION=121, Highest Compute Capability: (8, 6).
To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx
CUDA SETUP: WARNING! CUDA runtime files not found in any environmental path.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and CUDA is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.
For source installations, compile the binaries with cmake -DCOMPUTE_BACKEND=cuda -S ..
See the documentation for more details if needed.
Trying a simple check anyway, but this will likely fail...
Segmentation fault (core dumped)
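
The output above warns that CUDA runtime files were not found in any environment path and suggests LD_LIBRARY_PATH. A quick way to see what is actually visible there is to scan each directory in that variable for the CUDA runtime library. This is only a rough sketch: the function name find_cuda_runtime and the libcudart.so* file pattern are my assumptions for a typical Linux setup, and it does not reproduce the search that bitsandbytes performs internally.

```python
import os
from pathlib import Path


def find_cuda_runtime(search_path, pattern="libcudart.so*"):
    """List files matching `pattern` in each directory of a path-style string.

    Hypothetical helper; `search_path` uses the platform path separator
    (':' on Linux), the same format as LD_LIBRARY_PATH.
    """
    hits = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a leading ':'
        directory = Path(entry)
        if directory.is_dir():
            hits.extend(sorted(directory.glob(pattern)))
    return hits


if __name__ == "__main__":
    found = find_cuda_runtime(os.environ.get("LD_LIBRARY_PATH", ""))
    if found:
        for path in found:
            print(path)
    else:
        print("No libcudart files found in LD_LIBRARY_PATH.")
```

An empty result here would be consistent with the warning, but it does not by itself explain the segmentation fault.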

@daihuaiii

When I use LLaMA-Factory to fine-tune Qwen2.5 7B with PEFT under a QLoRA setting, the same issue occurs, also ending in Segmentation fault (core dumped).
