DLLs / wheel data not installed on Windows sometimes #10605

Closed
polm opened this issue Oct 22, 2021 · 3 comments
Labels
resolution: not a bug Determined as not a bug in pip type: support User Support

Comments

polm commented Oct 22, 2021

Description

I have a package called fugashi that I distribute wheels for. For the Windows wheels there is a DLL included, but sometimes users report that their pip install succeeded but they do not have the DLL. I have not been able to figure out why this happens and I don't even know what to ask the users to check.

In an unzipped wheel the DLL file is at this path:

fugashi-1.1.1.data/data/lib/site-packages/fugashi/libmecab.dll

It is the only data file in the wheel.

Users get this error when trying to use the package:

ImportError: DLL load failed while importing fugashi: the specified module could not be found

I have found other issues with this error, but usually the problem is that the PATH is wrong and the DLL is installed somewhere but just not visible to Python. In the cases I'm dealing with the DLL seems to not be present on the user's system, or at the very least not in the expected directory, and I haven't been able to find other reports of that.

There is more documentation on my experience with this issue here.

If there is something I can ask users to better diagnose the error please let me know.
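One thing affected users could be asked to run is a small script that reports where the package lives and whether the DLL exists anywhere under the usual install roots. The following is only a sketch: the helper names `find_named_files` and `report` are made up for illustration, and searching the user base and `sys.prefix` is an assumption about where the file might have ended up.

```python
import importlib.util
import os
import site
import sys


def find_named_files(roots, filename):
    """Walk each existing root and return every path whose basename matches filename."""
    wanted = filename.lower()
    hits = []
    for root in roots:
        if not root or not os.path.isdir(root):
            continue
        for dirpath, _dirs, files in os.walk(root):
            hits.extend(os.path.join(dirpath, f) for f in files if f.lower() == wanted)
    return hits


def report(package="fugashi", dll="libmecab.dll"):
    """Print the package location and every copy of the DLL found under the install roots."""
    # find_spec locates a top-level package without running its __init__.py,
    # so this works even when "import fugashi" itself fails.
    spec = importlib.util.find_spec(package)
    locations = list(spec.submodule_search_locations or []) if spec else []
    print("package dir:", locations or "not found")
    print("user base:  ", site.getuserbase())
    print("sys.prefix: ", sys.prefix)
    for hit in find_named_files([site.getuserbase(), sys.prefix], dll):
        print("found:", hit)
```

Calling `report()` in the failing environment and comparing the "package dir" line with the "found" lines shows whether the DLL was installed somewhere other than next to the package.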

Expected behavior

I expect the DLL file to be installed in the same directory as the Python source and other files in the wheel. That is in fact what happens when I install the wheel on Windows on my machine.

pip version

???

Python version

3.7, others

OS

Windows

How to Reproduce

  1. pip install fugashi
  2. import fugashi
  3. error occurs because no DLL

But this doesn't happen on my system, and I don't know what configuration causes it.

Output

Here is example output from a recent user ([issue (in Japanese)](https://github.com/polm/fugashi/issues/42)). Note that most of this is irrelevant, though it's included for completeness. The final error is just the localized version of the error above about "the specified module could not be found".


ImportError                               Traceback (most recent call last)
<ipython-input-2-5dae7eb78200> in <module>
----> 1 model = SentenceBertJapanese("sonoisa/sentence-bert-base-ja-mean-tokens")

<ipython-input-1-1cb8424fffdf> in __init__(self, model_name_or_path, device)
      4 class SentenceBertJapanese:
      5     def __init__(self, model_name_or_path, device=None):
----> 6         self.tokenizer = BertJapaneseTokenizer.from_pretrained(model_name_or_path)
      7         self.model = BertModel.from_pretrained(model_name_or_path)
      8         self.model.eval()

~\AppData\Roaming\Python\Python37\site-packages\transformers\tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
   1718 
   1719         return cls._from_pretrained(
-> 1720             resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
   1721         )
   1722 

~\AppData\Roaming\Python\Python37\site-packages\transformers\tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
   1789         # Instantiate tokenizer.
   1790         try:
-> 1791             tokenizer = cls(*init_inputs, **init_kwargs)
   1792         except OSError:
   1793             raise OSError(

~\AppData\Roaming\Python\Python37\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py in __init__(self, vocab_file, do_lower_case, do_word_tokenize, do_subword_tokenize, word_tokenizer_type, subword_tokenizer_type, never_split, unk_token, sep_token, pad_token, cls_token, mask_token, mecab_kwargs, **kwargs)
    150             elif word_tokenizer_type == "mecab":
    151                 self.word_tokenizer = MecabTokenizer(
--> 152                     do_lower_case=do_lower_case, never_split=never_split, **(mecab_kwargs or {})
    153                 )
    154             else:

~\AppData\Roaming\Python\Python37\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py in __init__(self, do_lower_case, never_split, normalize_text, mecab_dic, mecab_option)
    229 
    230         try:
--> 231             import fugashi
    232         except ModuleNotFoundError as error:
    233             raise error.__class__(

~\AppData\Roaming\Python\Python37\site-packages\fugashi\__init__.py in <module>
----> 1 from .fugashi import *
      2 

ImportError: DLL load failed: 指定されたモジュールが見つかりません。


uranusjr (Member) commented Oct 22, 2021

You should not use the data key to install additional DLLs into site-packages, because the data directory does not always point to the environment's root. Specifically, the nt_user scheme (likely the one in use, judging from the AppData\Roaming part of the traceback) puts the data directory in userbase, so

fugashi-1.1.1.data/data/lib/site-packages/fugashi/libmecab.dll

becomes

~\AppData\Roaming\Python\lib\site-packages\fugashi\libmecab.dll

which is not what you want because your package is expecting

~\AppData\Roaming\Python\Python37\site-packages\fugashi\libmecab.dll

You should use the platlib key instead. I am not sure how you can do it with setuptools; please ask the setuptools maintainers.
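To see the mismatch concretely, the standard library's `sysconfig` module exposes the path templates for each install scheme. The snippet below is a rough illustration, not pip's actual install logic: `data_destination` is a hypothetical helper and the `userbase` value is a made-up example path.

```python
import sysconfig
from pathlib import PureWindowsPath

# Unexpanded path templates for the Windows per-user scheme.
templates = sysconfig.get_paths("nt_user", expand=False)
print("data    ->", templates["data"])
print("platlib ->", templates["platlib"])


def data_destination(userbase, relative_path):
    """Where a file under <name>.data/data/ lands if 'data' maps straight to userbase."""
    return PureWindowsPath(userbase) / relative_path


# Hypothetical user base, for illustration only.
userbase = r"C:\Users\example\AppData\Roaming\Python"
dest = data_destination(userbase, "lib/site-packages/fugashi/libmecab.dll")
print(dest)
```

In this scheme the `data` template is the user base itself, while `platlib` adds a versioned `Python…` directory before `site-packages`, so a path hard-coded relative to `data` ends up outside the directory the package is imported from.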

polm (Author) commented Oct 27, 2021

Thank you for the answer. I've followed up with the Setuptools maintainers.

I have one question - what determines whether a Windows user has userbase or not? I see the code you linked to checks env vars, but is there some documentation about what kinds of Python installs use that and which don't?

uranusjr (Member) commented Oct 27, 2021

A Windows user almost always has a userbase (unless they explicitly disabled it via means like PYTHONNOUSERSITE, which is extremely uncommon). The difference is not whether a userbase is present, but whether the package is installed to the userbase (instead of the global base). pip installs packages to the global base by default, but falls back to userbase if the global base is not writable (e.g. permission issues). The user can also explicitly choose to install any package to userbase with pip install --user.
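A quick way to tell which case applies on a given machine is to compare a module's file path against the user site-packages directory reported by the `site` module. This is a minimal sketch: `is_under` is a hypothetical helper, and the paths in the usage note are invented examples.

```python
import site
from pathlib import PureWindowsPath


def is_under(path, parent):
    """True if path lies inside parent, with both compared as Windows paths."""
    try:
        PureWindowsPath(path).relative_to(PureWindowsPath(parent))
        return True
    except ValueError:
        return False


# What this interpreter considers its per-user locations.
print("user base:        ", site.getuserbase())
print("user site:        ", site.getusersitepackages())
print("user site enabled:", site.ENABLE_USER_SITE)
```

For example, `is_under(r"C:\Users\example\AppData\Roaming\Python\Python37\site-packages\fugashi\__init__.py", r"C:\Users\example\AppData\Roaming\Python\Python37\site-packages")` is true, which would indicate a userbase install. The same check against a path under `C:\Python37\Lib\site-packages` is false.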

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 26, 2021