Cannot install async_io op even if it's compatible flag is displaying OK by ds_report cmd! #6920

Open
LZhengguo opened this issue Dec 31, 2024 · 1 comment
Labels: bug (Something isn't working), build (Improvements to the build and testing systems.)


LZhengguo commented Dec 31, 2024

When I run DS_BUILD_AIO=1 CFLAGS="-I$CONDA_PREFIX/include/ -I/usr/include/" LDFLAGS="-L$CONDA_PREFIX/lib/ -L/usr/lib/x86_64-linux-gnu/" pip install -e . to install the async_io op, I get a misleading success message.
It does print Successfully installed deepspeed, but when I run ds_report I only get this: Image
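Before digging into DeepSpeed itself, it may help to rule out the simplest cause: the directories passed through CFLAGS and LDFLAGS not containing the libaio files the build needs. The sketch below is not part of DeepSpeed, and the helper names flag_dirs and find_libaio are made up for illustration. It assumes the header is named libaio.h and that linking with -laio needs an unversioned libaio.so or a static libaio.a, so a directory that only holds a versioned file such as libaio.so.1 is not reported as usable.

```python
import os
import shlex


def flag_dirs(flags, prefix):
    """Return the directories named by -I / -L style options in a flag string."""
    dirs = []
    tokens = iter(shlex.split(flags))
    for tok in tokens:
        if tok == prefix:
            # Form "-I dir": the directory is the next token.
            nxt = next(tokens, None)
            if nxt is not None:
                dirs.append(nxt)
        elif tok.startswith(prefix):
            # Form "-Idir": the directory is glued to the option.
            dirs.append(tok[len(prefix):])
    return dirs


def find_libaio(cflags, ldflags):
    """Report which include/library directories contain libaio files."""
    headers = [
        d for d in flag_dirs(cflags, "-I")
        if os.path.isfile(os.path.join(d, "libaio.h"))
    ]
    libs = [
        d for d in flag_dirs(ldflags, "-L")
        if any(os.path.isfile(os.path.join(d, name))
               for name in ("libaio.so", "libaio.a"))
    ]
    return {"headers": headers, "libs": libs}


if __name__ == "__main__":
    # Reads the same variables that were set on the pip command line.
    print(find_libaio(os.environ.get("CFLAGS", ""),
                      os.environ.get("LDFLAGS", "")))
```

If either list comes back empty, the link step has nothing to resolve -laio against in the given directories, which would be one possible explanation for a link error.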

I then printed the stderr output and found this: Image

To figure out what leads to this, I read the source code, starting with "setup.py",
and found the problem at "setup.py" line 182:
for op_name, builder in ALL_OPS.items(): op_compatible = builder.is_compatible()
When op_name is "async_io", builder.is_compatible() returns False. I opened "DeepSpeed/deepspeed/ops/op_builder/async_io.py" and found def is_compatible(self, verbose=False) at line 93. Its result depends on line 99: aio_compatible = self.has_function('io_submit', ('aio', )).
I then looked at def has_function() in "DeepSpeed/deepspeed/ops/op_builder/builder.py" (line 308) and confirmed that it raises a link error at line 362:
compiler.link_executable(objs, os.path.join(tempdir, 'a.out'), extra_preargs=self.strip_empty_entries(ldflags), libraries=libraries, library_dirs=library_dirs)
The compiler here is a "distutils.unixccompiler.UnixCCompiler".
I don't know why this happens. To work around it, I had to change class AsyncIOBuilder (in "DeepSpeed/deepspeed/ops/op_builder/async_io.py") as shown in the following picture: Image
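The check described above boils down to compiling a tiny C program that references io_submit and linking it against libaio. A rough standalone reproduction, outside DeepSpeed, can make the linker's own error message visible. This is only a sketch: it calls the system compiler through subprocess rather than distutils, the names PROBE_SOURCE, link_command and probe_io_submit are hypothetical, and the real has_function may differ in detail.

```python
import os
import shutil
import subprocess
import tempfile

# The program declares io_submit itself, so no header is required.
# It only has to link; it is never meant to be executed.
PROBE_SOURCE = """\
char io_submit(void);
int main(void) { return io_submit(); }
"""


def link_command(compiler, src, out, library_dirs=(), libraries=("aio",)):
    """Build the compile-and-link command line for the probe."""
    cmd = [compiler, src, "-o", out]
    cmd += ["-L" + d for d in library_dirs]
    cmd += ["-l" + lib for lib in libraries]
    return cmd


def probe_io_submit(compiler="cc", library_dirs=()):
    """Return (linked_ok, compiler_stderr) for the io_submit probe."""
    if shutil.which(compiler) is None:
        return False, "compiler %r not found on PATH" % compiler
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        out = os.path.join(tmp, "a.out")
        with open(src, "w") as f:
            f.write(PROBE_SOURCE)
        result = subprocess.run(
            link_command(compiler, src, out, library_dirs),
            capture_output=True,
            text=True,
        )
        return result.returncode == 0, result.stderr


if __name__ == "__main__":
    # Directory taken from the LDFLAGS used in the install command.
    ok, stderr = probe_io_submit(library_dirs=["/usr/lib/x86_64-linux-gnu/"])
    print("links against libaio:", ok)
    if stderr:
        print(stderr)
```

If this probe fails in the same environment, the stderr text (for example a "cannot find -laio" message or an undefined reference) would narrow down whether the library is missing from the search path or found but unusable.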

After that I installed it again and got the correct result. Image

I hope you can figure out what caused the link error. I also don't know whether my change will leave aio disabled when I use offload.

@LZhengguo changed the title from "{{ env.GITHUB_WORKFLOW }} Cannot install async_io op even if it's compatible flag is displaying OK by ds_report cmd!" to "Cannot install async_io op even if it's compatible flag is displaying OK by ds_report cmd!" on Dec 31, 2024
loadams (Contributor) commented Jan 2, 2025

Hi @LZhengguo, can you please share your pip list output, the version of libaio that you have installed, and your OS?
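One possible way to gather the requested details in a single step is a short script using only the Python standard library. This is a sketch, not an official DeepSpeed tool, and the function name collect_environment is made up. Note that ctypes.util.find_library only returns the shared library name (typically with its major version), not the distribution package version, so the package manager may still be needed for the exact libaio version.

```python
import ctypes.util
import platform
from importlib import metadata


def collect_environment():
    """Gather OS, Python, libaio and installed-package information."""
    packages = sorted(
        "%s==%s" % (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    )
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        # None when no libaio shared library can be located.
        "libaio": ctypes.util.find_library("aio"),
        "packages": packages,
    }


if __name__ == "__main__":
    info = collect_environment()
    print("OS:     ", info["os"])
    print("Python: ", info["python"])
    print("libaio: ", info["libaio"])
    print("\n".join(info["packages"]))
```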

@loadams added the bug and build labels and removed the ci-failure label on Jan 2, 2025
@loadams self-assigned this on Jan 4, 2025

2 participants