v3.0.0b0 #78
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR ( Here's what I've got... For recipe:
@conda-forge-admin, please rerender
@conda-forge-admin, please rerender
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (
Hi! This is the friendly automated conda-forge-webservice. I tried to rerender for you but ran into some issues. Please check the output logs of the latest webservices GitHub actions workflow run for errors. You can also ping conda-forge/core for further assistance or you can try rerendering locally. This message was generated by GitHub actions workflow run https://github.com/conda-forge/deepmd-kit-feedstock/actions/runs/9783853775.
…nda-forge-pinning 2024.07.03.17.45.27
The second one is a segfault with openmpi + absl:
2024-07-03T20:00:29.2814841Z + export OMPI_MCA_plm=isolated OMPI_MCA_btl_vader_single_copy_mechanism=none OMPI_MCA_rmaps_base_oversubscribe=yes OMPI_MCA_plm_ssh_agent=false
2024-07-03T20:00:29.2815656Z + OMPI_MCA_plm=isolated
2024-07-03T20:00:29.2815930Z + OMPI_MCA_btl_vader_single_copy_mechanism=none
2024-07-03T20:00:29.2816146Z + OMPI_MCA_rmaps_base_oversubscribe=yes
2024-07-03T20:00:29.2816458Z + OMPI_MCA_plm_ssh_agent=false
2024-07-03T20:00:29.2816990Z + mpiexec -n 1 lmp_mpi -in in.lammps
2024-07-03T20:00:29.4367999Z [1f1c0d740e3f:04385] mca_base_component_repository_open: unable to open mca_btl_openib: librdmacm.so.1: cannot open shared object file: No such file or directory (ignored)
2024-07-03T20:00:29.4559532Z LAMMPS (2 Aug 2023)
2024-07-03T20:00:29.4568122Z OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
2024-07-03T20:00:29.4575016Z using 1 OpenMP thread(s) per MPI task
2024-07-03T20:00:30.0446896Z [1f1c0d740e3f:04385] *** Process received signal ***
2024-07-03T20:00:30.0448123Z [1f1c0d740e3f:04385] Signal: Segmentation fault (11)
2024-07-03T20:00:30.0453336Z [1f1c0d740e3f:04385] Signal code: Address not mapped (1)
2024-07-03T20:00:30.0454407Z [1f1c0d740e3f:04385] Failing at address: 0x8
2024-07-03T20:00:30.0459737Z [1f1c0d740e3f:04385] [ 0] /lib64/libc.so.6(+0x36400)[0x7f43ab885400]
2024-07-03T20:00:30.0462368Z [1f1c0d740e3f:04385] [ 1] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/./python3.9/site-packages/tensorflow/../../../libabsl_flags_reflection.so.2401.0.0(_ZN4absl12lts_2024011614flags_internal12FlagRegistry12RegisterFlagERNS0_15CommandLineFlagEPKc+0x99)[0x7f43a1318e09]
2024-07-03T20:00:30.0464226Z [1f1c0d740e3f:04385] [ 2] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/./python3.9/site-packages/tensorflow/../../../libabsl_flags_reflection.so.2401.0.0(_ZN4absl12lts_2024011614flags_internal23RegisterCommandLineFlagERNS0_15CommandLineFlagEPKc+0x21)[0x7f43a131a5c1]
2024-07-03T20:00:30.0465530Z [1f1c0d740e3f:04385] [ 3] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/./python3.9/site-packages/tensorflow/../../../libabsl_log_flags.so.2401.0.0(+0x3079)[0x7f43a1338079]
2024-07-03T20:00:30.0466184Z [1f1c0d740e3f:04385] [ 4] /lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f43b13a99c3]
2024-07-03T20:00:30.0466604Z [1f1c0d740e3f:04385] [ 5] /lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f43b13ae59e]
2024-07-03T20:00:30.0467010Z [1f1c0d740e3f:04385] [ 6] /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f43b13a97d4]
2024-07-03T20:00:30.0467376Z [1f1c0d740e3f:04385] [ 7] /lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f43b13adb8b]
2024-07-03T20:00:30.0467893Z [1f1c0d740e3f:04385] [ 8] /lib64/libdl.so.2(+0xfab)[0x7f43a99fcfab]
2024-07-03T20:00:30.0468252Z [1f1c0d740e3f:04385] [ 9] /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f43b13a97d4]
2024-07-03T20:00:30.0468559Z [1f1c0d740e3f:04385] [10] /lib64/libdl.so.2(+0x15ad)[0x7f43a99fd5ad]
2024-07-03T20:00:30.0468815Z [1f1c0d740e3f:04385] [11] /lib64/libdl.so.2(dlopen+0x31)[0x7f43a99fd041]
2024-07-03T20:00:30.0469716Z [1f1c0d740e3f:04385] [12] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/liblammps.so.0(_ZN9LAMMPS_NS11plugin_loadEPKcPNS_6LAMMPSE+0xa6)[0x7f43acd40df6]
2024-07-03T20:00:30.0470980Z [1f1c0d740e3f:04385] [13] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/liblammps.so.0(_ZN9LAMMPS_NS16plugin_auto_loadEPNS_6LAMMPSE+0x1a6)[0x7f43acd41446]
2024-07-03T20:00:30.0472135Z [1f1c0d740e3f:04385] [14] /home/conda/feedstock_root/build_artifacts/deepmd-kit_1720035897476/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/bin/../lib/liblammps.so.0(_ZN9LAMMPS_NS6LAMMPSC2EiPPcP19ompi_communicator_t+0xefd)[0x7f43ac5fedfd]
2024-07-03T20:00:30.0472657Z [1f1c0d740e3f:04385] [15] lmp_mpi(+0x2217)[0x556b1653b217]
2024-07-03T20:00:30.0472911Z [1f1c0d740e3f:04385] [16] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f43ab871555]
2024-07-03T20:00:30.0473171Z [1f1c0d740e3f:04385] [17] lmp_mpi(+0x2298)[0x556b1653b298]
2024-07-03T20:00:30.0473388Z [1f1c0d740e3f:04385] *** End of error message ***
2024-07-03T20:00:30.3112890Z --------------------------------------------------------------------------
2024-07-03T20:00:30.3113960Z Primary job terminated normally, but 1 process returned
2024-07-03T20:00:30.3114930Z a non-zero exit code. Per user-direction, the job has been aborted.
2024-07-03T20:00:30.3115645Z --------------------------------------------------------------------------
2024-07-03T20:00:32.5233816Z --------------------------------------------------------------------------
2024-07-03T20:00:32.5235155Z mpiexec noticed that process rank 0 with PID 0 on node 1f1c0d740e3f exited on signal 11 (Segmentation fault).
2024-07-03T20:00:32.5235753Z --------------------------------------------------------------------------
2024-07-03T20:00:33.7626910Z WARNING: Tests failed for deepmd-kit-3.0.0b0-cpu_py39hfac8ecd_mpi_openmpi_0.conda - moving package to /home/conda/feedstock_root/build_artifacts/broken
2024-07-03T20:00:33.8070705Z TESTS FAILED: deepmd-kit-3.0.0b0-cpu_py39hfac8ecd_mpi_openmpi_0.conda
It is unclear how to fix this, or whether we should just skip the test.
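To make the log above easier to scan, here is a minimal Python sketch that pulls the backtrace frames out of an Open MPI crash report and lists the library and symbol for each frame. The names `FRAME_RE` and `parse_frames` are made up for this illustration, and the regular expression only assumes the frame layout visible in this log (`[ N] module(symbol+0xoffset)[0xaddress]`); other Open MPI versions may print frames differently.

```python
import re

# Assumed frame layout, taken from the log in this thread:
#   [host:pid] [ 1] /path/to/lib.so(symbol+0x99)[0x7f43a1318e09]
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+"                 # frame number, e.g. "[ 1]"
    r"(?P<module>[^\s(]+)"                      # binary or shared object path
    r"(?:\((?P<symbol>[^+)]*)"                  # optional "(symbol"
    r"(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"    # optional "+0xoffset)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"          # absolute address
)


def parse_frames(log_text):
    """Return one dict per backtrace frame found in log_text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a frame line (timestamps, banners, etc.)
        frames.append(
            {
                "index": int(match.group("index")),
                # Keep only the file name; the build prefix is very long.
                "module": match.group("module").rsplit("/", 1)[-1],
                # Frames without a symbol look like "(+0x36400)".
                "symbol": match.group("symbol") or None,
                "offset": match.group("offset"),
            }
        )
    return frames


# Two frames copied verbatim from the CI log above.
SAMPLE = """\
2024-07-03T20:00:30.0459737Z [1f1c0d740e3f:04385] [ 0] /lib64/libc.so.6(+0x36400)[0x7f43ab885400]
2024-07-03T20:00:30.0468815Z [1f1c0d740e3f:04385] [11] /lib64/libdl.so.2(dlopen+0x31)[0x7f43a99fd041]
"""

for frame in parse_frames(SAMPLE):
    print(frame["index"], frame["module"], frame["symbol"], frame["offset"])
```

Feeding the full log to `parse_frames` should show the same chain as the raw trace: the LAMMPS plugin loader calls `dlopen`, and the fault is in `libabsl_flags_reflection.so.2401.0.0`. The mangled C++ names are left as they are; piping them through a demangler such as `c++filt` is a separate step.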
It seems to me that mpich also has the segfault.
Reproduced on the local machine.
(gdb) where
#0 0x0000155541a60e09 in absl::lts_20240116::flags_internal::FlagRegistry::RegisterFlag(absl::lts_20240116::CommandLineFlag&, char const*) ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/./python3.11/site-packages/tensorflow/../../../libabsl_flags_reflection.so.2401.0.0
#1 0x0000155541a625c1 in absl::lts_20240116::flags_internal::RegisterCommandLineFlag(absl::lts_20240116::CommandLineFlag&, char const*) ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/./python3.11/site-packages/tensorflow/../../../libabsl_flags_reflection.so.2401.0.0
#2 0x0000155541a80079 in _GLOBAL__sub_I_flags.cc ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/./python3.11/site-packages/tensorflow/../../../libabsl_log_flags.so.2401.0.0
#3 0x0000155555525237 in call_init (env=0x55555567dbf0, argv=0x7fffffffad58, argc=3, l=<optimized out>) at dl-init.c:74
#4 call_init (l=<optimized out>, argc=3, argv=0x7fffffffad58, env=0x55555567dbf0) at dl-init.c:26
#5 0x000015555552532d in _dl_init (main_map=0x555555780eb0, argc=3, argv=0x7fffffffad58, env=0x55555567dbf0) at dl-init.c:121
#6 0x00001555555215c2 in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x15555552bf50 <call_dl_init>,
args=args@entry=0x7fffffffa290) at dl-catch.c:211
#7 0x000015555552beec in dl_open_worker (a=a@entry=0x7fffffffa440) at dl-open.c:827
#8 0x0000155555521523 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffa420,
operate=operate@entry=0x15555552be50 <dl_open_worker>, args=args@entry=0x7fffffffa440) at dl-catch.c:237
#9 0x000015555552c2e4 in _dl_open (file=0x555555780cc0 "/home/jz748/anaconda3/envs/test-deepmd-build/lib/deepmd_lmp/dpplugin.so",
mode=<optimized out>, caller_dlopen=0x155550f40916 <LAMMPS_NS::plugin_load(char const*, LAMMPS_NS::LAMMPS*)+166>, nsid=<optimized out>,
argc=3, argv=0x7fffffffad58, env=0x55555567dbf0) at dl-open.c:903
#10 0x000015554fcc7714 in dlopen_doit () from /lib64/libc.so.6
#11 0x0000155555521523 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffa630, operate=0x15554fcc76b0 <dlopen_doit>,
args=0x7fffffffa6f0) at dl-catch.c:237
#12 0x0000155555521679 in _dl_catch_error (objname=0x7fffffffa698, errstring=0x7fffffffa6a0, mallocedp=0x7fffffffa697, operate=<optimized out>,
args=<optimized out>) at dl-catch.c:256
#13 0x000015554fcc71f3 in _dlerror_run () from /lib64/libc.so.6
#14 0x000015554fcc77cf in dlopen@GLIBC_2.2.5 () from /lib64/libc.so.6
#15 0x0000155550f40916 in LAMMPS_NS::plugin_load(char const*, LAMMPS_NS::LAMMPS*) ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/liblammps.so.0
#16 0x0000155550f40f66 in LAMMPS_NS::plugin_auto_load(LAMMPS_NS::LAMMPS*) ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/liblammps.so.0
#17 0x00001555507fe6ed in LAMMPS_NS::LAMMPS::LAMMPS(int, char**, ompi_communicator_t*) ()
from /home/jz748/anaconda3/envs/test-deepmd-build/bin/../lib/liblammps.so.0
#18 0x0000555555556217 in main ()
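The backtrace shows the fault happening inside `dlopen` of `dpplugin.so`, while a static initializer in `libabsl_log_flags.so.2401.0.0` runs. A rough way to probe a load like that without starting LAMMPS is to open the library in a child Python process and see whether the child is killed by a signal. This is only a sketch: `try_dlopen` is a name invented here, and loading the plugin from Python rather than from `lmp_mpi` is an assumption that may not reproduce the same crash, since a different set of libraries is already loaded in the process.

```python
import subprocess
import sys


def try_dlopen(path, timeout=60):
    """Load a shared library in a child interpreter and classify the result.

    Running the load in a subprocess keeps a segfault in a static
    initializer from taking down the calling process.
    """
    code = "import ctypes, sys; ctypes.CDLL(sys.argv[1])"
    proc = subprocess.run(
        [sys.executable, "-c", code, path],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "loaded"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        return f"killed by signal {-proc.returncode}"
    # Positive exit code: the interpreter raised, e.g. OSError for a bad path.
    return "load error"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python try_dlopen.py $CONDA_PREFIX/lib/deepmd_lmp/dpplugin.so
    print(try_dlopen(sys.argv[1]))
```

A result of `killed by signal 11` would correspond to the segmentation fault seen here; `load error` usually means the path is wrong or a dependency could not be resolved.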
As a workaround, I pinned tensorflow to 2.15 and pytorch to 2.1, and submitted conda-forge/abseil-cpp-feedstock#79.
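For anyone checking a local environment against this workaround, the following is a small sketch that compares installed versions with the two pins mentioned above. The helper names `matches_pin` and `check_pins` are invented for this example, and the distribution names in `PINS` are assumptions (in particular, the pytorch package is assumed to register its Python metadata under the name `torch`).

```python
from importlib import metadata

# Pins from the workaround above. The distribution names are assumptions:
# pytorch is assumed to expose its metadata as "torch".
PINS = {"tensorflow": "2.15", "torch": "2.1"}


def matches_pin(version, pin):
    """True if version starts with the pinned components, e.g. 2.15.0 vs 2.15.

    Components are compared one by one so that "2.10.0" does not match "2.1".
    """
    pinned = pin.split(".")
    return version.split(".")[: len(pinned)] == pinned


def check_pins(pins):
    """Map each distribution name to (installed version or None, pin satisfied)."""
    report = {}
    for name, pin in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, matches_pin(installed, pin))
    return report


for name, (installed, ok) in check_pins(PINS).items():
    print(name, installed, "ok" if ok else "does not match pin")
```

The component-wise comparison is deliberate: a plain string prefix check would wrongly accept `2.10` for a `2.1` pin.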
…repodata-patches-feedstock#787 Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
@conda-forge-admin please rerender
…nda-forge-pinning 2024.07.04.09.14.11
@conda-forge-admin please rerender
…nda-forge-pinning 2024.07.04.09.14.11
@conda-forge-admin please rerender
…nda-forge-pinning 2024.07.04.09.14.11
Checklist
Reset the build number to 0 (if the version changed)
Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)