go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:25:29: undefined: Return #49

Open
djsxianglei opened this issue Aug 25, 2022 · 18 comments

Comments

@djsxianglei

Running go get github.com/NVIDIA/go-nvml/pkg/nvml fails with the following errors:
D:\www\go-nvml>go get github.com/NVIDIA/go-nvml/pkg/nvml

# github.com/NVIDIA/go-nvml/pkg/nvml

C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:25:29: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:32:49: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:39:54: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:46:50: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:53:58: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:60:44: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:66:41: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:71:37: undefined: BrandType
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:71:48: undefined: Return
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\types_gen.go:9:10: undefined: _Ctype_struct_nvmlDevice_st
C:\Users\djs\go\pkg\mod\github.com\!n!v!i!d!i!a\go-nvml@v0.11.6-0\pkg\nvml\device.go:71:48: too many errors

@elezar
Member

elezar commented Aug 25, 2022

@djsxianglei it seems as if you are running this on a Windows machine. As far as I am aware, there is platform-specific code which has not yet been updated to support Windows. We do have an issue open to track this (see #1), and any contributions would be welcome.

@djsxianglei
Author

@elezar thanks. I tried it in a Linux environment.

@elezar
Member

elezar commented Nov 2, 2022

@djsxianglei did switching to Linux solve your issues?

@shaktsin

shaktsin commented Jan 11, 2023

I am using a Linux container and it fails with the following error:

/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-0/pkg/dl/dl.go:34:18: could not determine kind of name for C.RTLD_DEEPBIND

@klueska
Contributor

klueska commented Jan 12, 2023

RTLD_DEEPBIND should be available as of glibc 2.3.4.
What version of glibc do you have in your development environment where you are trying to compile this?

@shaktsin

v1.2.2

@klueska
Contributor

klueska commented Jan 12, 2023

That doesn't sound like a glibc version to me, but rather a musl libc version (on which NVML is not supported).
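One quick way to check which C library a build container uses is to classify the banner printed by `ldd --version`. The following is a minimal sketch and not part of go-nvml; the helper name `classifyLibc` is hypothetical, and the substring matching is a heuristic based on typical banners rather than an exhaustive check.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// classifyLibc guesses the C library from the text printed by `ldd --version`.
// Hypothetical helper: it only looks for well-known substrings in the banner.
func classifyLibc(lddOutput string) string {
	out := strings.ToLower(lddOutput)
	switch {
	case strings.Contains(out, "musl"):
		return "musl"
	case strings.Contains(out, "glibc"), strings.Contains(out, "gnu libc"):
		return "glibc"
	default:
		return "unknown"
	}
}

func main() {
	// Some ldd implementations print the banner to stderr and exit non-zero,
	// so the error is ignored and both output streams are captured.
	out, _ := exec.Command("ldd", "--version").CombinedOutput()
	fmt.Println("detected libc:", classifyLibc(string(out)))
}
```

If this reports musl (for example in an Alpine-based image), switching the build container to a glibc-based image such as Ubuntu is the likely way forward, given that NVML is not supported on musl.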

@roma-glushko

I'm getting the same set of errors during the build phase of an app that uses the nvml bindings.
The build happens in a Docker container (because I'm on macOS) created from this image:

# syntax=docker/dockerfile:1
FROM nvidia/cuda:12.2.0-devel-ubuntu22.04

RUN apt-get update -y -q && apt-get upgrade -y -q
RUN DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y -q curl build-essential ca-certificates git

RUN curl -s https://storage.googleapis.com/golang/go1.20.4.linux-amd64.tar.gz | tar -v -C /usr/local -xz
ENV PATH $PATH:/usr/local/go/bin

RUN curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.53.3

WORKDIR /service

COPY go.mod go.sum main.go /service/
RUN go mod download

The exact build command looks like this:

GOOS?=darwin
COMMIT ?= $(shell git describe --dirty --long --always)
VERSION := $(shell cat ./VERSION)
LDFLAGS_COMMON := -X main.commitSha=$(COMMIT) -X main.version=$(VERSION) -s -w

build: ## Build a binary
	@CGO_ENABLED=0 GOARCH=amd64 go build -ldflags "$(LDFLAGS_COMMON)" -o ./dist/resbeat

linux-%: image-build
	@docker run --rm -v "$(PWD)":/service -w /service -e GOOS=linux romahlushko/resbeat-build:latest make $*


# make linux-build

I'm ending up getting this error:

[+] Building 3.9s (15/15) FINISHED                                                                                                                                                                                      
 => [internal] load build definition from build.Dockerfile                                                                                                                                                         0.1s
 => => transferring dockerfile: 638B                                                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                                                    0.0s
 => resolve image config for docker.io/docker/dockerfile:1                                                                                                                                                         2.9s
 => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14                                                                                    0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:12.2.0-devel-ubuntu22.04                                                                                                                                    0.7s
 => [1/8] FROM docker.io/nvidia/cuda:12.2.0-devel-ubuntu22.04@sha256:0e2d7e252847c334b056937e533683556926f5343a472b6b92f858a7af8ab880                                                                              0.0s
 => [internal] load build context                                                                                                                                                                                  0.0s
 => => transferring context: 81B                                                                                                                                                                                   0.0s
 => CACHED [2/8] RUN apt-get update -y -q && apt-get upgrade -y -q                                                                                                                                                 0.0s
 => CACHED [3/8] RUN DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y -q curl build-essential ca-certificates git                                                                         0.0s
 => CACHED [4/8] RUN curl -s https://storage.googleapis.com/golang/go1.20.4.linux-amd64.tar.gz | tar -v -C /usr/local -xz                                                                                          0.0s
 => CACHED [5/8] RUN curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.53.3                                                              0.0s
 => CACHED [6/8] WORKDIR /service                                                                                                                                                                                  0.0s
 => CACHED [7/8] COPY go.mod go.sum main.go /service/                                                                                                                                                              0.0s
 => CACHED [8/8] RUN go mod download                                                                                                                                                                               0.0s
 => exporting to image                                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                                            0.0s
 => => writing image sha256:aa35910e75093c11c5c1bf04c44f1b0418b84905a1ee2f2981731b81e26a46d3                                                                                                                       0.0s
 => => naming to docker.io/romahlushko/resbeat-build                                                                                                                                                               0.0s

==========
== CUDA ==
==========

CUDA Version 12.2.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

# github.com/NVIDIA/go-nvml/pkg/nvml
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:9:10: undefined: _Ctype_struct_nvmlDevice_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:320:10: undefined: _Ctype_struct_nvmlUnit_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:358:10: undefined: _Ctype_struct_nvmlEventSet_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:505:10: undefined: _Ctype_struct_nvmlGpuInstance_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:548:10: undefined: _Ctype_struct_nvmlComputeInstance_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/types_gen.go:552:10: undefined: _Ctype_struct_nvmlGpmSample_st
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/device.go:22:19: undefined: MemoryErrorType
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/device.go:25:29: undefined: Return
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/device.go:32:49: undefined: Return
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/device.go:39:54: undefined: Return
/root/go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/nvml/device.go:39:54: too many errors
make: *** [Makefile:12: build] Error 1

The error occurs when I'm compiling with CGO_ENABLED=0; otherwise, other errors occur:

./resbeat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./resbeat)
./resbeat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by ./resbeat)

At the end of the day, I want the NVML integration to be completely optional in the app, so that the app can run in environments without GPU/NVIDIA libraries while supporting more capabilities when those pieces are present. What is the best way to achieve that, besides having ifs that guard the calls to the nvml bindings?

@elezar
Member

elezar commented Jul 17, 2023

@roma-glushko the following is an example of a Golang app that we build which consumes go-nvml: https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/container/Dockerfile.ubuntu

We build this on macOS regularly. Note that we also provide the following build flags:

https://github.com/NVIDIA/k8s-device-plugin/blob/8b4160169defedbc95beb2f56f1cb660b510d28a/Makefile#L58-L59

These flags ensure that the executable does not complain about missing symbols.

@roma-glushko

@elezar thank you, Evan! This is probably what I needed. Let me try it myself and get back to you.

P.S. You may consider referencing this somewhere in the readme as a vetted example of using the go-nvml library. That should be helpful 🙌

@roma-glushko

@elezar Hey Evan, I tried adding those additional env vars, but it doesn't seem to help me build the app on Mac:

// this is the new command I have ended up trying:
CGO_LDFLAGS_ALLOW="-Wl,--unresolved-symbols=ignore-in-object-files"  GOOS=darwin GOARCH=amd64 \
                go build -ldflags "-s -w -X main.commitSha=1.0.2-8-ga481f406f6f1016-dirty -X main.version=1.0.3" -o ./dist/resbeat

# github.com/NVIDIA/go-nvml/pkg/dl
../../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-1/pkg/dl/dl.go:34:18: could not determine kind of name for C.RTLD_DEEPBIND
make: *** [build] Error 1

I feel like the additional flag CGO_LDFLAGS_ALLOW="-Wl,--unresolved-symbols=ignore-in-object-files" did not really change the situation for some reason.

Then I ran another test and pulled the repo you referenced. This is what I see when running the make commands (this is with GOOS=darwin):

[Screenshot 2023-07-23 at 14 23 59]

With GOOS=linux (the default in the Makefile), I'm getting this error:

[Screenshot 2023-07-23 at 14 31 45]

So I'm really wondering how you build and run apps that import the go-nvml bindings on Mac.

@elezar
Member

elezar commented Jul 24, 2023

We have not tested go-nvml on Mac and usually build applications in a Docker container. You should be able to build applications that consume go-nvml on a Mac by wrapping the imported code in Linux-only files. This assumes that the go-nvml functionality is not required on Mac. Note that the Device Plugin that you are trying to build does not do this and also imports other Linux-only packages.

@roma-glushko

@asm582

asm582 commented Apr 24, 2024

Hello, I get a similar error when building on a Linux system:

/bin/controller-gen-v0.14.0 object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
# github.com/NVIDIA/go-nvml/pkg/nvml
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:9:10: undefined: _Ctype_struct_nvmlDevice_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:320:10: undefined: _Ctype_struct_nvmlUnit_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:358:10: undefined: _Ctype_struct_nvmlEventSet_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:505:10: undefined: _Ctype_struct_nvmlGpuInstance_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:548:10: undefined: _Ctype_struct_nvmlComputeInstance_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/types_gen.go:552:10: undefined: _Ctype_struct_nvmlGpmSample_st
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/device.go:22:19: undefined: MemoryErrorType
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/device.go:25:29: undefined: Return
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/device.go:32:49: undefined: Return
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/device.go:39:54: undefined: Return
../../go/pkg/mod/github.com/!n!v!i!d!i!a/go-nvml@v0.12.0-3/pkg/nvml/device.go:39:54: too many errors

I set the below env variables:

GOOS?=linux
GOARCH?=arm64
CGO_ENABLED?=0
CLI_VERSION_PACKAGE := main
COMMIT ?= $(shell git describe --dirty --long --always --abbrev=15)
CGO_LDFLAGS_ALLOW := "-Wl,--unresolved-symbols=ignore-in-object-files"
LDFLAGS_COMMON := "-s -w -X $(CLI_VERSION_PACKAGE).commitSha=$(COMMIT) -X $(CLI_VERSION_PACKAGE).version=$(VERSION)"

I run make build with the below args:

.PHONY: build
build: manifests generate fmt vet ## Build manager binary.
	@CGO_LDFLAGS_ALLOW=$(CGO_LDFLAGS_ALLOW) CGO_ENABLED=$(CGO_ENABLED) GOOS=$(GOOS) GOARCH=$(GOARCH) \
		go build -ldflags $(LDFLAGS_COMMON) -o bin/manager cmd/main.go

Any pointers?

@elezar
Member

elezar commented Apr 24, 2024

@asm582 does setting CGO_ENABLED?=0 not disable cgo? Since this package provides bindings for the C-based libnvidia-ml.so library, cgo is required.

@asm582

asm582 commented Apr 24, 2024

OK, I set CGO_ENABLED?=1.

Now I get the error below:

make build
/home/openstack/asmalvan/instaslice2/bin/controller-gen-v0.14.0 rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases
/home/openstack/asmalvan/instaslice2/bin/controller-gen-v0.14.0 object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
# runtime/cgo
gcc_arm64.S: Assembler messages:
gcc_arm64.S:30: Error: no such instruction: `stp x29,x30,[sp,'
gcc_arm64.S:34: Error: too many memory references for `mov'
gcc_arm64.S:36: Error: no such instruction: `stp x19,x20,[sp,'
gcc_arm64.S:39: Error: no such instruction: `stp x21,x22,[sp,'
gcc_arm64.S:42: Error: no such instruction: `stp x23,x24,[sp,'
gcc_arm64.S:45: Error: no such instruction: `stp x25,x26,[sp,'
gcc_arm64.S:48: Error: no such instruction: `stp x27,x28,[sp,'
gcc_arm64.S:52: Error: too many memory references for `mov'
gcc_arm64.S:53: Error: too many memory references for `mov'
gcc_arm64.S:54: Error: too many memory references for `mov'
gcc_arm64.S:56: Error: no such instruction: `blr x20'
gcc_arm64.S:57: Error: no such instruction: `blr x19'
gcc_arm64.S:59: Error: no such instruction: `ldp x27,x28,[sp,'
gcc_arm64.S:62: Error: no such instruction: `ldp x25,x26,[sp,'
gcc_arm64.S:65: Error: no such instruction: `ldp x23,x24,[sp,'
gcc_arm64.S:68: Error: no such instruction: `ldp x21,x22,[sp,'
gcc_arm64.S:71: Error: no such instruction: `ldp x19,x20,[sp,'
gcc_arm64.S:74: Error: no such instruction: `ldp x29,x30,[sp],'
make: *** [Makefile:91: build] Error 1

@klueska
Contributor

klueska commented Apr 24, 2024

Can you provide a minimal reproducer that we can run ourselves? Without the ability to reproduce this (or at least see the exact, full code being compiled), we are not going to be able to help much.

@asm582

asm582 commented Apr 25, 2024

Thanks for all your help. The project now builds and I can deploy the container. On the Ubuntu machine, I did away with all the compile flags that were added earlier; the build step in the Makefile is shown below, and it is the same one that the kubebuilder scaffolding provides:

.PHONY: build
build: manifests generate fmt vet ## Build manager binary.
	go build -o bin/manager cmd/main.go

To build the container image, I used the Dockerfile from the DRA repo with modifications from the kubebuilder scaffolding:

ARG GOLANG_VERSION=1.22.2

FROM nvidia/cuda:12.4.1-base-ubuntu22.04 as build

RUN apt-get update && \
    apt-get install -y wget make git gcc \
    && \
    rm -rf /var/lib/apt/lists/*

#TODO: Remove arch discovery
RUN set -eux; \
    \
    arch="$(uname -m)"; \
    case "${arch##*-}" in \
        x86_64 | amd64) ARCH='amd64' ;; \
        ppc64el | ppc64le) ARCH='ppc64le' ;; \
        aarch64) ARCH='arm64' ;; \
        *) echo "unsupported architecture" ; exit 1 ;; \
    esac; \
       wget -nv -O - https://storage.googleapis.com/golang/go1.22.2.linux-amd64.tar.gz \
    | tar -C /usr/local -xz

ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH

WORKDIR /workspace

# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum

RUN go mod download

# Copy the go source
COPY cmd/main.go cmd/main.go
COPY api/ api/
COPY internal/controller/ internal/controller/

RUN go build -o bin/manager cmd/main.go

FROM nvidia/cuda:12.4.1-base-ubuntu22.04

# Remove CUDA libs(compat etc) in favor of libs installed by the NVIDIA driver
RUN rm -f cuda-*.deb
RUN apt-get --purge -y autoremove cuda-*

ENV NVIDIA_DISABLE_REQUIRE="true"
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

WORKDIR /

COPY --from=build /workspace/bin/manager .

# Install / upgrade packages here that are required to resolve CVEs
ARG CVE_UPDATES
RUN if [ -n "${CVE_UPDATES}" ]; then \
        rm -f /etc/apt/sources.list.d/cuda.list && \
        apt-get update && apt-get upgrade -y ${CVE_UPDATES} && \
        rm -rf /var/lib/apt/lists/*; \
    fi

ENTRYPOINT ["/manager"]

I hope someone finds this helpful!
