nvidia_ioctl frequent dynamic memory allocation #627

NeilZhy · 2024-04-23T02:12:01Z

I used the bcc tool stackcount to trace the algorithm module's calls to the physical page allocation function __alloc_pages_nodemask, and found that nvidia_ioctl allocates a large amount of memory dynamically. I want to reduce this dynamic memory allocation, but none of the methods I have tried gives me the complete call stack. The flame graph I get from the bcc tool shows only the kernel-mode call stack; I cannot find the user-mode call stack. I tried optimizing cudaHostAlloc, cudaMallocHost, and other functions, but that did not help. Which user-level function might be causing the dynamic memory allocation? How can I optimize it?

mtijanic (Maintainer) · 2024-04-24T21:22:32Z

As mentioned in another thread, there are unfortunately many ways the user-mode drivers (UMDs) can enter any given ioctl, so you really need the user-side stack to make use of this.

You're not getting the user stack because the BPF tools require the user-mode app (and libc) to be compiled with frame pointers. Another option you could try is to trigger a breakpoint in the ioctl() function when a given filter is hit. As a simple example, something like this should work:

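#define _GNU_SOURCE 1   // needed for RTLD_NEXT
#include <dlfcn.h>
#include <stdio.h>

// Interposes libc's ioctl() when this library is loaded via LD_PRELOAD.
int ioctl(int fd, unsigned long request, void* ptr)
{
    // Look up the real ioctl() once and cache the pointer.
    static int(*original)(int, unsigned long, void* ptr);
    if (!original)
        original = dlsym(RTLD_NEXT, __FUNCTION__);

    // The low 8 bits of the request are the ioctl command number.
    if ((request & 0xff) == 0x4a) // NV_ESC_RM_VID_HEAP_CONTROL
        asm ("int $3"); // x86 breakpoint instruction; gdb sees it as SIGTRAP

    printf("ioctl: 0x%lx\n", request);
    return original(fd, request, ptr);
}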

Compile as:

gcc -fPIC -shared -o ioctl-profiler.so ioctl-profiler.c -ldl

and then run as:

gdb -ex run --args env LD_PRELOAD=./ioctl-profiler.so glxgears

You should hit a breakpoint in gdb, and then you can dump the stack like this:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7fc3184 in ioctl () from ./ioctl-profiler.so
(gdb) bt
#0  0x00007ffff7fc3184 in ioctl () from ./ioctl-profiler.so
#1  0x00007ffff642aa49 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#2  0x00007ffff642c430 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#3  0x00007ffff642f28d in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#4  0x00007ffff5f0ce69 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#5  0x00007ffff5f0d21c in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#6  0x00007ffff5f0d33c in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.530.41.03
#7  0x00007ffff787d0b6 in ?? () from /lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#8  0x00007ffff787d1d8 in ?? () from /lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#9  0x00007ffff78a7a5e in ?? () from /lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#10 0x00007ffff789a154 in glXCreateContext () from /lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#11 0x00007ffff79afc97 in glXCreateContext () from /lib/x86_64-linux-gnu/libGLX.so.0
#12 0x000055555555816f in ?? ()
#13 0x000055555555657f in ?? ()
#14 0x00007ffff7abd083 in __libc_start_main (main=0x555555556410, argc=1, argv=0x7fffffffd748, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd738) at ../csu/libc-start.c:308
#15 0x0000555555556f0a in ?? ()
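
If the filtered ioctl fires very often, stopping in gdb on every hit can get tedious. A possible variation on the same LD_PRELOAD idea, sketched here as an assumption rather than something from the reply above, is to log the user-side stack with glibc's backtrace() and let the program keep running. On x86_64, glibc's backtrace() generally unwinds through the unwind tables rather than frame pointers, so it may work where the BPF tools do not. The names matches_filter, FILTER_NR and MAX_FRAMES are made up for this sketch, and frames in stripped libraries will show up only as library plus offset.

```c
#define _GNU_SOURCE 1   /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define FILTER_NR  0x4a  /* same command number the gdb example filters on */
#define MAX_FRAMES 64    /* hypothetical cap on the logged stack depth */

/* Returns nonzero when the low 8 bits of the request (the ioctl command
 * number) match the filter. Helper name is hypothetical. */
static int matches_filter(unsigned long request)
{
    return (request & 0xff) == FILTER_NR;
}

/* Interposes libc's ioctl() when loaded via LD_PRELOAD. */
int ioctl(int fd, unsigned long request, void *ptr)
{
    /* Look up the real ioctl() once and cache the pointer. */
    static int (*original)(int, unsigned long, void *);
    if (!original)
        original = (int (*)(int, unsigned long, void *))dlsym(RTLD_NEXT, "ioctl");

    if (matches_filter(request)) {
        void *frames[MAX_FRAMES];
        int n = backtrace(frames, MAX_FRAMES);

        /* Write straight to stderr so the log does not depend on stdio
         * buffering in the host application. */
        dprintf(STDERR_FILENO, "ioctl 0x%lx called from:\n", request);
        backtrace_symbols_fd(frames, n, STDERR_FILENO);
    }

    return original(fd, request, ptr);
}
```

This builds the same way as the library above (with a file name of your choosing) and runs with LD_PRELOAD alone, no gdb needed. The logged addresses can then be counted or resolved offline to see which entry points reach the ioctl most often.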
