-
As mentioned in another thread, there are unfortunately many ways UMDs can enter any given ioctl, so you really need the user-side stack to make use of it. The reason you're not getting the user stack is that the BPF tools require the user-mode app (and libc) to be compiled with frame pointers. One other option you could try is to trigger a breakpoint in the ioctl call itself by interposing ioctl:

    #define _GNU_SOURCE 1
    #include <dlfcn.h>
    #include <stdio.h>

    /* Interposed ioctl: trap into the debugger on the request of interest. */
    int ioctl(int fd, unsigned long request, void *ptr)
    {
        static int (*original)(int, unsigned long, void *);
        if (!original)
            original = dlsym(RTLD_NEXT, __FUNCTION__); /* look up the real ioctl */
        if ((request & 0xff) == 0x4a) // NV_ESC_RM_VID_HEAP_CONTROL
            asm ("int $3"); /* x86 breakpoint trap */
        printf("ioctl: 0x%lx\n", request);
        return original(fd, request, ptr);
    }

Compile this as a shared library and run the application under gdb with the library preloaded (for example via LD_PRELOAD). You should hit a breakpoint in gdb, and from there you can dump the stack.
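If attaching gdb is inconvenient, a variation on the shim above is to print the user-mode stack from inside the interposed ioctl. The following is only a sketch under stated assumptions, not something taken from the thread: `dump_user_stack` is a hypothetical helper built on glibc's `backtrace()` and `backtrace_symbols_fd()`, which on x86-64 typically unwind through the `.eh_frame` tables rather than frame pointers. The call is forwarded with `syscall(SYS_ioctl, ...)` instead of `dlsym` purely to keep the example free of extra link dependencies; keeping the `dlsym(RTLD_NEXT, ...)` lookup should work just as well. The `0x4a` filter is copied from the shim above.

```c
#define _GNU_SOURCE 1
#include <execinfo.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical helper: write the current user-mode call stack to stderr
 * and return the number of frames that were captured. */
static int dump_user_stack(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    /* Writes one line per frame straight to the fd. */
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}

/* Interposed ioctl: log the caller's stack for the request of interest,
 * then forward the call to the kernel unchanged. */
int ioctl(int fd, unsigned long request, void *ptr)
{
    if ((request & 0xff) == 0x4a) { /* NV_ESC_RM_VID_HEAP_CONTROL */
        fprintf(stderr, "ioctl 0x%lx called from:\n", request);
        dump_user_stack();
    }
    /* Assumption: forwarding via the raw syscall instead of dlsym. */
    return (int)syscall(SYS_ioctl, fd, request, ptr);
}
```

Frames from stripped binaries, or from executables that do not export their symbols, may show up only as addresses. These can be resolved afterwards with a symbolizer such as addr2line.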
-
I used the bcc tool stackcount to look at calls to the physical page allocation function __alloc_pages_nodemask from the algorithm module, and found that nvidia_ioctl allocates a large amount of memory dynamically. I want to reduce this dynamic memory allocation, but I have tried various methods and cannot get the complete call stack. The flame graph from the bcc tool shows only the kernel-mode call stack; I cannot find the user-mode call stack. I tried optimizing the calls to cudaHostAlloc, cudaMallocHost and similar functions, but that made no difference. Which user-level function may be causing the dynamic memory allocation, and how can I optimize it?