[nvptx-run] Add --verbose/-v #27

Open: vries wants to merge 2 commits into base: master

@vries vries commented Oct 13, 2020

No description provided.

vries added 2 commits October 13, 2020 10:42
Consider test.c:
...
#include <stdio.h>

int main (int argc, char **argv) {
  printf ("argc: %d\n", argc);
  return 0;
}
...
such that we have:
...
$ nvptx-none-run a.out
argc: 1
$ nvptx-none-run a.out bla
argc: 2
...

Given that the usage indicates that the program separates the nvptx options
from the program arguments:
...
$ nvptx-none-run --help
Usage: nvptx-none-run [option...] program [argument...]
...
I'd expect:
...
$ nvptx-none-run a.out bla -V
argc: 3
...
but instead we get:
...
$ ./run.sh a.out bla -V
nvtpx-none-run (nvptx-tools) 1.0
<COPYRIGHT>
$
...

Fix this by calling getopt_long with optstring starting with '+'.
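To illustrate the effect of the leading '+', here is a minimal sketch, not the code from this patch. With GNU getopt_long, an optstring that starts with '+' stops option parsing at the first non-option argument, so everything after the program name is left untouched for the program. The option table and the name parse_leading_options are hypothetical and chosen only for this example.

```c
#define _GNU_SOURCE
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table for illustration; not the real nvptx-run table.  */
static const struct option long_options[] = {
  { "help", no_argument, NULL, 'h' },
  { "version", no_argument, NULL, 'V' },
  { NULL, 0, NULL, 0 }
};

/* Parse leading options only.  Returns the index in ARGV of the first
   non-option argument (the program to run).  *SAW_VERSION is set to 1 if
   -V/--version appeared before that argument, and to 0 otherwise.  */
static int
parse_leading_options (int argc, char **argv, int *saw_version)
{
  int o;

  *saw_version = 0;
  /* With glibc, setting optind to 0 makes getopt reinitialize, so this
     function can be called more than once.  */
  optind = 0;
  /* The leading '+' makes getopt_long stop at the first non-option
     argument instead of permuting ARGV and scanning all of it.  */
  while ((o = getopt_long (argc, argv, "+hV", long_options, NULL)) != -1)
    if (o == 'V')
      *saw_version = 1;
  return optind;
}
```

For the argument vector of "nvptx-none-run a.out bla -V", this returns 1 and leaves *saw_version at 0, so the three remaining arguments (a.out, bla, -V) can be passed on to the program, which matches the expected "argc: 3".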
Add a --verbose flag to nvptx-run, such that we have:
...
$ gcc ~/hello.c
$ nvptx-none-run -v ./a.out
Total device memory: 4242604032 (3.95 GiB)
Initial free device memory: 4222156800 (3.93 GiB)
Program args reservation (effective): 1048576 (1.00 MiB)
Set stack size limit: 131072 (128.00 KiB)
Stack size limit reservation (estimated): 1342177280 (1.25 GiB)
Stack size limit reservation (effective): 1423966208 (1.32 GiB)
Free device memory: 2797142016 (2.60 GiB)
Set heap size limit: 268435456 (256.00 MiB)
hello
...
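The output format above (raw byte count followed by a value in a binary unit) could be produced by a helper along these lines. This is a sketch under assumptions: format_val is a hypothetical name, and the patch's own report_val may be implemented differently. The truncation to two fractional digits is inferred from the sample output (2797142016 bytes is about 2.605 GiB but is shown as 2.60 GiB), not taken from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write "LABEL: VAL (X.YY UNIT)" into BUF, where UNIT
   is the largest binary unit that does not exceed VAL.  Returns the
   snprintf result.  */
static int
format_val (char *buf, size_t buflen, const char *label,
            unsigned long long val)
{
  static const char *const units[] = { "B", "KiB", "MiB", "GiB", "TiB" };
  unsigned shift = 0;
  size_t u = 0;

  /* Step up one unit (a factor of 1024) at a time.  */
  while ((val >> shift) >= 1024 && u + 1 < sizeof units / sizeof units[0])
    {
      shift += 10;
      u++;
    }

  unsigned long long unit = 1ULL << shift;
  unsigned long long whole = val / unit;
  /* Two fractional digits, truncated rather than rounded.  */
  unsigned long long frac = (val % unit) * 100 / unit;

  return snprintf (buf, buflen, "%s: %llu (%llu.%02llu %s)",
                   label, val, whole, frac, units[u]);
}
```

Calling format_val with the label "Set stack size limit" and the value 131072 yields "Set stack size limit: 131072 (128.00 KiB)", as in the sample output.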

vries commented Oct 13, 2020

Note: this also contains the commit "[nvptx-run] Fix greedy option parsing", to avoid a merge conflict.

@vries vries changed the title from "Verbose 2" to "[nvptx-run] Add --verbose/-v" Oct 13, 2020

@tschwinge tschwinge left a comment

@vries, thanks. I have a few questions, please have a look.

Comment on lines +289 to +291

size_t free_mem;
size_t dummy;

Should the declaration of dummy move inside the if (verbose) block?

r = cuCtxSetLimit(CU_LIMIT_STACK_SIZE, 0);
fatal_unless_success (r, "could not set stack limit");

r = cuMemGetInfo (&free_mem, &dummy);

Actually, doesn't dummy here (when given a better name) make the earlier cuDeviceTotalMem call obsolete?

Or is the distinction between the total amount of memory available for allocation by the CUDA context and the total amount of memory available on the device intentional?

Comment on lines +294 to +295
/* Set stack size limit to 0 to get more accurate free_mem. */
r = cuCtxSetLimit(CU_LIMIT_STACK_SIZE, 0);

From the cuCtxSetLimit documentation (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g0651954dfb9788173e60a9af7201e65a), I can't easily tell the rationale here.

So, should we add more commentary for this, or point to an external URL if that makes sense?

Comment on lines +333 to +337
size_t free_mem_update;
r = cuMemGetInfo (&free_mem_update, &dummy);
fatal_unless_success (r, "could not get free memory");
report_val (stderr, "Program args reservation (effective)",
free_mem - free_mem_update);

Doesn't this difference computation implicitly assume that nothing else is using the GPU concurrently (which would be a wrong assumption)? Or does every process/CUDA context always have all of the GPU memory available? I don't remember the details, and have not yet looked that up.
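
One narrow aspect of this can be stated independently of how CUDA accounts for memory: free_mem and free_mem_update are both size_t, so if the second reading happens to be larger than the first (for example because something else released memory in between), the subtraction wraps around to a huge value. A minimal sketch of a guarded difference follows; free_mem_drop is a hypothetical helper, not part of the patch, and it does not resolve the question of whether the drop is attributable to this process.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes by which free memory dropped
   between two readings, or 0 if it did not drop.  Avoids the unsigned
   wrap-around of a plain BEFORE - AFTER when AFTER > BEFORE.  */
static size_t
free_mem_drop (size_t before, size_t after)
{
  return before > after ? before - after : 0;
}
```

In the quoted code this would be used as free_mem_drop (free_mem, free_mem_update) in place of the plain subtraction.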

Comment on lines +377 to +381
size_t free_mem_update;
r = cuMemGetInfo (&free_mem_update, &dummy);
fatal_unless_success (r, "could not get free memory");
report_val (stderr, "Stack size limit reservation (effective)",
free_mem - free_mem_update);

Same concern as above.
