cuDNN library not fully used on Windows #80

Open
QueensGambit opened this issue Apr 9, 2021 · 1 comment
Labels: cuDNN (issue about the cuDNN library), Windows (issues specific to Windows users)

QueensGambit commented Apr 9, 2021

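On Windows, the library cudnn_cnn_infer64_8.dll is not used, whereas on Linux libcudnn_cnn_infer.so.8 is used.
This seems to make a visible difference in nodes per second (NPS): about 19k NPS on Ubuntu versus about 16.5k NPS on Windows with the same GPU.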

e.g. Ubuntu 18.04:

GPU: RTX 2070 OC

isready
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-1.onnx
info string deserialize engine: model/chess/model-bsize1-fp16-0.trt
info string inputDims: (1, 39, 8, 8)
info string valueOutputDims: (1, 1)
info string policyOutputDims: (1, 4864)
info string No auxiliary outputs detected.
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string inputDims: (16, 39, 8, 8)
info string valueOutputDims: (16, 1)
info string policyOutputDims: (16, 4864)
info string No auxiliary outputs detected.
readyok
go infinite
info string create new tree
info string run mcts search
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 18522 nps 18485 tbhits 0 time 1002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 38347 nps 19154 tbhits 0 time 2002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1
info depth 19 seldepth 37 multipv 1 score cp 47 nodes 57007 nps 18990 tbhits 0 time 3002 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1

GPU utilization: 91%

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   48C    P2   152W / 215W |    677MiB /  7982MiB |     91%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:0B:00.0  On |                  N/A |
|  0%   54C    P2    56W / 250W |    441MiB / 11177MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

e.g. Windows 10:

GPU: RTX 2070 OC

isready
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-1.onnx
info string deserialize engine: model/chess/model-bsize1-fp16-0.trt
info string inputDims: (1, 39, 8, 8)
info string valueOutputDims: (1, 1)
info string policyOutputDims: (1, 4864)
info string No auxiliary outputs detected.
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string onnx file: model/chess/model-1.23453-0.572-0537-bsize-16.onnx
info string deserialize engine: model/chess/model-bsize16-fp16-0.trt
info string inputDims: (16, 39, 8, 8)
info string valueOutputDims: (16, 1)
info string policyOutputDims: (16, 4864)
info string No auxiliary outputs detected.
readyok
go infinite
info string create new tree
info string run mcts search
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 16500 nps 16369 tbhits 0 time 1008 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 33367 nps 16584 tbhits 0 time 2012 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1
info depth 19 seldepth 33 multipv 1 score cp 47 nodes 50400 nps 16617 tbhits 0 time 3033 pv d2d4 g8f6 c2c4 e7e6 g1f3 b7b6 g2g3 c8a6 b2b3 f8b4 c1d2 b4e7 f1g2 c7c6 d2c3 d7d5 b1d2 b8d7 e1g1

GPU utilization: 85%

C:\Windows\System32\DriverStore\FileRepository\nv_dispui.inf_amd64_c1f8f32cc9af9677>nvidia-smi
Fri Apr  9 17:37:37 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 461.33       Driver Version: 461.33       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207... WDDM  | 00000000:01:00.0 Off |                  N/A |
| 29%   59C    P2   141W / 215W |    845MiB /  8192MiB |     85%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108... WDDM  | 00000000:0B:00.0  On |                  N/A |
|  0%   38C    P8    17W / 250W |    692MiB / 11264MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
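
To put a number on the gap between the two runs, the `nps` fields of the UCI `info` lines above can be averaged per platform. The following Python sketch is only an illustration and not part of the engine: the helper names `extract_nps` and `relative_slowdown` are hypothetical, and the log lines are copied from the two runs above with the `pv` move list omitted for brevity.

```python
import re
from statistics import mean
from typing import List

# Matches the "nps <number>" field of a UCI "info" line.
NPS_PATTERN = re.compile(r"\bnps (\d+)\b")

# Lines taken from the Ubuntu 18.04 run (pv omitted).
LINUX_LOG = """\
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 18522 nps 18485 tbhits 0 time 1002
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 38347 nps 19154 tbhits 0 time 2002
info depth 19 seldepth 37 multipv 1 score cp 47 nodes 57007 nps 18990 tbhits 0 time 3002
"""

# Lines taken from the Windows 10 run (pv omitted).
WINDOWS_LOG = """\
info depth 17 seldepth 28 multipv 1 score cp 47 nodes 16500 nps 16369 tbhits 0 time 1008
info depth 19 seldepth 31 multipv 1 score cp 47 nodes 33367 nps 16584 tbhits 0 time 2012
info depth 19 seldepth 33 multipv 1 score cp 47 nodes 50400 nps 16617 tbhits 0 time 3033
"""


def extract_nps(log_text: str) -> List[int]:
    """Collect the nps value of every UCI 'info' line that reports one."""
    values = []
    for line in log_text.splitlines():
        if not line.startswith("info "):
            continue  # skip "info string ..." free text is still fine, others are ignored
        match = NPS_PATTERN.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def relative_slowdown(baseline: List[int], other: List[int]) -> float:
    """Return how much lower the mean of `other` is, as a fraction of the baseline mean."""
    return 1.0 - mean(other) / mean(baseline)


if __name__ == "__main__":
    linux_nps = extract_nps(LINUX_LOG)
    windows_nps = extract_nps(WINDOWS_LOG)
    print(f"Linux mean NPS:   {mean(linux_nps):.0f}")
    print(f"Windows mean NPS: {mean(windows_nps):.0f}")
    print(f"Windows is {relative_slowdown(linux_nps, windows_nps):.1%} slower")
```

With the three samples per platform shown here, the Windows run comes out roughly 12% lower. Three one-second samples are a small basis, so the figure is best read as an order of magnitude rather than a precise measurement.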
QueensGambit (Owner, Author) commented

I was able to link the binary to cudnn_cnn_infer64_8.dll, but unfortunately this did not seem to help.

Adding optimization options such as /O2 (Maximize Speed), /GL (Whole Program Optimization), and /LTCG (Link-Time Code Generation) did not improve NPS either.
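
A quick sanity check that is independent of the engine build is to ask the OS loader whether the cuDNN inference library can be loaded by name at all. The Python sketch below is an assumption-laden illustration, not the project's method: the helper names `cudnn_cnn_infer_name` and `can_load` are hypothetical, and a successful load only shows that the file is resolvable, not that the engine or TensorRT actually calls into it.

```python
import ctypes
import sys


def cudnn_cnn_infer_name(platform: str = sys.platform) -> str:
    """Return the cuDNN CNN inference library file name mentioned in this issue."""
    if platform.startswith("win"):
        return "cudnn_cnn_infer64_8.dll"
    return "libcudnn_cnn_infer.so.8"


def can_load(library_name: str) -> bool:
    """Return True if the OS loader can find and load the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or one of its dependencies cannot be resolved.
        return False
    return True


if __name__ == "__main__":
    name = cudnn_cnn_infer_name()
    status = "loadable" if can_load(name) else "could not be loaded"
    print(f"{name}: {status}")
```

On Windows with Python 3.8 or newer, `ctypes` does not search `PATH` for a bare DLL name by default, so the cuDNN `bin` directory may need to be registered with `os.add_dll_directory` (or a full path passed to `ctypes.CDLL`) before a negative result means anything.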
