CUDA performance of training code, Chapter 8, much lower than expected #32

Transigent opened this issue Feb 17, 2019 · 2 comments

Transigent commented Feb 17, 2019

Hello

Firstly, thanks for the great book; I have learned a great deal from it. This is my first foray into the world of machine learning and I am having a ball.

I am attempting to get the stock trading sample from Chapter 8 running. Initially it seems to run, but I am finding that performance with CUDA enabled is much lower than I expected.

My system specs are as follows:
Phenom II X6 1055T
GTX 980 4GB, water-cooled
24GB RAM
Windows 10 64-bit

Software:
Anaconda 3
Python 3.6.7
PyCharm 2018.3

An initial pass through train_model.py without CUDA enabled yields the following performance:

[screenshot: pytorch-4-1-cpu]

Note the GPU is idle as expected, at about 1% utilization.
[screenshot: pytorch-4-1-cpu-temp]

Task Manager shows the CPU at 100% as expected, and the GPU at 0%.
[screenshot: pytorch-4-1-cpu-taskman]

After enabling CUDA with the --cuda switch, performance is only marginally better. Note that the GPU is at about 2% load according to Task Manager, or 9% according to the GPU monitor, and its temperature has risen a whole 1-2 °C on average (if I run FurMark the GPU sits at 99% and the temperature quickly rises from 33 to maybe 48 °C, on water cooling). It's fast up until the buffer is populated and training starts; after that it takes about 15 seconds to print one line, i.e. 100 epochs.

[screenshot: pytorch-4-1-gpu]

The GPU is barely doing anything.
[screenshot: pytorch-4-1-gpu-temp]

Task Manager says the GPU is 2% utilized. The CPU has a few peaks but averages about 30%.

[screenshot: pytorch-4-1-gpu-taskman]

It appears that the mode is changing from CPU to GPU, since CPU usage goes down and GPU usage goes up. But the GPU only improved throughput by maybe 60%, which seems almost negligible for a card like the GTX 980, which has 2048 shader units, 128 texture units, and 64 ROPs. And clearly, at 2% utilization, the GPU is not doing a great deal of acceleration.
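For what it's worth, this pattern seems typical when the network is small: each training step launches tiny CUDA kernels, so per-step overhead (kernel launches, CPU-side data preparation) dominates and the GPU mostly waits. Below is a minimal sketch, not the book's code and with illustrative layer sizes, that times forward/backward passes of a small fully connected net on CPU vs. GPU:

```python
import time

import torch
import torch.nn as nn


def bench(device, steps=500, batch=32):
    # Small MLP roughly on the scale of the Chapter 8 model (sizes are guesses).
    net = nn.Sequential(
        nn.Linear(42, 512), nn.ReLU(),
        nn.Linear(512, 512), nn.ReLU(),
        nn.Linear(512, 3),
    ).to(device)
    opt = torch.optim.Adam(net.parameters())
    loss_fn = nn.CrossEntropyLoss()
    x = torch.randn(batch, 42, device=device)
    y = torch.randint(0, 3, (batch,), device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # finish setup before starting the clock
    start = time.time()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued kernels before stopping the clock
    return steps / (time.time() - start)


print("CPU steps/s: %.1f" % bench(torch.device("cpu")))
if torch.cuda.is_available():
    print("GPU steps/s: %.1f" % bench(torch.device("cuda")))
```

On a net this small the GPU speedup is usually modest; increasing the batch size typically widens the gap, since each kernel then has more work per launch.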

I assume I must have done something wrong. I have been trying different variations of packages and settings for a few days, including:

  • Trying different Python versions, 3.6 and 3.7 (some minor versions would not work with PyTorch at all, e.g. 3.6.8)
  • Installing the CUDA 9 Windows installer from NVIDIA as a system-wide installation
  • Trying different video drivers: the latest two, and also the one installed by the CUDA installer, which was dated 2017
  • Trying a bunch of different package versions of various things (cudatoolkit, PyTorch, etc.)
  • Running some tests to ensure that CUDA was working:

```python
import torch
torch.cuda.current_device()    # Out[3]: 0
torch.cuda.device(0)           # Out[4]: <torch.cuda.device at 0x153b39c5780>
torch.cuda.device_count()      # Out[5]: 1
torch.cuda.get_device_name(0)  # Out[6]: 'GeForce GTX 980'
```
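Those calls only confirm that CUDA is visible to PyTorch; they don't show whether the training net actually gets moved to the GPU. Here is a small extra check I could add (the report_device helper is my own, hypothetical, not from the repo):

```python
import torch
import torch.nn as nn


def report_device(net):
    # If this prints "cpu" during a --cuda run, the model was never
    # moved to the GPU even though CUDA is available.
    print("model device:", next(net.parameters()).device)


# Quick self-test with a throwaway module:
net = nn.Linear(4, 2)
if torch.cuda.is_available():
    net = net.cuda()
report_device(net)  # expected here: model device: cuda:0
```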

My current package configuration is as follows:

```
Name                 Version    Build                          Channel
anaconda-client      1.7.2      py36_0
anaconda-navigator   1.9.6      py36_0
anaconda-project     0.8.2      py36_0
cuda90               1.0        0                              pytorch
cudatoolkit          9.0        1
cudnn                7.3.1      cuda9.0_0
gym                  0.11.0     pypi_0                         pypi
matplotlib           3.0.2      py36hc8f65d3_0
numpy                1.15.4     py36h19fb1c0_0
opencv-python        4.0.0.21   pypi_0                         pypi
pip                  19.0.1     py36_0
ptan                 0.3        pypi_0                         pypi
python               3.6.7      h9f7ef89_2
pytorch              0.4.1      py36_cuda90_cudnn7he774522_1   pytorch
scipy                1.2.0      py36h29ff71c_0
tensorboard          1.12.2     py36h33f27b4_0
tensorboardx         1.6        pypi_0                         pypi
tensorflow           1.12.0     gpu_py36ha5f9131_0
tensorflow-base      1.12.0     gpu_py36h6e53903_0
tensorflow-gpu       1.12.0     pypi_0                         pypi
torchvision          0.2.1      py_2                           pytorch
```

Hmm, that borked my formatting; here's an image as well:

[screenshot: package-versions]

Any thoughts about what I might have done wrong here would be much appreciated. I'm still just getting a handle on Python and Deep RL.

Thanks for your time!

Chris

@icompute386

Hi,
I'm seeing the same issue, running on an RTX 2080 Ti. I notice that video memory usage goes up with CUDA enabled, though only marginally, and that for the most part the GPU is idle.

CPU usage goes up too, though not to the 100% that occurs when CUDA is not enabled.

[screenshot]
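One way to see where the time actually goes is torch.autograd.profiler. A rough sketch with a toy stand-in for the training step (layer sizes are illustrative only, not the book's model):

```python
import torch
from torch.autograd import profiler

# Toy stand-in for the training step; sizes are illustrative only.
net = torch.nn.Linear(128, 128).cuda()
x = torch.randn(32, 128, device="cuda")

with profiler.profile(use_cuda=True) as prof:
    for _ in range(100):
        net.zero_grad()
        net(x).sum().backward()

# Per-op table of CUDA kernel time vs. CPU-side time; for small nets the
# CPU side dominates, which would match the mostly-idle GPU seen here.
print(prof.key_averages())
```

If the CPU-side time dominates, the bottleneck is the Python/experience-gathering loop rather than the GPU itself.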


lamhk commented Apr 18, 2019

Hi, I just want to share my observations after running through the training for Chapter08. My platform: 1) Ubuntu 16.04; 2) i5-8400 with a GTX 1060 6GB and 16GB RAM. Below is the output of train_model_conv.py, running at around 150 fps with GPU utilization around 85-88% (image attached).

```
1100271: done 77900 games, mean reward -0.202, mean steps 39.56, speed 91.32 f/s, eps 0.10
1106617: done 78000 games, mean reward -0.202, mean steps 39.97, speed 149.52 f/s, eps 0.10
1112197: done 78100 games, mean reward -0.202, mean steps 40.32, speed 149.28 f/s, eps 0.10
1118498: done 78200 games, mean reward -0.202, mean steps 40.76, speed 149.43 f/s, eps 0.10
1123793: done 78300 games, mean reward -0.202, mean steps 41.08, speed 149.51 f/s, eps 0.10
1129957: done 78400 games, mean reward -0.203, mean steps 41.47, speed 149.41 f/s, eps 0.10
1136721: done 78500 games, mean reward -0.203, mean steps 41.94, speed 149.29 f/s, eps 0.10
1142939: done 78600 games, mean reward -0.203, mean steps 42.34, speed 149.45 f/s, eps 0.10
1148525: done 78700 games, mean reward -0.203, mean steps 42.71, speed 149.35 f/s, eps 0.10
1154480: done 78800 games, mean reward -0.203, mean steps 43.07, speed 149.40 f/s, eps 0.10
```

[screenshot: GPU utilization]
