Install CUDA ToolKit

The first step in our process is to install the CUDA ToolKit, which is what gives us the ability to run against the GPU CUDA cores. Because TensorFlow is very version specific, you'll have to go to the CUDA ToolKit Archive to download the version that works with TF, instead of the standard downloads page, which only has the current version.

At the time of this article, the correct version of the CUDA ToolKit is 8.0 GA2 (Feb. 2017 Release Date). When in doubt, check the TensorFlow Documentation Page for additional version information.

Once you've got the CUDA ToolKit, begin the installation. Be aware that the ToolKit contains more software than just the CUDA drivers. For example, it also contains the standard NVIDIA, PhysX, and 3D drivers, plus the GeForce Experience software, etc. Personally, I maintain all of that software separately, so I deselect it during the installation process.
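After the installer finishes, one quick way to confirm which ToolKit release ended up on the PATH is to ask the CUDA compiler, nvcc, for its version. The snippet below is only a sketch: the helper names parse_nvcc_release and installed_cuda_release are made up for this example, and it assumes nvcc prints a "release X.Y" line in its version output, as it normally does.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return the release string (e.g. '8.0') found in nvcc's version output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc --version` if nvcc is on the PATH and return its release, else None."""
    if shutil.which("nvcc") is None:
        return None
    completed = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode the output as text (works on Python 3.6)
    )
    return parse_nvcc_release(completed.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc was not found on the PATH (or its output was not recognized).")
    else:
        print("CUDA release reported by nvcc:", release)
```

If the reported release is not the one you meant to install, an older or newer ToolKit is probably earlier on the PATH.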

Install cuDNN

Next, we'll need to grab the cuDNN (CUDA Deep Neural Network) Library from the NVIDIA site. This software provides the GPU-accelerated library functions needed by TensorFlow for building Neural Networks. As a side note, you'll have to create an account to download this software.

Again, TensorFlow is very version sensitive. At the time of this article, the correct version is cuDNN 6.0 for CUDA 8.0. When you click the Download button on the cuDNN page, select that version from the list. Do not select the most recent version!

This software comes as a plain zip file, which makes the installation both simpler and more complex. Simpler, because you can put the software wherever you want; more complex, because if it lands somewhere that is not on the PATH environment variable, you'll need to add that location to the PATH yourself.

Typically, I place the cuDNN directory adjacent to the CUDA directory inside the NVIDIA GPU Computing Toolkit directory (C:\Program Files\NVIDIA GPU Computing Toolkit\cudnn_8.0_6.0). This makes it easy to swap out the cuDNN software or the CUDA software as needed, but it does require you to add the cuDNN directory to the PATH environment variable.

Optionally, you can also unzip the cuDNN software directly into the CUDA directory (default location: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0). This has the advantage of not having to change the PATH variable, but it does co-mingle the CUDA and cuDNN software into a single environment, making upgrades more difficult in the future.

Environment Variables

Once you've got the CUDA and cuDNN software installed, you'll want to check the environment variables to make sure everything is in order. The CUDA ToolKit installer should have added several new environment variables (such as CUDA_PATH), as well as adding the CUDA path to the PATH variable. For the cuDNN software, if you've unzipped it directly into the CUDA directory, then no additional changes are required. If you've placed it adjacent to the CUDA directory, then you'll need to add the cuDNN path to the PATH variable.
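If you'd rather check this from Python than click through the Windows dialogs, something like the following sketch can help. It looks for the two DLLs that I'd expect CUDA 8.0 and cuDNN 6.0 to provide on Windows (cudart64_80.dll and cudnn64_6.dll; treat those file names as assumptions if you are on other versions), and the helper find_on_path is just a name invented for this example.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the first directory in a PATH-style string that contains filename, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


if __name__ == "__main__":
    print("CUDA_PATH =", os.environ.get("CUDA_PATH", "<not set>"))
    # Assumed DLL names for CUDA 8.0 and cuDNN 6.0 on Windows.
    for dll in ("cudart64_80.dll", "cudnn64_6.dll"):
        location = find_on_path(dll)
        print(dll, "->", location if location else "NOT FOUND on PATH")
```

Run it from a freshly opened terminal, since changes to environment variables are not picked up by windows that were already open.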

There are plenty of videos out there on how to edit environment variables, so I won't explain it here. Please check out some videos on YouTube for directions.

Install Anaconda Python

Since TensorFlow requires Python, it seems wise to use a distribution like Anaconda, which includes a package management system called conda.

Download and install the Python 3.6 version from Anaconda.

Create Conda Environment

Now comes the fun part. We have an installation of Anaconda Python, but we'll want to create an environment that is specific to TensorFlow. Open up a terminal window (personally, I quite like ConEmu) and enter the following command:

conda create -n tensorflow python=3.6

Install TensorFlow-GPU

Next up, we'll want to activate that tensorflow Python environment so we can add the TensorFlow software using the pip installer.

activate tensorflow
pip install --ignore-installed --upgrade tensorflow-gpu
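Since pip installs into whichever environment happens to be active, it can be worth confirming that you really are inside the tensorflow environment. Here is a minimal sketch, assuming conda has set the CONDA_DEFAULT_ENV variable during activation; the function names are hypothetical and only exist for this example.

```python
import os
import sys


def describe_environment(environ=None, version_info=None):
    """Return (active conda environment name or None, 'major.minor' Python version)."""
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    # conda sets CONDA_DEFAULT_ENV when an environment is activated
    env_name = environ.get("CONDA_DEFAULT_ENV")
    return env_name, "{}.{}".format(version_info[0], version_info[1])


def is_expected_environment(expected_name="tensorflow", expected_python="3.6",
                            environ=None, version_info=None):
    """True if both the environment name and the Python version match what we created."""
    env_name, python_version = describe_environment(environ, version_info)
    return env_name == expected_name and python_version == expected_python


if __name__ == "__main__":
    name, version = describe_environment()
    print("Active conda environment:", name)
    print("Python version:", version)
    print("Matches the environment from this article:", is_expected_environment())
```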

Test TensorFlow-GPU

Congratulations! You've got TensorFlow installed! Probably! Wait, what?!

TensorFlow can be a bit tricky, so we're going to have to test the installation to see if everything went fine. Fortunately, someone has created an easy-to-use Python script that does just that!

Head on over to mrry's GitHub account and download the tensorflow_self_check.py. Once it's downloaded, run the following command in the terminal window where you've activated the tensorflow environment:

python tensorflow_self_check.py

If everything goes well and your installation was successful, you'll see this message:

TensorFlow successfully installed.
The installed version of TensorFlow includes GPU support.
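If you'd like a second opinion that doesn't depend on the downloaded script, the sketch below does a much cruder check by hand. It is not a replacement for tensorflow_self_check.py: it only tests whether the tensorflow package can be found in the current environment and, if so, lists the devices TensorFlow reports through device_lib.list_local_devices(). The helper module_available is a name made up for this example.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if not module_available("tensorflow"):
        print("tensorflow is not importable from this environment.")
    else:
        import tensorflow as tf
        from tensorflow.python.client import device_lib

        print("TensorFlow version:", tf.__version__)
        # Each entry has a name such as /cpu:0 or /gpu:0 and a device_type such as CPU or GPU.
        for device in device_lib.list_local_devices():
            print(device.device_type, device.name)
```

If only a CPU device shows up, the GPU build is most likely not finding the CUDA or cuDNN libraries.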

Huzzah! Okay, now let's get down to business and run some code. Take the code snippet below and copy it into a file named tensorflow_test.py.

import sys
import tensorflow as tf
from datetime import datetime

### argv[1] = type of device: "gpu" selects /gpu:0, anything else selects /cpu:0
### argv[2] = size of the (square) matrix to operate on

device_name = sys.argv[1]
shape = (int(sys.argv[2]), int(sys.argv[2]))
if device_name == "gpu":
    device_name = "/gpu:0"
else:
    device_name = "/cpu:0"

### Build the graph on the chosen device: random matrix, times its transpose, summed
with tf.device(device_name):
    random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

### Run the graph; note that the timer also covers session start-up
startTime = datetime.now()
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)

### Print the shape, device name and timing
print("\n" * 3)
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)
print("\n" * 3)

Now run that code (in the terminal window where you previously activated the tensorflow Anaconda Python environment):

python tensorflow_test.py gpu 10000

You'll get a lot of output, but at the bottom, if everything went well, you should have some lines that look like this:

Shape: (10000, 10000) Device: /gpu:0
Time taken: 0:00:01.933932
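One caveat: the script reads sys.argv directly, so it fails with an IndexError if you forget an argument. If that bothers you, a more defensive front end could look like the sketch below. This is my own variation using argparse, not part of the script above, and it is stricter: it rejects anything other than cpu or gpu instead of silently falling back to the CPU. The function name parse_args is just a choice for this example.

```python
import argparse


def parse_args(argv=None):
    """Parse 'device size' arguments and return (device_name, shape)."""
    parser = argparse.ArgumentParser(description="Time a matrix multiplication on CPU or GPU.")
    parser.add_argument("device", choices=["cpu", "gpu"], help="which device to run on")
    parser.add_argument("size", type=int, help="size of the square matrix")
    args = parser.parse_args(argv)
    if args.size <= 0:
        parser.error("size must be a positive integer")
    # Same device strings as the test script uses
    device_name = "/gpu:0" if args.device == "gpu" else "/cpu:0"
    return device_name, (args.size, args.size)


# Inside tensorflow_test.py this would replace the sys.argv handling:
#     device_name, shape = parse_args()
```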

For comparison, you can change the execution of the tensorflow_test.py script to use your CPU, which should be several times slower:

python tensorflow_test.py cpu 10000
...
Shape: (10000, 10000) Device: /cpu:0
Time taken: 0:00:12.856383
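To put a number on "several times slower", you can divide the two timings. The sketch below does that for the two "Time taken" values shown above; parse_elapsed and speedup are helper names invented for this example, and your own timings will of course differ.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Parse an 'H:MM:SS.ffffff' string, as printed for a timedelta, back into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def speedup(slow, fast):
    """Return how many times longer the slow run took than the fast run."""
    return parse_elapsed(slow) / parse_elapsed(fast)


if __name__ == "__main__":
    cpu_time = "0:00:12.856383"  # CPU run from this article
    gpu_time = "0:00:01.933932"  # GPU run from this article
    print("GPU speedup: {:.1f}x".format(speedup(cpu_time, gpu_time)))
```

For the runs in this article that works out to a speedup of roughly 6.6x.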

Install Jupyter

The finish line is almost in sight! Next up is to install Jupyter. You might say 'This is unnecessary! I've already got Jupyter in my Anaconda installation!' ... well, you'd be wrong. Jupyter needs to be installed inside your Anaconda TensorFlow environment for everything to work properly. If you skip this step and attempt to use the version of Jupyter that was installed with base Anaconda, you'll get an error when you attempt to import TensorFlow in the Python script.

In the same terminal window in which you activated the tensorflow Python environment, run the following command:

pip install --ignore-installed --upgrade jupyter

Test TensorFlow-GPU on Jupyter

And finally, we test using the Jupyter Notebook ... In the same terminal window in which you activated the tensorflow Python environment, run the following command:

jupyter notebook

A browser window should now have opened up. Click the New button on the right hand side of the screen and select Python 3 from the drop down. You have just created a new Jupyter Notebook.
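Before going any further, it doesn't hurt to confirm that the notebook is really running on the Python from the tensorflow environment and not the base Anaconda one. The cell below is a rough sketch: it assumes the default Anaconda layout, where environments live under an envs directory inside the installation folder, and executable_in_env is a helper name made up for this example.

```python
import sys
from pathlib import PureWindowsPath


def executable_in_env(executable, env_name="tensorflow"):
    """True if the interpreter path sits under an 'envs\\<env_name>' directory."""
    parts = [part.lower() for part in PureWindowsPath(executable).parts]
    for index in range(len(parts) - 1):
        if parts[index] == "envs" and parts[index + 1] == env_name.lower():
            return True
    return False


print(sys.executable)
print("Running inside the tensorflow environment:", executable_in_env(sys.executable))
```

If the second line prints False, the notebook was most likely started from the base installation, which is exactly the situation described in the previous section.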

We'll use (mostly) the same bit of code to test Jupyter/TensorFlow-GPU that we used on the command line. Take the following snippet of code, copy it into the textbox (aka cell) on the page, and then press Shift-Enter.

import tensorflow as tf
from datetime import datetime

### Device and matrix size are hard-coded here instead of read from the command line
device_name = "/gpu:0"
shape = (10000, 10000)

with tf.device(device_name):
    random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

### Run the graph; note that the timer also covers session start-up
startTime = datetime.now()
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)

print("\n" * 2)
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)
print("\n" * 2)

The output should be something like this:

Shape: (10000, 10000) Device: /gpu:0
Time taken: 0:00:01.678044

Congratulations, you've got a working installation of TensorFlow with GPU on Windows 10 and running in a Jupyter Notebook.

How to run TensorFlow with GPU on Windows 10 in a Jupyter Notebook


About James Conner

