System setup for deep learning on Ubuntu 16.04.3 with a single NVIDIA GeForce GTX 1080.
Installing the NVIDIA graphics driver is usually a headache! However, you can often skip it (though not always).
Strictly speaking, you don't need to install an NVIDIA graphics driver separately, because the CUDA Toolkit ships with a bundled GPU driver: the driver is installed together with CUDA as a by-product. Note that this only applies if you install CUDA from the .run file, which lets you choose whether or not to install the driver as well.
However, you need to make sure that the driver bundled with the CUDA Toolkit actually supports your graphics card. It is still recommended to install the latest standalone driver separately, as the driver bundled with CUDA is usually out of date.
There are three ways to install NVIDIA proprietary drivers.
- Download from the official NVIDIA website and follow their installation instructions. (Risky and may need manual tweaking.)
- Use a PPA repository (see Install Nvidia Drivers from PPA).
- (Recommended) Use Additional Drivers under Software & Updates. In Ubuntu, press the Win key, search for Additional Drivers, and select the recommended NVIDIA proprietary driver from the official Ubuntu package repository.
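If you prefer the command line to the GUI, the ubuntu-drivers tool (package ubuntu-drivers-common) reports the same recommendation. A sketch of extracting the recommended package name from its output; the sample text below is a stand-in, since the exact wording of a real `ubuntu-drivers devices` run may differ:

```shell
# Stand-in for real `ubuntu-drivers devices` output; the wording of the
# real tool's output may differ from this sample.
sample='driver   : nvidia-384 - third-party free recommended
driver   : nvidia-375 - third-party free
driver   : xserver-xorg-video-nouveau - distro free builtin'

# Pick the package name from the line tagged "recommended".
recommended=$(printf '%s\n' "$sample" | awk '/recommended/ {print $3}')
echo "recommended driver: $recommended"
```

You could then pass the extracted name to `sudo apt-get install`.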
When the driver is loaded, the driver version can be found by executing the command:
$ cat /proc/driver/nvidia/version

Run the NVIDIA System Management Interface tool to query the status of all your GPUs:
$ nvidia-smi

or use the following command to watch the results in real time:
$ sudo watch nvidia-smi

You can check the status of driver modules loaded in the Linux kernel:
$ lsmod | grep nvidia # there should be output
$ lsmod | grep nouveau # there should be no output

Go to the NVIDIA website and download the CUDA Debian installer (.deb).
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda

In order to automatically set all the environment variables every time you open your terminal, put the following exports into your ~/.bashrc file:
export CUDA_HOME=/usr/local/cuda-8.0
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

In order for the above to take effect, source your ~/.bashrc or re-open your terminal:
$ source ~/.bashrc

The version of the CUDA Toolkit can be checked by running:
$ nvcc -V

Verify the CUDA installation by running the examples:
$ cuda-install-samples-8.0.sh ~
$ cd NVIDIA_CUDA-8.0_Samples/
$ make

NOTE: If you hit the compiler error cannot find -lnvcuvid, or skipped incompatible libnvcuvid.so, you need to change the sample makefiles to be consistent with the NVIDIA driver you are using (e.g. nvidia-375). A lazy fix would be to run:
$ find . -type f -execdir sed -i 's/UBUNTU_PKG_NAME = "nvidia-367"/UBUNTU_PKG_NAME = "nvidia-375"/g' '{}' \;

Run the examples:
$ cd bin/x86_64/linux/release/
$ ./deviceQuery

At the end of the printout, you will see:
Result = PASS

Download the respective packages from the NVIDIA website as below and follow the corresponding instructions. You need to sign up for an NVIDIA account to download the files.
- Install from tarball (recommended)
$ tar -xzvf cudnn-8.0-linux-x64-v7.tgz

Copy the following files into the CUDA Toolkit directory:
$ sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
$ sudo ldconfig

Adding -P retains the symbolic links and avoids the error /sbin/ldconfig.real: /usr/local/cuda/lib64/libcudnn.so.7 is not a symbolic link.
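To see what -P actually does, here is a self-contained sketch with dummy files in a throwaway directory standing in for the cuDNN tarball contents (no CUDA installation needed):

```shell
# Dummy files in a throwaway directory stand in for the cuDNN tarball.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
touch "$demo/src/libcudnn.so.7.0.1"                 # the actual library
ln -s libcudnn.so.7.0.1 "$demo/src/libcudnn.so.7"   # version symlink
ln -s libcudnn.so.7     "$demo/src/libcudnn.so"     # linker-name symlink

# -P copies the symlinks as symlinks instead of following them, which is
# what ldconfig expects to find.
cp -P "$demo/src/"libcudnn* "$demo/dst/"

if [ -L "$demo/dst/libcudnn.so.7" ]; then result=symlink; else result=regular; fi
echo "libcudnn.so.7 was copied as a $result file"
rm -rf "$demo"
```

Without -P, cp would follow the links and create three independent real files, and ldconfig would then complain as shown above.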
NOTE: If you install cuDNN from the tarball, you need to add the destination directory (e.g. /usr/local/cuda/lib64) to $LD_LIBRARY_PATH so that other packages can find the libraries. Do so by adding the following line to your ~/.bashrc:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

- Install from .deb
cuDNN v7.0
$ sudo dpkg -i libcudnn7_7.0.1.13-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn7-dev_7.0.1.13-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn7-doc_7.0.1.13-1+cuda8.0_amd64.deb

cuDNN v6.0
$ sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn6-doc_6.0.21-1+cuda8.0_amd64.deb

- Verify cuDNN
To verify that cuDNN is installed and running properly, compile the mnistCUDNN sample located in the /usr/src/cudnn_samples_v7 directory installed by the Debian package. (Note: the sample code is a separate .tgz download, or is installed with the cuDNN -doc .deb file.)
Copy the cuDNN sample to a writable path:
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN

If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
Test passed!

- General dependencies
$ sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
$ sudo apt-get install --no-install-recommends libboost-all-dev
$ sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

- CUDA
Refer to Install CUDA toolkit. Note: CUDA 8 is required on Ubuntu 16.04.
- BLAS
Choose one of the following:

$ sudo apt-get install libatlas-base-dev

or

$ sudo apt-get install libopenblas-dev

- Python
Install pip, the Python development package, and NumPy:
$ sudo apt-get install python-pip python-dev python-numpy

and install the following dependency packages to avoid ImportError: No module named skimage.io and google.protobuf.internal:
$ pip install scikit-image
$ pip install protobuf

- Compile Caffe using CMake (Recommended)
If you don't have git and CMake on your system, install them first:
$ sudo apt install git cmake

Download Caffe and compile:
$ git clone https://github.com/BVLC/caffe.git
$ cd caffe
$ mkdir build && cd build
$ cmake -DCUDA_USE_STATIC_CUDA_RUNTIME=OFF ..
$ make all -j8
$ make runtest -j8

NOTE: CUDA_USE_STATIC_CUDA_RUNTIME (default ON)
-- When enabled the static version of the CUDA runtime library will be used
in CUDA_LIBRARIES.
On 16.04, aarch64 has issues with a static CUDA runtime, so we need to disable CUDA_USE_STATIC_CUDA_RUNTIME.
In order for Python to find the caffe module, you need to set the environment variable PYTHONPATH by adding the following line to your ~/.bashrc; otherwise you will get ImportError: No module named caffe.
export PYTHONPATH=$HOME/caffe/python:$PYTHONPATH

- Compile Caffe using Makefile.config (make)
If you are using Makefile.config and make, you need to add the HDF5 include and library directories:
$ cp Makefile.config.example Makefile.config
$ echo "INCLUDE_DIRS += /usr/include/hdf5/serial/" >> Makefile.config
$ echo "LIBRARY_DIRS += /usr/lib/x86_64-linux-gnu/hdf5/serial/" >> Makefile.config

Then uncomment USE_CUDNN := 1 to build with cuDNN acceleration, choose which BLAS you are using (atlas or open for OpenBLAS), etc.
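Those Makefile.config edits can also be scripted with sed. A sketch, using two stand-in lines in place of the real Makefile.config.example (whose exact contents vary across Caffe versions); in a real checkout you would run the same sed commands against Makefile.config:

```shell
# Two stand-in lines in place of the real Makefile.config.example.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# USE_CUDNN := 1
BLAS := atlas
EOF

sed -i 's|^# *USE_CUDNN := 1|USE_CUDNN := 1|' "$cfg"   # enable cuDNN
sed -i 's|^BLAS := atlas|BLAS := open|' "$cfg"         # switch to OpenBLAS
cat "$cfg"
```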
$ make all -j8
$ make test -j8
# (Optional)
$ make runtest -j8

- Test AlexNet
In the build/ directory, run:
$ tools/caffe time --model=../models/bvlc_alexnet/deploy.prototxt --gpu=0

Please refer to the complete and up-to-date installation instructions on the TensorFlow website.
- Install dependencies
$ sudo apt-get install libcupti-dev

TensorFlow also requires libcudnn7, libcudnn7-dev, and libcudnn7-doc to be installed in certain directories. The headers must be located at:
/usr/include/cudnn.h

and the libraries in:
/usr/lib/x86_64-linux-gnu/libcudnn_static.a
/usr/lib/x86_64-linux-gnu/libcudnn.so.7
/usr/lib/x86_64-linux-gnu/libcudnn.so
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.0.1
/usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a

or, if you installed cuDNN v6 instead:

/usr/lib/x86_64-linux-gnu/libcudnn.so.6
/usr/lib/x86_64-linux-gnu/libcudnn_static_v6.a
/usr/lib/x86_64-linux-gnu/libcudnn.so.6.0.21
/usr/lib/x86_64-linux-gnu/libcudnn_static.a

Refer to the Install cuDNN section; installing from .deb as described there will set everything up.
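A quick sanity check for that layout can be scripted. A sketch with a hypothetical helper check_cudnn that takes a root prefix, so it can be exercised against a scratch directory; on a real system call it with an empty prefix ("") to check the absolute paths listed above:

```shell
# check_cudnn is a hypothetical helper: it takes a root prefix so the check
# can be exercised against a scratch directory; use an empty prefix ("") on
# a real system to check the absolute paths listed above.
check_cudnn() {
  root=$1
  ok=yes
  for f in usr/include/cudnn.h \
           usr/lib/x86_64-linux-gnu/libcudnn.so.7 \
           usr/lib/x86_64-linux-gnu/libcudnn_static.a; do
    if [ ! -e "$root/$f" ]; then
      echo "missing: /$f"
      ok=no
    fi
  done
  echo "cudnn check: $ok"
}

# Demonstrate against an empty scratch directory (everything is missing):
check_cudnn "$(mktemp -d)"
```

The list above covers the cuDNN v7 files; adjust the sonames if you installed v6.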
- Install TensorFlow via virtualenv for Python 2.7
Install and activate virtualenv:
$ sudo apt-get install python-pip python-dev python-virtualenv
$ virtualenv --system-site-packages ~/tensorflow

Activate the TensorFlow virtualenv environment:
$ source ~/tensorflow/bin/activate

NOTE: you must activate the virtualenv environment each time you use TensorFlow. Use the following command to deactivate the environment when you are done with it:
(tensorflow)$ deactivate

Install TensorFlow with GPU support in the virtual environment:
(tensorflow)$ pip install --upgrade tensorflow-gpu

- Verify the installation

Activate the virtualenv and run the following Python code:
(tensorflow)$ python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

The code prints "Hello, TensorFlow!"
- Uninstall TensorFlow
$ rm -r ~/tensorflow

- Common errors
ImportError: libcudnn.Version: cannot open shared object file: No such file or directory

Install the exact Version of libcudnn. TensorFlow may not support the latest versions of cuDNN.
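The error message itself names the exact soname TensorFlow was linked against, so you can extract the version to install rather than guessing. A sketch over a stand-in message (substitute the error you actually saw):

```shell
# Stand-in for the error you actually saw; substitute your own message.
err='ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory'

# Pull out the soname TensorFlow was linked against.
needed=$(printf '%s\n' "$err" | grep -o 'libcudnn\.so\.[0-9]*')
echo "install the cuDNN package that provides: $needed"
```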