
TensorFlow on Odyssey

Introduction

This page is intended to help you access or set up TensorFlow on the Odyssey cluster.

GPU support 

At the time of writing, the latest release of TensorFlow is 1.10.0. TensorFlow 1.10 requires the CUDA 9.0 runtime and cuDNN 7.0.

Also, at the time of writing, the newest Python version for which pip offers a TensorFlow wheel is Python 3.6. Support for Python 3.7 has not yet been added.
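Before creating an environment, it can help to confirm that the interpreter is one pip has a wheel for. The following is a minimal sketch, assuming Python 3.6 is the newest supported version as stated above. The helper `has_tf_wheel` and the constant `NEWEST_SUPPORTED` are illustrative names, not part of TensorFlow or pip, and the check only compares version numbers.

```python
import sys

# Assumption taken from the text above: wheels exist up to Python 3.6, not 3.7.
NEWEST_SUPPORTED = (3, 6)


def has_tf_wheel(version=None):
    """Return True if (major, minor) is not newer than NEWEST_SUPPORTED.

    Hypothetical helper: it compares version numbers only and does not query pip.
    """
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) <= NEWEST_SUPPORTED


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if has_tf_wheel():
        print("Python %d.%d: a wheel should be available" % (major, minor))
    else:
        print("Python %d.%d is newer than 3.6; create the environment with python=3.6"
              % (major, minor))
```

If the check fails, creating the conda environment with python=3.6 avoids the problem.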

The current CUDA runtime on the GPU-enabled nodes of the cluster is 9.2, so TF 1.10 should work on all GPU nodes. Please refer to our documentation on how to submit GPU jobs on the cluster.

The two recommended ways to set up TensorFlow are to install the latest version in a Python conda environment inside your user folder, or to run TensorFlow as a Singularity container.

Install in a conda environment 

At the time of writing, the latest release of TensorFlow is 1.10.0, which requires the CUDA 9.0 runtime and cuDNN 7.0. You can install your own copy by following these steps.

#1. load Anaconda, CUDA and cuDNN
>$ module load Anaconda3/5.0.1-fasrc01
>$ module load cuda/9.0-fasrc02 cudnn/7.0_cuda9.0-fasrc01

#2. create a new environment with Python 3.6 and some dependencies needed by TensorFlow
>$ conda create -n tf1.10_cuda9 python=3.6 numpy six wheel

#3. activate the conda environment
>$ source activate tf1.10_cuda9

#4. use pip to install TensorFlow
(tf1.10_cuda9)>$ pip install --upgrade tensorflow-gpu

Please note that while you can run the installation on the login nodes, you will not be able to use the software there, as the login servers have no GPUs.
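Once the environment is installed, a short check run inside a job on a GPU node can confirm that TensorFlow imports and sees a GPU. This is a minimal sketch, assuming the tf1.10_cuda9 environment is active and the cuda and cudnn modules are loaded. `tensorflow_status` is a hypothetical helper name; `tf.test.is_gpu_available()` is the TensorFlow 1.x call it relies on.

```python
def tensorflow_status():
    """Report whether TensorFlow is importable and whether it sees a GPU."""
    status = {"installed": False, "version": None, "gpu": False}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is missing from the active environment, or one of the
        # libraries it needs (for example CUDA or cuDNN) could not be loaded.
        return status
    status["installed"] = True
    status["version"] = tf.__version__
    # False on machines without a usable GPU, such as the login nodes.
    status["gpu"] = bool(tf.test.is_gpu_available())
    return status


if __name__ == "__main__":
    print(tensorflow_status())
```

On a login node the "gpu" entry is expected to be False even when the installation itself is fine.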

Running TensorFlow as a Singularity container

Alternatively, you can run TensorFlow as a Singularity container.

>$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu python myCNN.py

 

CPU version 

Install in a conda environment 

At the time of writing, the latest release of TensorFlow is 1.10.0. You can install your own copy by following these steps.

#1. load Anaconda
>$ module load Anaconda3/5.0.1-fasrc01

#2. create a new environment with Python 3.6 and some dependencies needed by TensorFlow
>$ conda create -n tf1.10 python=3.6 numpy six wheel

#3. activate the conda environment
>$ source activate tf1.10

#4. use pip to install TensorFlow
(tf1.10)>$ pip install --upgrade tensorflow

TensorFlow optimized for Intel Hardware (only available for partitions "shared" and "test") 

If you would like to work with the CPU version of TensorFlow, you should also consider the version provided by Intel. Please note that this version only works on Intel hardware, so you will be able to run it on the "shared" and "test" partitions, or on any other Intel-based priority partition your lab might have access to.

Please note that the code will not run in the "general" partition or on the login nodes, as those servers feature AMD processors.
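Because the Intel build only runs on Intel processors, it can be useful to check the CPU vendor of the node a job landed on. The sketch below parses the `vendor_id` field that Linux exposes in `/proc/cpuinfo` on x86 machines. The helper names `cpu_vendor` and `is_intel` are illustrative.

```python
def cpu_vendor(cpuinfo_text):
    """Return the first vendor_id value found in /proc/cpuinfo text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            return value.strip()
    return None


def is_intel(cpuinfo_text):
    """True when the vendor string is the one Intel CPUs report."""
    return cpu_vendor(cpuinfo_text) == "GenuineIntel"


if __name__ == "__main__":
    # Read the CPU description of the node this script is running on.
    with open("/proc/cpuinfo") as handle:
        text = handle.read()
    print("vendor:", cpu_vendor(text), "| Intel:", is_intel(text))
```

Intel processors report GenuineIntel and AMD processors report AuthenticAMD, so a job on the "shared" or "test" partition would be expected to print the former.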

At the time of writing, the latest version of TensorFlow released as a wheel by Intel is 1.9.0.

However, versions more recent than 1.6 seem to conflict with the version of the C/C++ library currently installed by default on Odyssey. You can set up version 1.6 in a conda environment by following the steps below, or you can run a more recent version as a Singularity container.

Install in a conda environment

#1. load Anaconda
>$ module load Anaconda3/5.0.1-fasrc01

#2. create a new environment with Python 3.6 and some dependencies needed by TensorFlow from the Intel channel
>$ conda create -n tf1.6_intel -c intel python=3.6 pip numpy

#3. activate the conda environment
>$ source activate tf1.6_intel

#4. use pip to install TensorFlow
(tf1.6_intel)>$ pip install --no-cache-dir https://anaconda.org/intel/tensorflow/1.6.0/download/tensorflow-1.6.0-cp36-cp36m-linux_x86_64.whl

Running TensorFlow as a Singularity container

Alternatively, you can run TensorFlow as a Singularity container.

>$ singularity exec docker://tensorflow/tensorflow:latest python myCNN.py

Or, if you want to use the Intel version:

>$ singularity exec docker://intelaipg/intel-optimized-tensorflow:latest-py3 python myCNN.py

 

... Discussion on optimization considerations coming soon...

CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at Attribution.