Python

Many research projects rely on Python and a range of Python packages/modules. This page describes how to set up a working Python environment that contains your own Python packages.

For new projects, you should in particular consider whether you want to use the 2.x or 3.x variant of Python. The two variants are not compatible, and in some cases you may have to use an older 2.7.x version of Python because some of your packages do not work with Python 3.x.
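
As a small illustration of the incompatibility, the sketch below touches two well-known differences: print is a function in Python 3, and / applied to two integers performs true division. The __future__ imports make Python 2.7 behave like Python 3 for these two cases. The helper name half is only an illustration.

```python
# Runs unchanged on Python 2.7 and 3.x thanks to the __future__ imports.
from __future__ import division, print_function

import sys


def half(n):
    """Return n / 2 using true division (Python 3 semantics)."""
    return n / 2


print("Running Python {}.{}".format(*sys.version_info[:2]))
# Without the division import, Python 2 would print 3 here instead of 3.5.
print(half(7))
```

Code written without such precautions often needs changes before it runs on the other variant, which is why the choice is best made at the start of a project.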

Python distributions

At Abacus we maintain two variants of Python and several versions of each:

  • python/2.7.9
  • python/2.7.10
  • python/2.7.11 (default)
  • python/2.7.12
  • python/2.7.13
  • python/3.4.3
  • python/3.5.1
  • python/3.5.2
  • python/3.6.0
  • python-intel/2.7.10-184913
  • python-intel/2.7.11 (default)
  • python-intel/2.7.12
  • python-intel/2.7.12.35
  • python-intel/3.5.0-185146
  • python-intel/3.5.1
  • python-intel/3.5.2
  • python-intel/3.5.2.35

The vanilla Python versions (python) include Python and a few extra packages, in particular virtualenv (see below). For further information, have a look at the official Python home page.

The Intel-optimised version of Python (python-intel) has been compiled by Intel and includes many widely used Python packages, including numpy, scipy, pandas, matplotlib, virtualenv, etc. For more information, look at the official Intel Python home page.

To use a particular version of Python, load it with module add:

testuser@fe1:~$ module add python-intel/3.5.2.35
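
After loading a module, a quick check from within Python can confirm which interpreter is actually in use. The following is a minimal sketch; the function name interpreter_summary is hypothetical and not part of any Abacus tooling.

```python
from __future__ import print_function

import sys


def interpreter_summary():
    """Return the version string and path of the running interpreter."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return version, sys.executable


version, path = interpreter_summary()
print("Python", version, "at", path)
```

If the printed version does not match the module you loaded, another Python is probably earlier in your PATH.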

Adding extra packages

In many cases you'll need extra Python packages for your project. In the following we describe two ways to install them. You should consider both and use the one most suitable for your project.

As noted above, also consider using one of the python-intel variants, as these already include many packages and may contain some of the ones you need.

Adding extra packages #1 - using pip --user

In the simple case, you need only one or a few packages, and only for yourself. In this case, use pip install --user to install the package into your own home directory as shown below: first use module add to select the right Python version, then run pip install --user.

testuser@fe1:~$ module add python-intel/3.5.2.35
testuser@fe1:~$ pip install --user Pillow
Collecting Pillow
  Downloading Pillow-4.1.0-cp35-cp35m-manylinux1_x86_64.whl (5.7MB)
    100% |████████████████████████████████| 5.7MB 204kB/s
Collecting olefile (from Pillow)
  Downloading olefile-0.44.zip (74kB)
    100% |████████████████████████████████| 81kB 8.6MB/s
Building wheels for collected packages: olefile
  Running setup.py bdist_wheel for olefile ... done
  Stored in directory: /home/testuser/.cache/pip/wheels/20/...
Successfully built olefile
Installing collected packages: olefile, Pillow
Successfully installed Pillow-4.1.0 olefile-0.44

Files are installed in your home directory (in ~/.local/).

Things to consider:

  • The packages are only available to your own user, not to anybody else.
  • If you change the Python version selected with module add, the installed packages may not work, and you may have to install them again.
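
The second point follows from where pip install --user places files. The sketch below prints the per-user directories reported by the standard site module; on Linux the site-packages path normally contains the Python major.minor version, so each Python version gets its own directory. The function name user_site_info is hypothetical, and the snippet is meant to be run outside a virtualenv.

```python
from __future__ import print_function

import site


def user_site_info():
    """Return the per-user base directory and per-user site-packages path."""
    return site.getuserbase(), site.getusersitepackages()


base, packages = user_site_info()
# Base directory for "pip install --user" (normally ~/.local on Linux).
print("User base:", base)
# Directory that Python searches for packages installed with --user.
print("User site-packages:", packages)
```

Running this once under each Python module you use shows which directory each one reads its --user packages from.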

Adding extra packages #2 - using virtualenv

virtualenv is a tool for creating isolated Python environments. In each environment you select the Python version and the Python packages needed for your project. If you keep old virtualenv environments, you can later rerun job scripts in exactly the same Python environment as when you ran them the first time.

Creating the environment

The Python files need to be placed in a directory. In the following examples we use /work/sdutest/tensor-1.2 to install our own version of TensorFlow. You should instead use a directory within one of your own project directories.

testuser@fe1:~$ module purge
testuser@fe1:~$ # tensorflow also requires the CUDA and cudnn modules
testuser@fe1:~$ module add python/3.5.2 cuda/8.0.44 cudnn/5.1
testuser@fe1:~$ virtualenv /work/sdutest/tensor-1.2
PYTHONHOME is set.  You *must* activate the virtualenv before using it
Using base prefix '/opt/sys/apps/python/3.5.2'
New python executable in /work/sdutest/tensor-1.2/bin/python3.5
Also creating executable in /work/sdutest/tensor-1.2/bin/python
Installing setuptools, pip, wheel...done.
testuser@fe1:~$ source /work/sdutest/tensor-1.2/bin/activate
(tensor-1.2) testuser@fe1:~$ # you are now inside your own Python environment

Note the line with source /work/sdutest/tensor-1.2/bin/activate. You'll need to repeat this step every time before you actually use your new Python environment.

We suggest editing the activate script to include the module purge and module add lines from above, so that the correct environment is set up every time you use it. The two lines must be added at the top of the file.

testuser@fe1:~$ nano /work/sdutest/tensor-1.2/bin/activate
# add module purge and module add ... lines at the top
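
To confirm that activation worked, you can ask Python itself whether it is running inside a virtual environment. This is a minimal sketch; the function name in_virtualenv is hypothetical. It covers both older virtualenv releases, which set sys.real_prefix, and newer releases and the standard venv module, which make sys.base_prefix differ from sys.prefix.

```python
from __future__ import print_function

import sys


def in_virtualenv():
    """Return True if the interpreter runs inside a virtual environment."""
    # Older virtualenv releases record the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and venv keep it in sys.base_prefix instead.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("Inside a virtual environment:", in_virtualenv())
print("Environment prefix:", sys.prefix)
```

When run after sourcing the activate script, the prefix should point to the directory you passed to virtualenv.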

Adding packages

After creating the environment, you can use pip install as you would if you had installed Python yourself, e.g.,

testuser@fe1:~$ source /work/sdutest/tensor-1.2/bin/activate
(tensor-1.2) testuser@fe1:~$ which pip
/work/sdutest/tensor-1.2/bin/pip
(tensor-1.2) testuser@fe1:~$ pip3 install --upgrade tensorflow-gpu
Collecting tensorflow-gpu
  Downloading tensorflow_gpu-1.1.0-cp35-cp35m-manylinux1_x86_64.whl (84.1MB)
    100% |████████████████████████████████| 84.1MB 18kB/s
Collecting protobuf>=3.2.0 (from tensorflow-gpu)
...
Installing collected packages: protobuf, numpy, werkzeug, tensorflow-gpu
Successfully installed numpy-1.12.1 protobuf-3.3.0 tensorflow-gpu-1.1.0 werkzeug-0.12.2
(tensor-1.2) testuser@fe1:~$
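
One way to check that a package was installed into the environment, rather than picked up from somewhere else, is to import it and look at the file it was loaded from. The sketch below assumes a hypothetical helper named package_location; "json" is only a stand-in from the standard library so the example runs anywhere.

```python
from __future__ import print_function

import importlib


def package_location(name):
    """Import the named package and return the file it was loaded from.

    Returns None if the package cannot be imported.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


# Replace "json" with the package you installed, e.g. "tensorflow".
print(package_location("json"))
```

For a package installed with pip inside the environment, the printed path should lie below the environment directory. A result of None means the import failed, which can also happen when a shared library the package depends on is missing.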

Using the environment

If you added the module purge and module add ... lines as described in the first step, you simply need to source the activate script every time before you start using the Python environment.

testuser@fe1:~$ source /work/sdutest/tensor-1.2/bin/activate
(tensor-1.2) testuser@fe1:~$ # you are now inside your own Python environment

Similarly, in your Slurm job scripts you should add the source line as shown below:

#! /bin/bash
#SBATCH --account sdutest_gpu     # account
#SBATCH --time 2:00:00            # max time (HH:MM:SS)

echo Running on "$(hostname)"
echo Available nodes: "$SLURM_NODELIST"
echo Slurm_submit_dir: "$SLURM_SUBMIT_DIR"
echo Start time: "$(date)"

# Load the Python environment
source /work/sdutest/tensor-1.2/bin/activate

# Start your python application
python ...

echo Done.
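
Inside the Python application itself, it can be useful to log the same job information together with the interpreter path, so that the job output records which environment was used. The following is a sketch under the assumption that the script runs inside a Slurm job; the function name job_context is hypothetical, and variables that are not set are reported as "not set".

```python
from __future__ import print_function

import os
import sys


def job_context(environ=None):
    """Collect a few Slurm variables and the interpreter path for logging."""
    environ = os.environ if environ is None else environ
    return {
        "job_id": environ.get("SLURM_JOB_ID", "not set"),
        "nodes": environ.get("SLURM_NODELIST", "not set"),
        "submit_dir": environ.get("SLURM_SUBMIT_DIR", "not set"),
        "python": sys.executable,
    }


# Print one "key: value" line per entry at the start of the application.
for key, value in sorted(job_context().items()):
    print("{}: {}".format(key, value))
```

The python entry should point into the virtualenv directory if the activate script was sourced earlier in the job script.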