Software Installation with Conda

Conda

We use conda to manage the bioinformatics software on the BIH cluster. Conda is a package management system that is organized around channels, and one of those channels (bioconda) provides a large selection of bioinformatics software.

Conda is written in Python and builds packages from recipes, so anyone can write a recipe for software that is missing. Packages are generally pre-compiled, so conda only has to download the binaries from the channel servers.

You are in charge of managing your own software stack, but conda makes this easy. This document describes how to install conda and how to use it. There are also many online resources that you can use; please find a list at the end of the document.

Also note that some system-level software is managed through environment modules. See System-near Software Provided by HPC Administration below.

Premise

After logging into the cluster, please make sure that you run srun to open a session on a compute node, and perform the software installation there.

Installing conda

res-login-1:~$ srun --mem=5G --pty bash -i
med0127:~$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
med0127:~$ bash Miniconda3-latest-Linux-x86_64.sh -b -f -p $HOME/work/miniconda

This will install conda to $HOME/work/miniconda. This path can be changed to your liking. Please note that the $HOME folder has limited space (an exception is the subfolder $HOME/work which has no space limit).

NB: $HOME/scratch is not appropriate as files placed there will be removed automatically after 2 weeks.
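Before editing your $HOME/.bashrc, it can help to confirm that the installer really placed conda under the chosen path. The following is a minimal sketch; the helper name check_conda_install is hypothetical, and it assumes the installation created a condabin directory under the prefix, as the path used later in this document suggests.

```shell
# Hypothetical helper: check that a conda installation exists under the given prefix.
check_conda_install() {
    local prefix="$1"
    if [ -x "$prefix/condabin/conda" ]; then
        # Print the version reported by the installed conda.
        "$prefix/condabin/conda" --version
    else
        echo "no conda found under $prefix" >&2
        return 1
    fi
}

# Prefix from the installation command above; adjust it if you chose another path.
check_conda_install "$HOME/work/miniconda" || true
```

If the helper reports that no conda was found, check the path given to the installer's -p option before continuing.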

To make conda available upon login, extend and export the $PATH variable with the installation path + /condabin and add it to your $HOME/.bashrc:

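# Use the Slurm node name if it is set, otherwise fall back to the host name.
case "${SLURMD_NODENAME-${HOSTNAME}}" in
    *login-*)
        # Login node (e.g. res-login-1): do not put conda on the PATH.
        ;;
    *)
        export PATH="$HOME/work/miniconda/condabin:$PATH"
        ;;
esac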

The above code makes sure that you don't have conda available on the login nodes, where you are not allowed to start any computations.
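To see which branch a given host name takes, the pattern match can be tried in isolation. The sketch below wraps the same kind of case test in a hypothetical helper, is_login_node, and assumes that login hosts contain login- somewhere in their name (as in res-login-1 from the prompt above). Note that a pattern without the leading * would only match names that start with login-.

```shell
# Hypothetical helper: succeed if the given host name looks like a login node.
is_login_node() {
    case "$1" in
        *login-*) return 0 ;;   # e.g. res-login-1
        *)        return 1 ;;   # e.g. med0127
    esac
}

# Same lookup as in the .bashrc snippet: Slurm node name first, then host name.
node="${SLURMD_NODENAME-${HOSTNAME-}}"
if is_login_node "$node"; then
    echo "$node: login node, conda stays off the PATH"
else
    echo "$node: compute node, conda is added to the PATH"
fi
```

Running this once on a login node and once inside an srun session should print a different message in each case.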

To make bioinformatics software available, we have to add the bioconda and some other channels to the conda configuration:

med0127:~$ conda config --add channels bioconda
med0127:~$ conda config --add channels defaults
med0127:~$ conda config --add channels conda-forge

You can also add channels to your liking.
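Each conda config --add call puts the new channel at the top of the channel list, so the channel added last has the highest priority; with the commands above, conda-forge ends up first. To check the resulting order, the configured channels can be printed. This is a small sketch; the helper name show_channels is hypothetical.

```shell
# Hypothetical helper: print the configured channels in priority order.
show_channels() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    # Channels listed first have the highest priority.
    conda config --show channels
}

show_channels || true
```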

Installing software with conda

Installing packages with conda is straightforward:

med0127:~$ conda install <package>

This will install a package into the conda root environment. We will explain environments in detail in the next section.

To search for a package, e.g. to find its exact name in conda or to check whether it exists at all, issue the command:

med0127:~$ conda search <string>

By default, conda installs the latest version that is compatible with the currently installed Python version. To choose a specific version, provide it as follows:

med0127:~$ conda install <package>=<version>

Please note that new conda installations may ship with a recently updated Python version to which not all packages have been adapted yet. For example, if you find that some packages don't work after starting with or upgrading to Python 3.8, try downgrading Python to 3.7 with conda install python=3.7.

Hint

Resolving the dependency tree of an installation candidate can take a long time in conda, especially when you are installing software from an environment.yaml file. Mamba is an alternative resolver that you can use to install software into your conda environment. The time savings can be immense: an installation that took more than an hour to resolve may be resolved in seconds.

Simply run

med0127:~$ conda install mamba

With that, you can install software into your environment using the same syntax as for Conda:

med0127:~$ mamba install <package>
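For the environment.yaml case mentioned above, a minimal sketch could look as follows. The environment name example-env and the listed packages are only examples and not part of the cluster setup; the sketch assumes mamba has been installed as shown and writes the file to a temporary directory.

```shell
# Write a small, hypothetical environment file (name and packages are examples only).
workdir="$(mktemp -d)"
env_file="$workdir/environment.yaml"
cat > "$env_file" <<'EOF'
name: example-env
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.7
  - samtools
EOF

# Create the environment with mamba if it is installed; otherwise just say so.
if command -v mamba >/dev/null 2>&1; then
    mamba env create -f "$env_file"
else
    echo "mamba not found; install it first with: conda install mamba"
fi
```

The same file can also be passed to conda env create -f if mamba is not available, at the cost of a slower dependency resolution.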

Creating an environment

Conda lets you create environments, so that you can test things in a separate environment or group your software. Another common use case is to have different environments for different Python versions. Since conda is Python-based, conflicts between packages mostly come down to the Python version.

By default, conda will install packages into its root environment. Please note that software that does not depend on Python and is installed in the root environment is available in all other environments.

To create a Python 2.7 environment and activate it, issue the following commands:

med0127:~$ conda create -n py27 python=2.7
med0127:~$ source activate py27
(py27) med0127:~$

From now on, conda will install packages into the py27 environment when you issue the install command. To switch back to the root environment, simply deactivate the py27 environment:

(py27) med0127:~$ source deactivate py27
med0127:~$
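Newer conda releases (4.4 and later) recommend conda activate and conda deactivate in place of source activate and source deactivate. These commands need conda's shell integration, which can be loaded into the current bash session with conda's shell.bash hook. The following is a minimal sketch; the helper name activate_env is hypothetical, and py27 is assumed to be the environment created above.

```shell
# Hypothetical helper: load conda's shell integration and activate an environment.
activate_env() {
    local env_name="$1"
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    # Define the conda shell function for this bash session.
    eval "$(conda shell.bash hook)"
    conda activate "$env_name"
}

# Usage (assumes the py27 environment exists):
#   activate_env py27
#   conda deactivate
```

Note that conda deactivate takes no environment name; it simply leaves the currently active environment.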

Of course, as Python 2.7 is no longer supported by the Python Software Foundation, you should switch over to Python 3.


Last update: November 25, 2022