Containers Have The Potential to Take Scientific HPC to the Cloud

This post is the next in our series of guest blog articles in which experts from academia and industry provide their unique perspectives on matters relating to HPC, cloud computing and HPC in the cloud.


    The post was submitted by Carlos Eduardo Arango Gutierrez, a PhD student in the Distributed Systems and Networks Laboratory at the University of Valle.

    Carlos’ background is in computer programming, HPC, theoretical ecology, atmospheric physics modelling and environmental engineering. He is currently working on precision agriculture projects at CIBioFi, applying the Monte Carlo method to the numerical solution of the radiative transfer equation for atmospheric measurements of trace gases using differential optical absorption spectroscopy (DOAS). You can reach him at carlos.arango.gutierrez [at]

Virtualisation has gained popularity in cloud computing in recent years, and virtual machines (VMs) have become an attractive option for many users in the scientific computing community. A VM gives users the freedom to choose an operating system and a software stack. For the scientific community it also makes it possible to automate repetitive tasks and facilitates the reproducibility of research. VMs give researchers a degree of assurance that their work will run on every machine on which a supervisor, reviewer or collaborator tests it, avoiding the common complaint that code “worked on my workstation” but not on a reviewer’s or collaborator’s cluster.

Large commercial cloud providers, such as Amazon, Google and Microsoft, along with open-source platforms such as OpenStack and OpenShift, offer environments for running one or many VMs. However, this approach comes with drawbacks. These include, but are not limited to: reduced performance when running code inside a VM; increased complexity of cluster management; and the need to learn new tools and protocols to manage VM clusters. And then there is the cloud provider’s bill at the end of the month!

Recently, a new technology – containers – has entered the distributed systems space. Built on features of the Linux kernel, containers use namespaces to isolate what a process can see (process IDs, mount points, network interfaces and so on) and control groups (cgroups) to limit and account for the system resources – such as CPU and memory – that a group of processes can use.
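These two kernel mechanisms can be inspected on any Linux machine without a container runtime installed. The sketch below is illustrative only: the `/proc` and `/sys/fs/cgroup` paths are standard Linux interfaces, and the commands that require root privileges are shown as comments.

```shell
# A process's namespace memberships appear as symlinks under
# /proc/<pid>/ns; two processes share a namespace exactly when the
# symlink targets (namespace inode numbers) are identical.
ls -l /proc/self/ns

# The identity of this shell's PID namespace.
readlink /proc/self/ns/pid

# With root (or unprivileged user namespaces enabled), `unshare` from
# util-linux launches a process in fresh namespaces; inside the new PID
# namespace the shell sees itself as PID 1 and `ps` shows only its own
# children:
#   sudo unshare --pid --fork --mount-proc /bin/sh -c 'echo $$; ps -e'
#
# cgroups provide the resource-control half: writing to files under
# /sys/fs/cgroup (cgroup v2) caps what a process group may consume, e.g.
#   echo 268435456 > /sys/fs/cgroup/demo/memory.max   # 256 MiB limit
```

Container runtimes such as Docker and LXC combine exactly these primitives: namespaces for isolation, cgroups for resource limits.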


Containers have been warmly received by the enterprise and web development sectors due to the ease with which applications can be developed, distributed and deployed. A large number of companies are active in the Linux container ecosystem, including some of the larger cloud providers mentioned above; the two best-known projects are Docker and Linux Containers (LXC). Despite the advantages offered by container technology, the implications for scientific computing, including HPC, are still unclear.

Projects such as LXC and Singularity – which apply the container approach to creating portable environments – now enable the migration of computational science to the cloud. Containers are proving to be an extremely valuable technology for scientific research, delivering benefits such as portability and reproducibility to scientific users. A container can encapsulate a single application and be executed directly, without the overhead of emulating a virtual kernel and hardware. With little effort, an HPC cluster running a workload manager such as Slurm, HTCondor or Torque can deploy a containerised application with only a minor impact on performance.
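As a concrete sketch of this workflow, a containerised job submitted through Slurm might look like the script below. The image name `myapp.sif`, the executable and its arguments are hypothetical placeholders, and the exact Singularity invocation and resource requests will vary between sites.

```shell
#!/bin/bash
#SBATCH --job-name=container-demo   # name shown in the queue
#SBATCH --ntasks=4                  # number of tasks / MPI ranks
#SBATCH --time=00:30:00             # wall-clock limit

# Run the application packaged in a Singularity image. The container
# shares the host kernel, so there is no VM boot or hardware emulation
# overhead; the image is an ordinary file that can be copied between
# clusters and cloud instances and executed unchanged.
srun singularity exec myapp.sif ./my_simulation --input data.nc
```

The script is submitted with `sbatch job.sh` like any other batch job; the portability benefit described above comes from the fact that the same image file runs identically on a collaborator’s cluster or a cloud instance.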

A container-based HPC cloud could help the CloudLightning project in its objective to build a large-scale, self-organised and self-managed hyper-scale heterogeneous cloud. The system could offer a distribution- and vendor-neutral environment for the development of HPC applications, reducing development, deployment and optimisation effort.
