Compute Environment

Compute environment management is critical to any project that executes on more than one server or computer. This is true of most projects, where development is performed on a local desktop or laptop, but the simulations must be executed on a remote server or a high-performance computing (HPC) cluster. Because modsim developers may not have administrative or root permissions on the remote servers where their simulations are executed, virtual environments are used to provide consistency between servers. Some types of virtual environments may also provide cross-platform environment consistency.

There are two major components relevant to the compute environment discussion for modsim workflows: the installed software and the operating system (OS) search PATH. Most modsim developers will be familiar with installing software on their local computer. When running a simulation workflow from a command line shell, the system PATH variable tells the shell which directories to search, and in what order, to find executable software. System paths may be modified by user configuration files; however, managing and maintaining a common set of configuration files among several servers and multiple collaborators can be a difficult and error-prone task, often resulting in the common “but it works on my computer” complaint.
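
The effect of search order on executable discovery can be illustrated with a short Python sketch. It is only an illustration, not part of any modsim workflow: the executable name solver and the helper functions are hypothetical, and it assumes a POSIX-like system where an executable is found by its bare file name.

```python
import os
import shutil
import stat
import tempfile


def make_fake_executable(directory, name):
    """Create a small executable file so that a PATH search can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    # Add the user execute bit; PATH searches ignore non-executable files.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(name, search_directories):
    """Return the first match for ``name`` in an ordered list of directories."""
    search_path = os.pathsep.join(search_directories)
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        # Two different installations of the same (hypothetical) executable.
        make_fake_executable(first, "solver")
        make_fake_executable(second, "solver")
        # The directory listed first wins, so the order of PATH entries matters.
        print(resolve("solver", [first, second]))
        print(resolve("solver", [second, first]))
```

Two collaborators with the same software installed but a different PATH order can therefore run different executables from the same command.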

Instead of managing an ever-growing case or if structure in shell configuration files to handle system changes out of your control, one solution to managing a common environment is to keep a version-controlled environment file that is used to create a common virtual compute environment. Virtual environments will handle PATH manipulation, and many come with associated package managers that can install software to a common location managed by the virtual environment. The software installed by the package manager is then accessed by activating the virtual environment, which handles the OS- and system-specific PATH changes for you. This pushes the difficulties of managing a common project environment into the realm of version control, where developers only need to re-build their virtual environment when the version-controlled environment file changes. When combined with continuous integration, any commonly used servers can have the shared environment updated automatically on changes to the environment file.
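
One way to decide when a re-build is needed is to compare the environment file against a record of the last build. The following is a minimal sketch under stated assumptions, not a feature of any particular package manager: the stamp file and the function names are hypothetical, and the actual environment re-build command is left to the project.

```python
import hashlib
import pathlib


def file_digest(path):
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()


def environment_is_stale(environment_file, stamp_file):
    """Return True when the environment file differs from the last recorded build."""
    stamp = pathlib.Path(stamp_file)
    if not stamp.exists():
        # No record of a previous build, so the environment must be built.
        return True
    return stamp.read_text().strip() != file_digest(environment_file)


def record_build(environment_file, stamp_file):
    """Store the digest of the environment file after a successful re-build."""
    pathlib.Path(stamp_file).write_text(file_digest(environment_file))
```

A continuous integration job could call environment_is_stale() before running the project's re-build command and call record_build() afterwards, so that the shared environment is only re-built when the file contents actually change.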

In the WAVES project, a minimal Conda environment file is maintained under version control and used for Regression Testing, shared compute environment management, and packaging with a continuous integration server for automatic testing and deployment on every change to the project source code. A similar MODSIM-TEMPLATE environment file is included in the tutorials as a starting point for modsim repository environment management. The Anaconda documentation includes good tutorials and references for Conda environment management. Conda was chosen as the package manager and virtual environment tool due to the popularity of Python as a scripting language and the Python-centric nature of the scripting APIs for many engineering analysis software tools.

While Conda is known primarily for Python package management, it is a general package manager and can package, deploy, and manage software written in many different programming languages, including mixed-language packages. This makes it an excellent tool for developing the supporting Fortran, C, or C++ subroutines that most engineering analysis software accepts as extensions to its built-in behavior. As an engineering project grows and collaborates with material modeling and other research projects, it may be beneficial to package supporting libraries and packages for conda-forge or even a self-hosted custom conda channel.

There are several engineering numeric solvers developed by the open-source community, and some are even distributed by package managers, including Conda. However, most commercial engineering analysis software is not packaged for a package manager and instead provides licensed installation media. In this case, it may be difficult to include the commercial software in a virtual environment.

Most Build System software offers solutions to executable path management separately from the activated virtual environment and allows mixed use of software discovered on PATH and software specified by absolute path. In the SCons build system, users are encouraged to explicitly specify software without relying on PATH to avoid ambiguous system configuration requirements. Another benefit of the explicit SCons construction environment is the ability to define tasks with mutually incompatible execution environments. For instance, if one portion of the workflow requires a numeric solver with dependencies that are incompatible with the post-processing task, SCons can configure each task with its own construction environment [32]. This allows the project to maintain a single, uninterrupted workflow, despite the conflicting software requirements. This approach is demonstrated in several WAVES Supplemental Lessons using the waves.scons_extensions.shell_environment() function.
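
The idea of giving each task its own execution environment can be sketched with the Python standard library alone. This is not the SCons or WAVES API; it is a simplified illustration in which the run_task() helper, the SOLVER_HOME variable, and the installation paths are all hypothetical.

```python
import os
import subprocess
import sys

# A stand-in task that reports which (hypothetical) solver installation it sees.
REPORT_COMMAND = [sys.executable, "-c", "import os; print(os.environ['SOLVER_HOME'])"]


def run_task(command, extra_variables):
    """Run ``command`` with a private copy of the current environment."""
    # Copy the parent environment so that changes never leak between tasks.
    task_environment = dict(os.environ)
    task_environment.update(extra_variables)
    completed = subprocess.run(
        command,
        env=task_environment,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    # Each task receives a different, possibly conflicting, configuration.
    print(run_task(REPORT_COMMAND, {"SOLVER_HOME": "/opt/solver-a"}))
    print(run_task(REPORT_COMMAND, {"SOLVER_HOME": "/opt/solver-b"}))
```

Because each task works from its own copy of the environment, neither task modifies the shell that launched the workflow, which is the same isolation that separate construction environments provide inside a single build.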

A more generalized solution has been adopted by high-performance computing, which often requires the system administrators to provide multiple, conflicting versions of software side-by-side. Environment modules and the module software help users manage a project-specific environment on a multi-user compute server. For collaborative efforts and version-controlled consistency, system PATH management with a project-specific module file can help tie together system resources and a virtual environment in a common way for all contributors.