
Cori Large Memory software

Many software packages available on Cori and the other external compute nodes (e.g., the Cori GPU nodes) also work on the large memory nodes (aka cmem nodes), since these nodes share the same x86-64 instruction set. However, some packages do not work, such as those built to use CUDA on GPUs or built for a different interconnect fabric. For those cases we have built separate cmem-specific versions, and we distinguish their module names from the existing ones by appending -cmem to the regular names.

If a package provides some module files with the trailing -cmem and others without, use one that ends with -cmem. For example, all the available llvm versions are shown below, but only llvm-cmem/10.0.1 works on the large memory nodes.

To make the modules for the large memory nodes available, first load the cmem module:
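
cmem$ module load cmem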

cmem$ module avail llvm
--------- /global/common/software/nersc/osuse15_cmem/extra_modulefiles ---------
llvm-cmem/10.0.1(default)

------------- /global/common/software/nersc/cle7/extra_modulefiles -------------
llvm/8.0.1
llvm/9.0.0-git_20190220_cuda_10.1.168(default)
llvm/9.0.1
llvm/10.0.0
llvm/10.0.0-git_20190828
llvm/11.0.0-git_20200409
llvm_openmp_debug/9.0.0-git_20190220_cuda_10.1.168(default)
llvm_openmp_debug/10.0.0-git_20190828
llvm_openmp_debug/11.0.0-git_20200409

Compilers

There are several base compilers available.

CCE (Cray Compiler)

cmem$ module load PrgEnv-cray
cmem$ export CRAY_CPU_TARGET=x86-64   # set the CPU target to x86-64   

GCC

cmem$ module load gcc   

For better performance, it is suggested to add the processor-specific compiler option -march=znver1.
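
For example, a build of a hypothetical source file mycode.c might look like:

cmem$ gcc -O2 -march=znver1 -o mycode.x mycode.c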

Intel

cmem$ module load intel

For better performance, you can add the processor-specific compiler option -xHOST; make sure that you compile directly on a cmem node so the option targets the right processor.
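
For example, compiling a hypothetical mycode.c on a cmem node:

cmem$ icc -O2 -xHOST -o mycode.x mycode.c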

LLVM

cmem$ module load cmem
cmem$ module load llvm-cmem

PGI

cmem$ module load pgi

For better performance, it is suggested to add the processor-specific compiler option -tp=zen.
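
For example, with a hypothetical mycode.c:

cmem$ pgcc -O2 -tp=zen -o mycode.x mycode.c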

MPI

Open MPI

Open MPI is provided for the GCC, HPC SDK (formerly PGI), Intel, and CCE compilers, via the openmpi-cmem/4.0.3 and openmpi-cmem/4.0.5 modules. Users must use these cmem-specific openmpi-cmem modules on the large memory nodes.

One must first load a compiler module before loading an openmpi-cmem module, e.g.,

cmem$ module load cmem
cmem$ module load gcc
cmem$ module load openmpi-cmem

After the openmpi-cmem module is loaded, the MPI compiler wrappers will be available as mpicc, mpic++, and mpif90.
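
For example, to compile a hypothetical MPI source file hello.c and run it with 2 ranks from within a Slurm allocation on the large memory nodes:

cmem$ mpicc -o hello.x hello.c
cmem$ srun -n 2 ./hello.x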

Python

Python use on the largemem nodes is largely the same as on Cori. You can find general information about using Python at NERSC in the NERSC Python documentation.

Two main differences Python users should be aware of on the largemem nodes are mpi4py (documented below) and performance issues in libraries that use Intel's MKL, such as NumPy, SciPy, and scikit-learn. More information about improving MKL performance on AMD CPUs is available in the NERSC Python documentation. Alternatively, in NumPy for example, you can use OpenBLAS instead of MKL by installing NumPy from the conda-forge channel, or you can use the nomkl package.
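
For example, one way to get an OpenBLAS-backed NumPy is to install it from conda-forge into your own conda environment (the environment name and package list below are just examples):

cmem$ module load python
cmem$ conda create -n myenv -c conda-forge -y python=3.8 numpy scipy
cmem$ source activate myenv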

Using Python mpi4py

Using Python's mpi4py on the Large Memory nodes requires an mpi4py built with Open MPI. This means that the mpi4py in our default Python module will not work on these nodes, and that any custom conda environment built with Cray MPICH (following our standard build recipe) will not work on the Large Memory nodes either.

We provide two options for users:

1) Build mpi4py against Open MPI in your own conda environment:

# switch to the GNU compiler environment and load Open MPI for the large memory nodes
module load cmem
module load python
module swap PrgEnv-intel PrgEnv-gnu
module load openmpi-cmem

# create and activate a conda environment
conda create -n mylargemem python=3.8 -y
source activate mylargemem

# download and build mpi4py against the loaded Open MPI
cd $HOME
wget https://bitbucket.org/mpi4py/mpi4py/downloads/mpi4py-3.0.3.tar.gz
tar zxvf mpi4py-3.0.3.tar.gz
cd mpi4py-3.0.3
python setup.py build
python setup.py install

OR

2) Start with our pre-built mpi4py for the Large Memory nodes by cloning an environment:

module load python
conda create -n mylargemem --clone lazy-mpi4py-amd
source activate mylargemem
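
The run examples below assume a minimal mpi4py test script named hello_world.py; a sketch of such a script is:

cmem$ cat > hello_world.py << 'EOF'
from mpi4py import MPI
comm = MPI.COMM_WORLD
print("Hello from rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
EOF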

To run with Slurm:

srun -n 2 python hello_world.py

To run with Open MPI's mpirun:

module load cmem
module load openmpi-cmem
mpirun -n 2 python hello_world.py

Q-Chem

cmem$ module load qchem-cmem

VASP

cmem$ module load vasp-cmem