GPU Support

STRUMPACK currently supports GPU acceleration only in the sparse direct solver. None of the preconditioners or rank-structured solvers support GPU acceleration yet.

The sparse direct solver performs most of its computations using BLAS and LAPACK, and in the distributed memory setting also ScaLAPACK. For the BLAS and LAPACK operations, we use CUDA, cuBLAS and cuSOLVER for acceleration on NVIDIA GPUs. As a ScaLAPACK alternative with GPU off-loading capabilities, we use SLATE:
https://github.com/icl-utk-edu/slate

See the Installation and Requirements page for instructions on how to build STRUMPACK with support for CUDA, and optionally SLATE.


CUDA Support for NVIDIA GPUs

If CUDA is detected when STRUMPACK is built, CUDA support is enabled by default, and the sparse direct solver then uses GPU acceleration by default. The GPU-accelerated code has the same interface as the CPU code, meaning the input and output data are always expected to reside in host memory. GPU acceleration can still be enabled or disabled using the command line arguments:

--sp_enable_gpu
--sp_disable_gpu

or in the code:

void enable_gpu()
void disable_gpu()

Both are declared in StrumpackOptions.hpp.

The number of GPU streams per MPI rank can be set with:

--sp_gpu_streams (default 4)

or in the code:

void set_gpu_streams(int s)
int gpu_streams() const

Both are declared in StrumpackOptions.hpp.
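
To make the relation between the command line arguments and the setter functions concrete, here is a minimal, self-contained sketch. The class GpuOptions is a hypothetical stand-in that only mirrors the four member functions listed above; it is not STRUMPACK's options class, and the use_gpu() getter, the set_from_args() parser and the example() helper are names invented for this illustration. It also assumes that the stream count is passed as a separate argument after --sp_gpu_streams. The defaults (GPU enabled, 4 streams) are the ones documented above for a build with CUDA support.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in mirroring only the GPU-related interface listed
// above. This is NOT the options class from StrumpackOptions.hpp.
class GpuOptions {
public:
  void enable_gpu() { use_gpu_ = true; }
  void disable_gpu() { use_gpu_ = false; }
  bool use_gpu() const { return use_gpu_; }  // hypothetical getter
  void set_gpu_streams(int s) { gpu_streams_ = s; }
  int gpu_streams() const { return gpu_streams_; }

  // Simplified parser for the three documented flags. Assumes the stream
  // count follows --sp_gpu_streams as a separate argument. Unknown
  // arguments are ignored.
  void set_from_args(int argc, const char* const* argv) {
    for (int i = 1; i < argc; i++) {
      if (std::strcmp(argv[i], "--sp_enable_gpu") == 0)
        enable_gpu();
      else if (std::strcmp(argv[i], "--sp_disable_gpu") == 0)
        disable_gpu();
      else if (std::strcmp(argv[i], "--sp_gpu_streams") == 0 && i + 1 < argc)
        set_gpu_streams(std::atoi(argv[++i]));
    }
  }

private:
  bool use_gpu_ = true;  // documented default when built with CUDA support
  int gpu_streams_ = 4;  // documented default
};

// Usage sketch: request 8 streams, then turn GPU off-loading off again.
inline GpuOptions example() {
  const char* argv[] = {"prog", "--sp_gpu_streams", "8", "--sp_disable_gpu"};
  GpuOptions opts;
  opts.set_from_args(4, argv);
  return opts;
}
```

In this sketch a later flag overrides an earlier one, so the order of the arguments matters; whether STRUMPACK's own parser behaves the same way is not stated here and should be checked against the library.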

HIP Support for AMD GPUs

There is also support for AMD GPUs through HIP and the ROCm libraries, rocSOLVER and hipBLAS. Support for HIP can be enabled through the CMake build using:

> cmake ../ -DSTRUMPACK_USE_HIP=ON -DHIP_HIPCC_FLAGS=--amdgpu-target=gfx906

where gfx906 is an example and should be replaced with the architecture of the target GPU. If the CMake build system detects CUDA support, then HIP will be disabled.
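
The precedence of CUDA over HIP can be mirrored in application code with a compile-time check. The sketch below is an assumption-laden illustration: it supposes that the build exposes its choice through preprocessor macros named STRUMPACK_USE_CUDA and STRUMPACK_USE_HIP (the latter matching the CMake option above), which should be verified against the configuration header generated by your build. The function name gpu_backend() is hypothetical.

```cpp
#include <string>

// Assumption: the configured build defines STRUMPACK_USE_CUDA and/or
// STRUMPACK_USE_HIP as preprocessor macros. Verify the actual macro names
// in the configuration header of your installation.
inline std::string gpu_backend() {
#if defined(STRUMPACK_USE_CUDA)
  return "CUDA";  // checked first, mirroring "CUDA detected => HIP disabled"
#elif defined(STRUMPACK_USE_HIP)
  return "HIP";
#else
  return "none";  // CPU-only build, or macros not visible in this file
#endif
}
```

Compiled on its own, without any of these macros defined, the function reports "none".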