Loading...
Searching...
No Matches
GPU Support

STRUMPACK currently only supports for GPU acceleration in the sparse direct solver. None of the preconditioners or rank structured solvers currently support GPU acceleration.

The sparse direct solver performs most of its computations using BLAS and LAPACK, and in the distributed memory setting also ScaLAPACK. For the BLAS and LAPACK operations, we use CUDA, cuBLAS and cuSOLVER for acceleration using NVIDIA GPUs. As a ScaLAPACK alternative with GPU off-loading capabilities we use SLATE:
https://github.com/icl-utk-edu/slate

See the Installation and Requirements page for instructions on how to build STRUMPACK with support for CUDA, and optionally SLATE.


CUDA Support for NVidia GPUs

CUDA support will be enabled by default, if it can be detected. The GPU accelerated code has the same interfaces as the CPU code, meaning the input and output data is always expected to reside in host memory. If STRUMPACK is compiled with CUDA support, GPU acceleration will be enabled by default in the sparse direct solver. GPU acceleration can still be enabled/disabled using the command line arguments:

--sp_enable_gpu
--sp_disable_gpu

or in the code:

void disable_gpu()
Definition StrumpackOptions.hpp:750
void enable_gpu()
Definition StrumpackOptions.hpp:745

The number of GPU streams per MPI rank can be set with:

--sp_gpu_streams (default 4)

or in the code:

void set_gpu_streams(int s)
Definition StrumpackOptions.hpp:776
int gpu_streams() const
Definition StrumpackOptions.hpp:1192

HIP Support for AMD GPUs

There is also support for AMD GPUs through HIP and the ROCm libraries, rocSOLVER and hipBLAS. Support for HIP can be enabled through the CMake build using:

> export HIP_DIR=....
> cmake ../ -DSTRUMPACK_USE_HIP=ON -DCMAKE_HIP_ARCHITECTURES=gfx90a -DCMAKE_CXX_COMPILER=hipcc

where the user can specify the specific GPU architecture. If the CMake build system detects CUDA support, then HIP will be disabled.