STRUMPACK currently supports GPU acceleration only in the sparse direct solver. None of the preconditioners or rank-structured solvers currently support GPU acceleration.

The sparse direct solver performs most of its computations using BLAS and LAPACK, and in the distributed memory setting also ScaLAPACK. For the BLAS and LAPACK operations, we use CUDA, cuBLAS and cuSOLVER for acceleration using NVIDIA GPUs. As a ScaLAPACK alternative with GPU off-loading capabilities we use SLATE:
https://github.com/icl-utk-edu/slate

See the Installation and Requirements page for instructions on how to build STRUMPACK with support for CUDA, and optionally SLATE.

CUDA Support for NVIDIA GPUs

CUDA support is enabled by default in the build if CUDA can be detected. If STRUMPACK is compiled with CUDA support, GPU acceleration is enabled by default in the sparse direct solver. The GPU-accelerated code has the same interface as the CPU code, meaning the input and output data are always expected to reside in host memory. GPU acceleration can still be enabled or disabled using the command line arguments:

--sp_enable_gpu

--sp_disable_gpu

or in the code:

void strumpack::SPOptions::enable_gpu();

void strumpack::SPOptions::disable_gpu();
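
As a rough illustration of how these two calls might be used, the sketch below switches GPU use on or off from a single boolean, for example to compare a CPU-only run against a GPU run. So that it compiles without STRUMPACK installed, it uses a hypothetical stand-in type, `GpuToggleStandIn`; only the method names `enable_gpu()` and `disable_gpu()` come from the interface above, while the `gpu_on` flag and the helper `select_gpu` are assumptions made for this example. With the real library, the same helper would presumably be given an `SPOptions` object instead.

```cpp
#include <cassert>

// Hypothetical stand-in for the options object, so this sketch compiles
// without STRUMPACK. Only enable_gpu()/disable_gpu() mirror the interface
// described above; the gpu_on flag is illustrative.
struct GpuToggleStandIn {
  bool gpu_on = true;  // mirrors the documented default (GPU enabled)
  void enable_gpu() { gpu_on = true; }
  void disable_gpu() { gpu_on = false; }
};

// Hypothetical helper: works with any type that exposes
// enable_gpu() and disable_gpu().
template<typename Options>
void select_gpu(Options& opts, bool use_gpu) {
  if (use_gpu) opts.enable_gpu();
  else opts.disable_gpu();
}

// Usage: force a CPU-only run, then switch the GPU back on.
inline void toggle_example() {
  GpuToggleStandIn opts;
  select_gpu(opts, false);
  assert(!opts.gpu_on);
  select_gpu(opts, true);
  assert(opts.gpu_on);
}
```

Because the input and output data stay in host memory either way, switching between the two modes should not require any other change to the calling code.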

The number of GPU streams per MPI rank can be set with:

--sp_gpu_streams (default 4)

or in the code:

void strumpack::SPOptions::set_gpu_streams(int s);

int strumpack::SPOptions::gpu_streams() const;
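
The setter and getter above can be combined as in the following sketch, which sets the stream count and reads it back. The type `StreamOptionsStandIn` is a hypothetical stand-in so the snippet compiles without STRUMPACK; only the names `set_gpu_streams(int)` and `gpu_streams()` and the default of 4 are taken from the text above. The helper `set_streams_checked`, and its choice to ignore requests below 1, are assumptions of this example and not documented library behavior.

```cpp
#include <cassert>

// Hypothetical stand-in for the options object, so this sketch compiles
// without STRUMPACK. The method names and the default of 4 streams follow
// the text above; everything else is illustrative.
struct StreamOptionsStandIn {
  int streams = 4;  // documented default for --sp_gpu_streams
  void set_gpu_streams(int s) { streams = s; }
  int gpu_streams() const { return streams; }
};

// Hypothetical helper: apply the requested stream count only if it is
// at least 1, then return the value that is actually in effect.
template<typename Options>
int set_streams_checked(Options& opts, int requested) {
  if (requested >= 1) opts.set_gpu_streams(requested);
  return opts.gpu_streams();
}

// Usage: lower the stream count to 2; a request of 0 is ignored.
inline void streams_example() {
  StreamOptionsStandIn opts;
  assert(set_streams_checked(opts, 2) == 2);
  assert(set_streams_checked(opts, 0) == 2);
}
```

Since the stream count applies per MPI rank, a value chosen this way would be set on each rank separately.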

HIP Support for AMD GPUs

There is also support for AMD GPUs through HIP and the ROCm libraries, rocSOLVER and hipBLAS. Support for HIP can be enabled through the CMake build using:

> export HIP_DIR=....

> cmake ../ -DSTRUMPACK_USE_HIP=ON -DCMAKE_HIP_ARCHITECTURES=gfx90a -DCMAKE_CXX_COMPILER=hipcc

where CMAKE_HIP_ARCHITECTURES specifies the target GPU architecture (gfx90a in this example). If the CMake build system detects CUDA support, then HIP will be disabled.