BLR Preconditioning

Apart from hierarchically semi-separable rank-structured matrices, the sparse multifrontal solver can also use Block Low-Rank (BLR) matrices to compress the fill-in. In the multifrontal method, computations are performed on dense matrices called frontal matrices. A frontal matrix can be approximated as a BLR matrix, but this will only be beneficial (compared to storing the frontal as a standard dense matrix and operating on it with BLAS/LAPACK routines) if the frontal matrix is large enough.

Rank-structured compression is not used by default in the STRUMPACK sparse solver (the default is to perform exact LU factorization), but BLR compression can be turned on/off via the command line:

--sp_compression blr
--sp_compression none

or via the C++ API as follows

void strumpack::SPOptions::set_compression(CompressionType::BLR);
void strumpack::SPOptions::set_compression(CompressionType::NONE);
CompressionType strumpack::SPOptions::compression() const; // get compression type (none/hss/blr/hodlr)
CompressionType compression() const
Definition: StrumpackOptions.hpp:910
void set_compression(CompressionType c)
Definition: StrumpackOptions.hpp:546
CompressionType
Definition: StrumpackOptions.hpp:91

When BLR compression is enabled, the default STRUMPACK behavior is to use the BLR enabled approximate LU factorization as a preconditioner within GMRES. This behavior can also be changed, see Solve.

The above options affect the use of BLR within the multifrontal solver. There are more, BLR specific, options which are stored in an object of type BLR::BLROptions<scalar>. An object of this type is stored in the SPOptions<scalar> object stored in the StrumpackSparseSolver. It can be accessed via the BLR_options() routine as follows:

strumpack::StrumpackSparseSolver<double> sp; // create solver object
sp.options().set_compression(CompressionType::BLR); // enable BLR compression in the multifrontal solver
sp.options().BLR_options().set_leaf_size(256); // set the BLR leaf size
SPOptions< scalar_t > & options()
SparseSolver is the main sequential or multithreaded sparse solver class.
Definition: StrumpackSparseSolver.hpp:75

The compression tolerances can greatly impact performance. They can be set using:

--blr_rel_tol real (default 0.01)
--blr_abs_tol real (default 1e-08)

or via the C++ API

real strumpack::BLR::BLROptions<scalar>::rel_tol() const; // get the current value
Class containing several options for the BLR code and data-structures.
Definition: BLROptions.hpp:82

The peak memory usage of the solver can be reduced by enabling columnwise construction (instead of DENSE):

--blr_cb_construction COLWISE
--blr_cb_construction DENSE

or in C++

sp.options().BLR_options().set_CB_construction(CBConstruction::COLWISE); // set this option through the sparse solver object sp
sp.options().BLR_options().set_CB_construction(CBConstruction::DENSE);

COLWISE will construct the BLR matrix one block column at a time, and will also compress the 22 part of a front. Hence, COLWISE uses less memory, but is slower and can be less accurate than DENSE.

BLR compression has some overhead and only pays off for sufficiently large matrices. Therefore STRUMPACK has tuning parameters to specify the minimum size a dense matrix needs to be to be considered a candidate for BLR compression. The following routines can be used to tune how many fronts are compressed, via the command line:

--sp_compression_min_sep_size int
--sp_compression_min_front_size int

or via the C++ API as follows

int compression_min_front_size(int l=0) const
Definition: StrumpackOptions.hpp:1021
int compression_min_sep_size(int l=0) const
Definition: StrumpackOptions.hpp:978
void set_compression_min_sep_size(int s)
Definition: StrumpackOptions.hpp:596
void set_compression_min_front_size(int s)
Definition: StrumpackOptions.hpp:629

The routine set_compression_min_sep_size(int s) refers to the size of the top-left sub-block of a front only. This top-left block is the part that corresponds to a separator, as given by the nested dissection reordering algorithm. This top-left block is also referred to as the block containing the fully-summed variable. Factorization is only applied to this top-left block. Tuning the value for the minimum separator size can have a big impact on performance and memory usage!

Here is a list of all command line options:

# BLR Options:
# --blr_rel_tol real_t (default 0.0001)
# --blr_abs_tol real_t (default 1e-10)
# --blr_leaf_size int (default 256)
# --blr_max_rank int (default 5000)
# --blr_low_rank_algorithm (default RRQR)
# should be [RRQR|ACA|BACA]
# --blr_admissibility (default weak)
# should be one of [weak|strong]
# --blr_factor_algorithm (default Star)
# should be [RL|LL|Comb|Star]
# --blr_compression_kernel (default half)
# should be [full|half]
# --blr_cb (default DENSE)
# should be [COLWISE|DENSE]
# --blr_BACA_blocksize int (default 4)
# --blr_verbose or -v (default false)
# --blr_quiet or -q (default true)
# --help or -h