Apart from hierarchically semi-separable (HSS) rank-structured matrices, the sparse multifrontal solver can also use Block Low-Rank (BLR) matrices to compress the fill-in. In the multifrontal method, computations are performed on dense matrices called frontal matrices. A frontal matrix can be approximated as a BLR matrix, but this is only beneficial (compared to storing the front as a standard dense matrix and operating on it with BLAS/LAPACK routines) if the frontal matrix is large enough.
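The size argument can be made concrete with a small stand-alone sketch. It does not use STRUMPACK; the function names are hypothetical and only count stored entries for a single block, assuming the block is kept either densely or as a rank-r product U * V^T.

```cpp
#include <cassert>
#include <cstddef>

// Entries needed to store an m x n block densely.
std::size_t dense_entries(std::size_t m, std::size_t n) {
  return m * n;
}

// Entries needed to store the same block as a rank-r product U * V^T,
// with U of size m x r and V of size n x r (hypothetical helper).
std::size_t low_rank_entries(std::size_t m, std::size_t n, std::size_t r) {
  assert(r <= m && r <= n);  // the rank cannot exceed either dimension
  return r * (m + n);
}

// True when the low-rank form is strictly smaller than the dense form.
bool low_rank_pays_off(std::size_t m, std::size_t n, std::size_t r) {
  return low_rank_entries(m, n, r) < dense_entries(m, n);
}
```

For a 256 x 256 block of rank 10, the low-rank form needs 5120 entries instead of 65536. For a 16 x 16 block of the same rank it needs 320 entries instead of 256, so the dense form is cheaper. This count ignores the cost of computing the compression, which pushes the break-even size further up.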
Rank-structured compression is not used by default in the STRUMPACK sparse solver (the default is to perform exact LU factorization), but BLR compression can be turned on/off via the command line:
or via the C++ API as follows
When BLR compression is enabled, the default STRUMPACK behavior is to use the BLR-enabled approximate LU factorization as a preconditioner within GMRES. This behavior can be changed; see Solve.
The above options affect the use of BLR within the multifrontal solver. Additional BLR-specific options are stored in an object of type BLR::BLROptions<scalar>. This object is held by the SPOptions<scalar> object, which is in turn stored in the StrumpackSparseSolver. It can be accessed via the BLR_options() routine as follows:
The compression tolerances can greatly impact performance. They can be set using:
or via the C++ API
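The effect of a relative and an absolute tolerance can be illustrated with a small stand-alone sketch that does not use STRUMPACK itself. It assumes a common truncation convention: a singular value is kept while it exceeds the larger of the absolute tolerance and the relative tolerance times the largest singular value. The criterion STRUMPACK actually applies may differ, and `truncated_rank` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Number of singular values kept for one block, given the singular values
// sorted in non-increasing order.
// ASSUMPTION: a value is kept while it exceeds
//   max(abs_tol, rel_tol * sigma[0]);
// this is a common convention, not necessarily the one used by STRUMPACK.
std::size_t truncated_rank(const std::vector<double>& sigma,
                           double rel_tol, double abs_tol) {
  assert(std::is_sorted(sigma.begin(), sigma.end(), std::greater<double>()));
  if (sigma.empty()) return 0;
  const double cutoff = std::max(abs_tol, rel_tol * sigma.front());
  std::size_t rank = 0;
  while (rank < sigma.size() && sigma[rank] > cutoff) ++rank;
  return rank;
}
```

With singular values {1.0, 0.5, 1e-3, 1e-6, 1e-10} and an absolute tolerance of 1e-8, a relative tolerance of 1e-2 keeps 2 values while 1e-4 keeps 3. Looser tolerances give smaller ranks, hence less memory and faster factorization, at the price of a less accurate preconditioner and possibly more GMRES iterations.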
The peak memory usage of the solver can be reduced by enabling column-wise construction (COLWISE instead of DENSE):
or in C++
COLWISE constructs the BLR matrix one block column at a time and also compresses the 22 part of a front, that is, the bottom-right block F22. Hence, COLWISE uses less memory, but it is slower and can be less accurate than DENSE.
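Why building one block column at a time lowers the peak can be seen in a rough storage model. This is a toy estimate, not a description of STRUMPACK's implementation. It assumes a square n x n matrix split into b x b blocks, dense diagonal blocks, and a uniform rank r for every off-diagonal block. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Modeled peak when the whole n x n matrix is first held densely and only
// then compressed. Any extra workspace is ignored.
std::size_t peak_dense_first(std::size_t n) {
  return n * n;
}

// Modeled peak when block columns are built and compressed one at a time:
// all earlier block columns are already compressed while the current one
// is still dense. ASSUMPTIONS: b divides n, diagonal blocks stay dense,
// every off-diagonal block has rank r and is stored as U * V^T.
std::size_t peak_column_wise(std::size_t n, std::size_t b, std::size_t r) {
  assert(b > 0 && n % b == 0);
  const std::size_t nb = n / b;  // number of block columns
  const std::size_t compressed_col = b * b + (nb - 1) * 2 * b * r;
  const std::size_t dense_col = n * b;
  // The peak occurs while the last block column is still dense.
  return (nb - 1) * compressed_col + dense_col;
}
```

For n = 4096, b = 256 and r = 8 the model gives 2953216 entries for the column-wise construction against 16777216 for the dense-first one, roughly a factor of 5.7. The real ratio depends on the actual ranks and on the workspace each algorithm needs.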
BLR compression has some overhead and only pays off for sufficiently large matrices. STRUMPACK therefore has tuning parameters that specify the minimum size a dense matrix must have to be considered a candidate for BLR compression. The following routines can be used to tune how many fronts are compressed, via the command line:
or via the C++ API as follows
The routine set_compression_min_sep_size(int s) refers to the size of the top-left sub-block of a front only. This top-left block is the part that corresponds to a separator, as given by the nested dissection reordering algorithm. It is also referred to as the block containing the fully-summed variables. Factorization is only applied to this top-left block. Tuning the value for the minimum separator size can have a big impact on performance and memory usage!
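The role of the separator size as a threshold can be sketched as follows. The `Front` struct and both helpers are hypothetical and independent of STRUMPACK. The use of a greater-or-equal comparison is an assumption; the library may use a strict comparison or additional criteria.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of a front: the top-left (separator) block has
// dimension dim_sep, the remaining update part has dimension dim_upd.
struct Front {
  std::size_t dim_sep;
  std::size_t dim_upd;
};

// Total dimension of the front.
std::size_t front_size(const Front& f) {
  return f.dim_sep + f.dim_upd;
}

// ASSUMPTION: a front is a compression candidate when its separator block
// is at least min_sep_size; only dim_sep is tested, not the full front.
bool is_compression_candidate(const Front& f, std::size_t min_sep_size) {
  return f.dim_sep >= min_sep_size;
}

// Count how many fronts would be compressed for a given threshold.
std::size_t count_candidates(const std::vector<Front>& fronts,
                             std::size_t min_sep_size) {
  std::size_t count = 0;
  for (const Front& f : fronts)
    if (is_compression_candidate(f, min_sep_size)) ++count;
  return count;
}
```

For fronts with separator sizes 64, 300 and 1200, a threshold of 256 selects two fronts and a threshold of 1024 selects one. A front with a small separator but a large update part is not selected by this test, even though its total dimension is large.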
Here is a list of all command line options: