FAQ

"Library not loaded" error at runtime

On MAC you probably need to set

export DYLD_LIBRARY_PATH=PATH-TO-STRUMPACK-LIB/:$DYLD_LIBRARY_PATH
export LD_LIBRARY_PATH=PATH-TO-STRUMPACK-LIB/:$LD_LIBRARY_PATH

The code crashes with a segmentation fault whenever I use more than one MPI rank!?

This is most likely due to an incorrect library being linked. When using Intel MKL, make sure to use LP64 interface, and not the ILP64 interface. The LP64 mkl library uses 32-bit indexing, while the ILP64 libraries use 64-bit integers for indexing. In STRUMPACK we use 32 bit indexing for BLAS, LAPACK and ScaLAPACK. We recommend using the IntelĀ® Math Kernel Library Link Line Advisor:

https://software.intel.com/content/www/us/en/develop/articles/intel-mkl-link-line-advisor.html

Help, the compiler cannot find the "chrono" header:

catastrophic error: cannot open source file "chrono"
#include <chrono>

You need a C++11 capable compiler, and also a C++11 enabled standard library. For instance suppose you are using the Intel 15.0 C++ compiler with GCC 4.4 headers. The Intel 15.0 C++ compiler supports the C++11 standard, but the GCC 4.4 headers do not implement the C++11 standard library. You should install/load a newer GCC version (or just the headers). On cray machines, this can be done with module unload gcc; module load gcc/4.9.3 for instance.

When running "make test", many of the tests fail!

The parallel execution in ctest is invoked by the MPIEXEC command as discovered by CMake. On many HPC clusters, this does not run unless it is executed from within a batch script. In this case all parallel tests will fail.

Some MPI environments do not allow you to oversubscribe. For instance recent OpenMPI versions need the additional

--oversubscribe

flag. Several tests use up to 19 mpi processes. If you have less than 19 cores in your system, and the mpi environment does not allow oversubscription, those tests will fail.

Does the code keep track of the number of floating point operations performed?

To keep track of the number of floating point operations performed in the STRUMPACK Sparse Solver, you can run CMake with:

-DSTRUMPACK_COUNT_FLOPS=ON

Then, when running, do not set the quiet flag in the StrumpackSparseSolver constructor or on the command line and the solver will print some statistics. This will also enable a counter for data movement in the solve phase, from which the (approximately) attained bandwidth usage is derived. This is done because the solve phase is typically bandwidth limited, while the factorization is flop limited.