SuperLU is a general purpose library for the direct solution of large, sparse,
nonsymmetric systems of linear equations.
The library is written in C and is callable from either C or Fortran programs.
It uses MPI, OpenMP and CUDA to support various forms of parallelism.
It supports both real and complex datatypes, both single and double
precision, and 64-bit integer indexing.
The library routines perform an LU decomposition with partial pivoting and
triangular system solves through forward and back substitution. The LU
factorization routines can handle non-square matrices but the triangular solves
are performed only for square matrices. The matrix columns may be preordered
(before factorization) through either library or user-supplied routines.
This preordering for sparsity is completely separate from the factorization.
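As a rough illustration of the two phases described above (factorization with partial pivoting, then forward and back substitution), here is a minimal dense sketch in C. It is not SuperLU's API and ignores sparsity and column preordering entirely; the names `lu_factor` and `lu_solve` and the row-major dense storage are assumptions made only for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical dense helper, not part of SuperLU.
 * Factor the n-by-n row-major matrix a in place as P*A = L*U with partial
 * pivoting. The multipliers of L (unit diagonal, not stored) end up below
 * the diagonal and U on and above it. perm[k] is the row swapped with row k
 * at step k. Returns 0 on success, -1 if a pivot column is entirely zero. */
int lu_factor(size_t n, double *a, size_t *perm)
{
    assert(n > 0 && a != NULL && perm != NULL);
    for (size_t k = 0; k < n; ++k) {
        /* Partial pivoting: pick the largest magnitude entry in column k. */
        size_t p = k;
        for (size_t i = k + 1; i < n; ++i)
            if (fabs(a[i * n + k]) > fabs(a[p * n + k]))
                p = i;
        if (a[p * n + k] == 0.0)
            return -1;
        perm[k] = p;
        if (p != k)
            for (size_t j = 0; j < n; ++j) {
                double t = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = t;
            }
        /* Eliminate below the pivot, storing the multipliers in place. */
        for (size_t i = k + 1; i < n; ++i) {
            a[i * n + k] /= a[k * n + k];
            for (size_t j = k + 1; j < n; ++j)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
    }
    return 0;
}

/* Solve A*x = b using the output of lu_factor; b is overwritten with x. */
void lu_solve(size_t n, const double *lu, const size_t *perm, double *b)
{
    /* Apply the recorded row interchanges to the right-hand side. */
    for (size_t k = 0; k < n; ++k) {
        double t = b[k];
        b[k] = b[perm[k]];
        b[perm[k]] = t;
    }
    /* Forward substitution with the unit lower triangular factor L. */
    for (size_t i = 1; i < n; ++i)
        for (size_t j = 0; j < i; ++j)
            b[i] -= lu[i * n + j] * b[j];
    /* Back substitution with the upper triangular factor U. */
    for (size_t i = n; i-- > 0;) {
        for (size_t j = i + 1; j < n; ++j)
            b[i] -= lu[i * n + j] * b[j];
        b[i] /= lu[i * n + i];
    }
}
```

A caller would run `lu_factor` once and then call `lu_solve` for each right-hand side, which mirrors the separation between factorization and triangular solves.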
Working precision iterative refinement subroutines are provided for
improved backward stability. Routines are also provided to equilibrate the system,
estimate the condition number, calculate the relative backward error,
and estimate error bounds for the refined solutions.
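To make two of these auxiliary computations concrete, the sketch below shows a simple row equilibration and the componentwise relative backward error max_i |b - A x|_i / (|A| |x| + |b|)_i. Both are dense, hypothetical helpers (`equilibrate_rows`, `backward_error`) written for illustration; they are not the library's routines, and the choice of this particular backward error formula is an assumption based on common practice in dense solvers.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: scale each row of the n-by-n row-major matrix a, and
 * the matching entry of b, so the largest magnitude in the row becomes 1.
 * r[i] receives the scale factor applied to row i (1 for an all-zero row). */
void equilibrate_rows(size_t n, double *a, double *b, double *r)
{
    assert(n > 0 && a != NULL && b != NULL && r != NULL);
    for (size_t i = 0; i < n; ++i) {
        double rowmax = 0.0;
        for (size_t j = 0; j < n; ++j)
            if (fabs(a[i * n + j]) > rowmax)
                rowmax = fabs(a[i * n + j]);
        r[i] = (rowmax > 0.0) ? 1.0 / rowmax : 1.0;
        for (size_t j = 0; j < n; ++j)
            a[i * n + j] *= r[i];
        b[i] *= r[i];
    }
}

/* Hypothetical helper: componentwise relative backward error of x as an
 * approximate solution of A*x = b. Rows with a zero denominator are skipped. */
double backward_error(size_t n, const double *a, const double *x,
                      const double *b)
{
    assert(n > 0 && a != NULL && x != NULL && b != NULL);
    double berr = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double resid = b[i];        /* accumulates (b - A*x)_i       */
        double denom = fabs(b[i]);  /* accumulates (|A|*|x| + |b|)_i */
        for (size_t j = 0; j < n; ++j) {
            resid -= a[i * n + j] * x[j];
            denom += fabs(a[i * n + j]) * fabs(x[j]);
        }
        if (denom > 0.0 && fabs(resid) / denom > berr)
            berr = fabs(resid) / denom;
    }
    return berr;
}
```

An iterative refinement loop would typically recompute the residual, solve for a correction with the existing factors, update x, and stop once a measure like this one reaches working precision or stops decreasing.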
The serial SuperLU package also contains ILU routines, using
numerical threshold-based dropping, with partial pivoting (ILUTP).
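The idea of numerical threshold-based dropping can be sketched as follows: entries of a computed row or column that are small relative to its largest entry are discarded, which keeps the incomplete factors sparser than the exact ones. The function `drop_small` and the relative threshold `tau` are hypothetical; the library's actual dropping rules are more elaborate and are not reproduced here.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper illustrating a relative drop rule.
 * Zero every nonzero entry of v whose magnitude is below tau times the
 * largest magnitude in v. Returns the number of entries dropped. */
size_t drop_small(size_t n, double *v, double tau)
{
    assert(v != NULL && tau >= 0.0);
    double vmax = 0.0;
    for (size_t i = 0; i < n; ++i)
        if (fabs(v[i]) > vmax)
            vmax = fabs(v[i]);
    size_t dropped = 0;
    for (size_t i = 0; i < n; ++i)
        if (v[i] != 0.0 && fabs(v[i]) < tau * vmax) {
            v[i] = 0.0;
            ++dropped;
        }
    return dropped;
}
```

In an incomplete factorization, a rule of this kind would be applied to each newly computed piece of L and U before moving on to the next elimination step. A larger `tau` gives sparser but less accurate factors.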
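The SuperLU package comes in three different flavors: SuperLU (serial), SuperLU_MT (multithreaded, shared memory), and SuperLU_DIST (distributed memory).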
FAQ (Frequently Asked Questions)
The Users' Guide (Tech report LBNL-44289) describes all three libraries. (Last update: June 2018)
How to Cite SuperLU in a publication.
User Mailing List is used to announce changes, new releases, etc.
Please send email if you have used any versions of the library.
This is my survey article about sparse direct solvers of various flavours.
Usage of SuperLU (page is under construction)
This project has been funded by DOE, NSF and DARPA.
Developers:
     X. Sherry Li
     Wajih Boukaram
     Jim Demmel
     Nan Ding
     John Gilbert
     Laura Grigori
     Yang Liu
     Piyush Sao
     Meiyue Shao
     Ichitaro Yamazaki
Other Contributors:
     Pietro Cicotti, UCSD
     Daniel Schreiber
     Jinqchong Teo
     Yu Wang
     Eric Zhang, Albany High
SuperLU Version 7.0.0
SuperLU has achieved up to 40% of the theoretical floating-point rate
on a number of workstations, such as MIPS R8000 and IBM RS/6000.
The megaflop rate usually increases as the ratio of the floating-point
operation count to the number of nonzeros in the L and U factors
increases.
(See details in this paper published in ACM Trans. Math. Software,
Vol. 37, No. 4, Article No. 43, April 2011.)
SuperLU_MT Version 4.0.0
Provides Pthreads and OpenMP interfaces.
There are also parallel directives for several older SMPs.
SuperLU_MT demonstrated 5- to 10-fold speedups on a range of commercially
popular SMPs, and a factorization rate of up to 2.5 Gigaflops.
SuperLU_DIST Version 9.0.0
GESP stands for Gaussian Elimination with "Static Pivoting". Static pivoting
is a technique that combines the numerical stability
of partial pivoting with the scalability of no pivoting,
to run accurately and efficiently on large numbers of processors.
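One way to picture static pivoting is an elimination that never swaps rows during factorization, so the data layout is known in advance, and that instead repairs any pivot too small to divide by safely. The dense sketch below does this by replacing tiny pivots with sqrt(machine epsilon) times the largest matrix entry. That threshold, the function name `lu_static_pivot`, and the dense storage are assumptions for this example only. A full scheme would also permute large entries onto the diagonal beforehand and clean up the perturbation afterwards with iterative refinement, neither of which is shown.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical dense helper, not SuperLU_DIST's API.
 * Factor the nonzero n-by-n row-major matrix a in place as A ~= L*U with no
 * row interchanges. A pivot smaller in magnitude than
 * sqrt(DBL_EPSILON) * max|a_ij| is replaced by that threshold, keeping its
 * sign. Returns the number of pivots that were replaced. */
size_t lu_static_pivot(size_t n, double *a)
{
    assert(n > 0 && a != NULL);
    double amax = 0.0;  /* largest magnitude entry, used as a cheap norm */
    for (size_t i = 0; i < n * n; ++i)
        if (fabs(a[i]) > amax)
            amax = fabs(a[i]);
    assert(amax > 0.0);
    const double tiny = sqrt(DBL_EPSILON) * amax;

    size_t replaced = 0;
    for (size_t k = 0; k < n; ++k) {
        double *pivot = &a[k * n + k];
        if (fabs(*pivot) < tiny) {
            /* Perturb instead of swapping rows. */
            *pivot = (*pivot < 0.0) ? -tiny : tiny;
            ++replaced;
        }
        for (size_t i = k + 1; i < n; ++i) {
            a[i * n + k] /= *pivot;
            for (size_t j = k + 1; j < n; ++j)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
    }
    return replaced;
}
```

A nonzero return value signals that the computed factors belong to a slightly perturbed matrix, so the solution obtained from them should be refined before it is trusted.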
SuperLU_DIST demonstrated up to 100-fold speedup on the 512-PE Cray T3E
at NERSC, and a factorization rate of 10.2 Gigaflops, using MPI.
The scientific result was reported earlier in the cover article
of Science, Dec 24, 1999.
(Requires C++ compiler)
(Cite paper:
A Distributed-Memory Algorithm for Computing a Heavy-Weight Perfect Matching on
Bipartite Graphs.)