C/C++ Interface and Example

Interface to construct, factor and solve a hierarchical matrix

Similar to the Fortran interface, the C/C++ interface is named with the prefix "x_". x='z' for double-complex, 'd' for double-real, 'c' for single-complex, and 's' for single-complex. Take double-real precision for example, the caller needs to first define a class/object that can perform either matrix entry evaluation or matrix vector multiplication:

//provide a user-defined class consisting all data and metadata needed for this matrix entry evaluation and/or matvec
class C_QuantApp {
//define your data or metadata here
};
// The entry evaluation function wrapper required by the Fortran code, val returns Z(m,n), F2Cptr is an alias of void*. 1<=m<=N, 1<=n<=N.
inline void C_FuncZmn(int *m, int *n, double *val, F2Cptr quant) {
C_QuantApp* Q = (C_QuantApp*) quant;
//call your entry evaluation function defined in C_QuantApp using Q
}
// The entry evaluation function wrapper required by the Fortran code, alldat_loc returns all local entries of Z, F2Cptr is an alias of void*.
inline void C_FuncZmnBlock(int* Ninter, int* Nallrows, int* Nallcols, int* Nalldat_loc, int* allrows, int* allcols, double* alldat_loc, int* rowidx,int* colidx, int* pgidx, int* Npmap, int* pmaps, C2Fptr quant) {
C_QuantApp* Q = (C_QuantApp*) quant;
// allrows: 1D array containing the global row indices (in original order starting from 1 to N) stacked together
// allcols: 1D array containing the global column indices (in original order starting from 1 to N) stacked together
// alldat_loc: 1D array containing the local entry values defined by pmaps (in column major) stacked together
// colidx: 1D array containing sizes of columns of each block
// rowidx: 1D array containing sizes of rows of each block
// pgidx: 1D array containing the process group number of each block, the number starts from 0
// Npmap: number of process groups
// pmaps: 2D array (Npmapx3) containing number of process rows, number of process columns, and starting process ID in each block
//call your entry evaluation function defined in C_QuantApp using Q
}
// The matvec function wrapper required by the Fortran libraries, see "subroutine MatVec" of the Fortran interface for meanings of the arguments
inline void C_FuncHMatVec(char const *trans, int *nin, int *nout, int *nvec, double const *xin, double *xout, C2Fptr quant) {
C_QuantApp* Q = (C_QuantApp*) quant;
//call your matvec function defined in C_QuantApp using Q
}

Initialize ButterflyPACK metadata:

F2Cptr bmat; //hierarchical matrix returned by Fortran code
F2Cptr option; //option structure returned by Fortran code
F2Cptr stats; //statistics structure returned by Fortran code
F2Cptr msh; //mesh structure returned by Fortran code
F2Cptr kerregister; //kernel register quantities structure returned by Fortran code
F2Cptr ptree; //process tree returned by Fortran code
MPI_Fint Fcomm; // the Fortran MPI communicator
Fcomm = MPI_Comm_c2f(Comm); //Comm is the C MPI communicator provided by the user
// initialize ButterflyPACK metadata
d_c_bpack_createptree(&nmpi, groups, &Fcomm, &ptree); //groups is a int array of size nmpi holding MPI ranks out of communicator Fcomm used for ButterflyPACK
// set ButterflyPACK options other than the default ones, the double and integer options are set with different functions:
d_c_bpack_set_D_option(&option, "name", val); //double-valued option
d_c_bpack_set_I_option(&option, "name", val); //int-valued option

Initialization of the construction phase

d_c_bpack_construct_init(&N, &Ndim, coordinates, nns_ptr, &nlevel, clustertree, P, &N_loc, &bmat, &option, &stats, &msh, &kerregister, &ptree, &C_FuncDistmn, &C_FuncNearFar, quant);
//N is matrix dimension
//coordinates is a double array of size N*Ndim representing Cartesian coordinates x1(1),...,x1(Ndim),x2(1),...,x2(Ndim)....
//if Ndim=0, coordinates is not referenced
//nns_ptr is a int array of size N*option%knn representing the knn nearest neighouring points of each of the N points. If option%knn=0, nns_ptr is not referenced.
//clustertree is an integer array of size 2^nlevel containing leafsizes in a user-provided cluster tree
//if nlevel=0, input requires tree(1)=N
//P is an integer array of size N, representing permutation vector returned by the ButterflyPACK clustering
//N_loc is the local matrix dimension, P and N_loc can be used to define your matvec if needed
//C_FuncDistmn: pointer to user-provided function to compute distance between any row and column of the matrix. Not referenced if option%nogeo is not 2.
//C_FuncNearFar: pointer to user-provided function to determine whether a block (in permuted order) is compressible or not. Not referenced if option%nogeo is not 2.

Construction of the hierarchical matrix with entry evaluation:

d_c_bpack_construct_element_compute(&bmat, &option, &stats, &msh, &kerquant, &ptree, &C_FuncZmn, &C_FuncZmnBlock, quant);
//C_FuncZmn: pointer to user-provided function to sample mn^th entry of the matrix. Used when option%elem_extract=0.
//C_FuncZmnBlock: pointer to user-provided function to sample a list of intersections of entries of the matrix. Used when option%elem_extract>=1.

Construction of the hierarchical matrix with matrix-vector multiplication:

d_c_bpack_construct_matvec_compute(&bmat, &option, &stats, &msh, &kerregister, &ptree, &C_FuncHMatVec, quant);
//C_FuncHMatVec: pointer to user-provided function to multiply A and A* with vectors (see above)

Factorization of the hierarchical matrix:

d_c_bpack_factor(&bmat,&option,&stats,&ptree,&msh);

Solve of the hierarchical matrix:

d_c_bpack_solve(x,b,&N_loc,&nrhs,&bmat,&option,&stats,&ptree);
// bmat is the meta-data storing the compressed and factored matrix
// N_loc is the local number of rows/columns
// nrhs is the number of right-hand sides
// b of N_loc*nrhs is the local right-hand sides (concatenation of nrhs vectors of length N_loc)
// x of N_loc*nrhs is the local solution vector (concatenation of nrhs vectors of length N_loc)

Interface to construct, and multiply a low-rank/butterfly block

ButterflyPACK also provides an interface for constructing a single MxN low-rank/butterfly block. Take double-real precision for example, the caller needs to first define a class/object that can perform either matrix entry evaluation or matrix vector multiplication:

//provide a user-defined class consisting all data and metadata needed for this matrix entry evaluation and/or matvec
class C_QuantApp {
//define your data or metadata here
};
// The entry evaluation function wrapper required by the Fortran code, val returns Z(m,n), F2Cptr is an alias of void*. If n<0 (column index) and m>0 (row index), then 1<=m<=M, -N<=n<=-1; if m<0 (column index) and n>0 (row index), then 1<=n<=M, -N<=m<=-1.
inline void C_FuncBZmn(int *m, int *n, double *val, F2Cptr quant) {
C_QuantApp* Q = (C_QuantApp*) quant;
//call your entry evaluation function defined in C_QuantApp using Q
}
// The entry evaluation function wrapper required by the Fortran code, alldat_loc returns all local entries of Z, F2Cptr is an alias of void*.
inline void C_FuncZmnBlock(int* Ninter, int* Nallrows, int* Nallcols, int* Nalldat_loc, int* allrows, int* allcols, double* alldat_loc, int* rowidx,int* colidx, int* pgidx, int* Npmap, int* pmaps, C2Fptr quant) {
C_QuantApp* Q = (C_QuantApp*) quant;
// allrows: 1D array containing the global row indices (in original order starting from 1 to N) stacked together
// allcols: 1D array containing the global column indices (in original order starting from 1 to N) stacked together
// alldat_loc: 1D array containing the local entry values defined by pmaps (in column major) stacked together
// colidx: 1D array containing sizes of columns of each block
// rowidx: 1D array containing sizes of rows of each block
// pgidx: 1D array containing the process group number of each block, the number starts from 0
// Npmap: number of process groups
// pmaps: 2D array (Npmapx3) containing number of process rows, number of process columns, and starting process ID in each block
//call your entry evaluation function defined in C_QuantApp using Q
}
// The matvec function wrapper required by the Fortran libraries.
inline void C_FuncBMatVec(char const *trans, int *nin, int *nout, int *nvec, double const *xin, double *xout, C2Fptr quant, double *a, double *b){
C_QuantApp* Q = (C_QuantApp*) quant;
//call your matvec function defined in C_QuantApp using Q
}

Initialize ButterflyPACK metadata:

F2Cptr bf_b; //hierarchical matrix returned by Fortran code
F2Cptr option_b; //option structure returned by Fortran code
F2Cptr stats_b; //statistics structure returned by Fortran code
F2Cptr msh_b; //mesh structure returned by Fortran code
F2Cptr kerregister_b; //kernel register quantities structure returned by Fortran code
F2Cptr ptree_b; //process tree returned by Fortran code
MPI_Fint Fcomm; // the Fortran MPI communicator
Fcomm = MPI_Comm_c2f(Comm); //Comm is the C MPI communicator provided by the user
// initialize ButterflyPACK metadata
d_c_bpack_createptree(&nmpi, groups, &Fcomm, &ptree_b); //groups is a int array of size nmpi holding MPI ranks out of communicator Fcomm used for ButterflyPACK
// set ButterflyPACK options other than the default ones, the double and integer options are set with different functions:
d_c_bpack_set_D_option(&option_b, "name", val); //double-valued option
d_c_bpack_set_I_option(&option_b, "name", val); //int-valued option

Initialization of the construction phase. In this interface, ButterflyPACK treats the single block as the 1-2 off-diagonal block of a larger hierarchical matrix.

// one needs to first a mesh metadata for the row and column dimension, by calling d_c_bpack_construct_init. The *_dummy argument is not referenced.
d_c_bpack_construct_init(&M, &Ndim, dat_ptr_m, nns_ptr_m,&nlevel_m, tree_m, perms_m, &myseg_m, &bmat_dummy, &option_b, &stats_dummy, &msh0_m, &kerquant_dummy, &ptree_b, &C_FuncDistmn_dummy, &C_FuncNearFar_dummy, quant_ptr_dummy);
d_c_bpack_construct_init(&N, &Ndim, dat_ptr_n, nns_ptr_n,&nlevel_n, tree_n, perms_n, &myseg_n, &bmat_dummy, &option_b, &stats_dummy, &msh0_n, &kerquant_dummy, &ptree_b, &C_FuncDistmn_dummy, &C_FuncNearFar_dummy, quant_ptr_dummy);
// the construction initialization can be done by:
d_c_bf_construct_init(&M, &N, &myseg_m, &myseg_n, nns_ptr_m, nns_ptr_n, &msh0_m, &msh0_n, &bf_b, &option_b, &stats_b, &msh_b, &kerregister_b, &ptree_b,&C_FuncDistmn, &C_FuncNearFar, quant);

Construction of the block with entry evaluation:

d_c_bf_construct_element_compute(&bf_b, &option_b, &stats_b, &msh_b, &kerregister_b, &ptree_b, &C_FuncBZmn, &C_FuncBZmnBlock, quant);
//C_FuncBZmn: pointer to user-provided function to sample mn^th entry of the matrix. Used when option%elem_extract=0.
//C_FuncBZmnBlock: pointer to user-provided function to sample a list of intersections of entries of the matrix. Used when option%elem_extract>=1.

Construction of the block with matrix-vector multiplication:

d_c_bf_construct_matvec_compute(&bf_b, &option_b, &stats_b, &msh_b, &kerregister_b, &ptree_b, &C_FuncBMatVec, quant);
//C_FuncBMatVec: pointer to user-provided function to multiply A and A* with vectors (see above)

Application of the constructed block:

c_bf_mult(trans, xin, xout, Ninloc, Noutloc, Nvec, bf_b, option_b, stats_b, ptree_b);
// trans: 'N', 'C', or 'T'
// xin: input vector of Ninloc*Nvec (concatenation of Nvec vectors of length Ninloc)
// xout: output vector of Noutloc*Nvec (concatenation of Nvec vectors of length Noutloc)
// Ninloc: length of local input vectors
// Noutloc: length of local output vectors
// Nvec: number of vectors

In addition, ButterflyPACK can be invoked from STRUMPACK with more compact C++ interfaces than the above. See http://portal.nersc.gov/project/sparse/strumpack/ for details.

Example

A number of examples are available in the build/EXAMPLE folder. You can modify options in the driver or from command line.

InterfaceTest.cpp: An example using a collection of simple kernels. The kernels are selected by the parameter "ker". ker=1: RBF, 2: R^4, 3: sqrt(R^2+h), 4: 1/sqrt(R^2+h), 5: (X^tY+h)^2, 6: product of two random matrices. The coordinates are generated as follows: tst=1 read from a UCI machine learning dataset, tst=2 generates a random collection of coordinates, tst=3 or 4: no coordinates are generated. This example first constructs (with entry evaluation), factor a hierarchical matrix and then uses it as matrix-vector multiplication to construct a second hierarchical matrix.

mpirun -n nmpi ./EXAMPLE/ctest

Taylor2D.cpp: An example that solves a 2D EFIE assuming inhomogeneous media. A approximate Green's function is used to evaluate any entry of the matrix.

mpirun -n nmpi ./EXAMPLE/go2d

FIO_Driver.cpp: An example that constructs two 1D Fourier integral operators (FIOs) as two butterfly-compressed blocks using entry-evaluation-based APIs, then construct their product as another butterfly-compressed block using matvec-based APIs.

mpirun -n nmpi ./EXAMPLE/cfio

InverseFIO_Driver.cpp: An example that constructs a 1D FIO as a butterlfy block A, computes a rhs by b=A*x_ref, then generates an approximate solution using (A^*A)^-1A^*b. Then matrix A^*A is compressed as an HODLR matrix using matvec-based APIs.

mpirun -n nmpi ./EXAMPLE/cifio