Gets matrix permutation. More...

#include <limits.h>
#include <math.h>
#include "superlu_dist_config.h"
#include "superlu_ddefs.h"

Include dependency graph for get_perm_c_parmetis.c:

Functions
static float	a_plus_at_CompRow_loc (int, int_t , int, int_t , int_t, int_t , int_t , int, int_t , int_t , int_t , int_t , gridinfo_t *)

float	get_perm_c_parmetis (SuperMatrix A, int_t perm_r, int_t perm_c, int nprocs_i, int noDomains, int_t sizes, int_t fstVtxSep, gridinfo_t grid, MPI_Comm *metis_comm)

Detailed Description

Gets matrix permutation.

The source code is distributed under BSD license, see the file License.txt at the top-level directory.

-- Distributed symbolic factorization auxialiary routine  (version 2.1) --
Lawrence Berkeley National Lab, Univ. of California Berkeley - July 2003
INRIA France - January 2004
Laura Grigori

November 1, 2007

Function Documentation

◆ a_plus_at_CompRow_loc()

static float a_plus_at_CompRow_loc	(	int	iam,
		int_t *	perm_r,
		int	nprocs_i,
		int_t *	vtxdist_i,
		int_t	n,
		int_t *	rowptr,
		int_t *	colind,
		int	nprocs_o,
		int_t *	vtxdist_o,
		int_t *	p_bnz,
		int_t **	p_b_rowptr,
		int_t **	p_b_colind,
		gridinfo_t *	grid
	)

static

Purpose
=======

Form the structure of Pr*A +A'Pr'. A is an n-by-n matrix in
NRformat_loc format, represented by (rowptr, colind). The output
B=Pr*A +A'Pr' is in NRformat_loc format (symmetrically, also row
oriented), represented by (b_rowptr, b_colind).

The input matrix A is distributed in block row format on nprocs_i
processors.  The output matrix B is distributed in block row format
on nprocs_o processors, where nprocs_o <= nprocs_i.  On output, the
matrix B has its rows permuted according to perm_r.

Sketch of the algorithm
=======================

Let iam by my process number.  Let fst_row, lst_row = m_loc +
fst_row be the first/last row stored on iam.

Compute Pr' - the inverse row permutation, stored in iperm_r.

Compute the transpose  of the block row of Pr*A that iam owns:
   T[:,Pr(fst_row:lst_row)] = Pr' * A[:,fst_row:lst_row] * Pr'


All to all communication such that every processor iam receives all
the blocks of the transpose matrix that it needs, that is
          T[fst_row:lst_row, :]

Compute B = A[fst_row:lst_row, :] + T[fst_row:lst_row, :]

If Pr != I or nprocs_i != nprocs_o then permute the rows of B (that
is compute Pr*B) and redistribute from nprocs_i to nprocs_o
according to the block row distribution in vtxdist_i, vtxdist_o.

Here is the call graph for this function:

Here is the caller graph for this function:

◆ get_perm_c_parmetis()

float get_perm_c_parmetis	(	SuperMatrix *	A,
		int_t *	perm_r,
		int_t *	perm_c,
		int	nprocs_i,
		int	noDomains,
		int_t **	sizes,
		int_t **	fstVtxSep,
		gridinfo_t *	grid,
		MPI_Comm *	metis_comm
	)

Purpose
=======

GET_PERM_C_PARMETIS obtains a permutation matrix Pc, by applying a
graph partitioning algorithm to the symmetrized graph A+A'.  The
multilevel graph partitioning algorithm used is the
ParMETIS_V3_NodeND routine available in the parallel graph
partitioning package parMETIS.  

The number of independent sub-domains noDomains computed by this
algorithm has to be a power of 2.  Hence noDomains is the larger
number power of 2 that is smaller than nprocs_i, where nprocs_i = nprow
* npcol is the number of processors used in SuperLU_DIST.

Arguments
=========

A       (input) SuperMatrix*
        Matrix A in A*X=B, of dimension (A->nrow, A->ncol). The number
        of the linear equations is A->nrow.  Matrix A is distributed
        in NRformat_loc format.

perm_r  (input) int_t*
        Row permutation vector of size A->nrow, which defines the 
        permutation matrix Pr; perm_r[i] = j means row i of A is in 
        position j in Pr*A.

perm_c  (output) int_t*
    Column permutation vector of size A->ncol, which defines the 
        permutation matrix Pc; perm_c[i] = j means column i of A is 
        in position j in A*Pc.

nprocs_i (input) int*
        Number of processors the input matrix is distributed on in a block
        row format.  It corresponds to number of processors used in
        SuperLU_DIST.

noDomains (input) int*, must be power of 2
        Number of independent domains to be computed by the graph
        partitioning algorithm.  ( noDomains <= nprocs_i )

sizes   (output) int_t**, of size 2 * noDomains
        Returns pointer to an array containing the number of nodes
        for each sub-domain and each separator.  Separators are stored 
        from left to right.
        Memory for the array is allocated in this routine.

fstVtxSep (output) int_t**, of size 2 * noDomains
        Returns pointer to an array containing first node for each
        sub-domain and each separator.
        Memory for the array is allocated in this routine.

Return value
============
  < 0, number of bytes allocated on return from the symbolic factorization.
  > 0, number of bytes allocated when out of memory.