SuperLU Distributed 9.0.0
gpu3d
|
#include "superlu_ddefs.h"
Functions | |
void | dpropagate_A_to_LU3d (dLUstruct_t *LUstruct, int_t *xa, int_t *asub, double *a, superlu_dist_options_t *options, gridinfo3d_t *grid3d, int_t nsupers, float *mem_use) |
int_t | computeLDAspa_Ilsum (int_t nsupers, int_t *ilsum, dLUstruct_t *LUstruct, gridinfo3d_t *grid3d) |
void | propagateDataThroughMatrixBlocks (int_t nsupers, Glu_freeable_t *Glu_freeable, dLUstruct_t *LUstruct, gridinfo3d_t *grid3d, int_t *Urb_length, int_t *rb_marker, int_t *Urb_fstnz, int_t *Ucbs, int **ToSendR, int *ToSendD, int *ToRecv) |
int_t | checkDist3DLUStruct (dLUstruct_t *LUstruct, gridinfo3d_t *grid3d) |
int_t checkDist3DLUStruct | ( | dLUstruct_t * | LUstruct, |
gridinfo3d_t * | grid3d | ||
) |
int_t computeLDAspa_Ilsum | ( | int_t | nsupers, |
int_t * | ilsum, | ||
dLUstruct_t * | LUstruct, | ||
gridinfo3d_t * | grid3d | ||
) |
Function: computeLDAspa_Ilsum
This function computes the Local Distributed Storages (LDS) for the L factor in the LU decomposition. It also updates the input array 'ilsum' to hold the prefix sum of the size of each supernode. Compute ldaspa and ilsum[].
void dpropagate_A_to_LU3d | ( | dLUstruct_t * | LUstruct, |
int_t * | xa, | ||
int_t * | asub, | ||
double * | a, | ||
superlu_dist_options_t * | options, | ||
gridinfo3d_t * | grid3d, | ||
int_t | nsupers, | ||
float * | mem_use | ||
) |
Propagates the new values of A into the existing L and U data structures.
Llu | The local L and U data structures. |
xa | The index array of A. |
asub | The row subscripts of A. |
a | The numerical values of A. |
options | The options array. |
grid | The process grid. |
mem_use | The memory usage. |
void propagateDataThroughMatrixBlocks | ( | int_t | nsupers, |
Glu_freeable_t * | Glu_freeable, | ||
dLUstruct_t * | LUstruct, | ||
gridinfo3d_t * | grid3d, | ||
int_t * | Urb_length, | ||
int_t * | rb_marker, | ||
int_t * | Urb_fstnz, | ||
int_t * | Ucbs, | ||
int ** | ToSendR, | ||
int * | ToSendD, | ||
int * | ToRecv | ||
) |
The function propagate_blocks performs data propagation in the form of values through blocks of a supernodal sparse matrix. It sends values to particular blocks and marks blocks to receive new values depending on certain conditions. It also counts the number of non-zeros in each block row and keeps track of the number of column blocks in each block row.
Input: nsupers: Total number of supernodes, i.e., the number of block columns in the original matrix. grid: The process grid. This grid contains the mapping of the matrix blocks to the process grid. xusub: An array of indices, each pointing to the start of each column in the usub array. usub: An array containing the row indices of non-zero elements of the supernodal matrix. ToSendR: A 2D flag array indicating which blocks need to be sent to other processes, organized by column. ToSendD: A flag array indicating which blocks are to be sent to other processes. Urb_length: An array containing the number of non-zero elements in each block row of the matrix. rb_marker: A marker array to track the last block column visited in each block row. Urb_fstnz: An array containing the number of first non-zero elements in each block row of the matrix. Ucbs: An array counting the number of column blocks in each block row. ToRecv: A flag array indicating which blocks are to be received from other processes.
Output: ToSendR, ToSendD, Urb_length, rb_marker, Urb_fstnz, Ucbs, ToRecv are modified in place. Their new values depend on the conditions checked during the loops.