HPM Counter Statistics

Event | Ntasks | Avg | Min(rank) | Max(rank) |
---|---|---|---|---|
PAPI_FP_OPS | * | 2792744405.09 | 2742585927 (1415) | 2862741526 (1126) |
PAPI_L1_DCA | * | 56783602556.01 | 54114024590 (1296) | 59171107496 (137) |
PAPI_L1_DCM | * | 224034050.27 | 207829366 (672) | 269387946 (536) |
PAPI_TOT_INS | * | 107910086478.14 | 102439828262 (1296) | 112613979911 (1206) |
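
The averaged counters above can be combined into a few derived ratios. The following is a minimal sketch, not part of the profiler output: the values are copied from the Avg column, and the metric names (`l1_miss_ratio`, `fp_ops_per_instruction`) and the helper `derived_metrics` are hypothetical labels chosen for illustration.

```python
# Average per-task counter values copied from the Avg column of the
# HPM Counter Statistics table.
avg = {
    "PAPI_FP_OPS": 2792744405.09,
    "PAPI_L1_DCA": 56783602556.01,
    "PAPI_L1_DCM": 224034050.27,
    "PAPI_TOT_INS": 107910086478.14,
}


def derived_metrics(counters):
    """Return simple ratios of the raw counters (names are illustrative)."""
    return {
        # Fraction of L1 data cache accesses that missed.
        "l1_miss_ratio": counters["PAPI_L1_DCM"] / counters["PAPI_L1_DCA"],
        # Floating-point operations per completed instruction.
        "fp_ops_per_instruction": counters["PAPI_FP_OPS"] / counters["PAPI_TOT_INS"],
    }


metrics = derived_metrics(avg)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

With these averages, roughly 0.4% of L1 data cache accesses miss, and floating-point operations amount to about 2.6% of the instruction count.
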
Communication Event Statistics (100.00% detail, -1.0312e-03 error)

Call | Buffer Size | Ncalls | Total Time | Min Time | Max Time | %MPI | %Wall |
---|---|---|---|---|---|---|---|
MPI_Send | 8192 | 28973544 | 3859.437 | 3.815e-06 | 1.326e-01 | 19.79 | 5.77 |
MPI_Waitsome | 0 | 3546014 | 2191.003 | 0.000e+00 | 1.367e-01 | 11.23 | 3.27 |
MPI_Allreduce | 8 | 897939 | 1892.287 | 6.104e-05 | 8.507e-02 | 9.70 | 2.83 |
MPI_Waitall | 3072 | 2535933 | 1395.306 | 0.000e+00 | 3.828e-02 | 7.15 | 2.09 |
MPI_Allgather | 8 | 105408 | 1307.322 | 1.900e-04 | 1.643e-01 | 6.70 | 1.95 |
MPI_Alltoall | 4 | 186624 | 1132.991 | 7.212e-04 | 6.330e-02 | 5.81 | 1.69 |
MPI_Waitall | 64 | 257323 | 871.251 | 0.000e+00 | 6.937e-02 | 4.47 | 1.30 |
MPI_Alltoallv | 0 | 186624 | 843.519 | 2.992e-03 | 1.781e-02 | 4.32 | 1.26 |
MPI_Send | 32768 | 462952 | 746.445 | 5.317e-05 | 5.256e-02 | 3.83 | 1.12 |
MPI_Waitall | 49152 | 2886000 | 606.522 | 0.000e+00 | 1.325e-02 | 3.11 | 0.91 |
MPI_Waitall | 12288 | 1490000 | 525.927 | 0.000e+00 | 1.614e-02 | 2.70 | 0.79 |
MPI_Send | 16384 | 443200 | 390.104 | 8.106e-06 | 1.197e-01 | 2.00 | 0.58 |
MPI_Waitall | 40960 | 1731600 | 365.544 | 0.000e+00 | 1.264e-02 | 1.87 | 0.55 |
MPI_Waitall | 10240 | 894000 | 317.173 | 0.000e+00 | 1.641e-02 | 1.63 | 0.47 |
MPI_Waitall | 768 | 1342341 | 269.910 | 0.000e+00 | 1.126e-02 | 1.38 | 0.40 |
MPI_Waitall | 2560 | 823014 | 234.910 | 0.000e+00 | 2.489e-02 | 1.20 | 0.35 |
MPI_Waitall | 32768 | 427160 | 197.542 | 0.000e+00 | 1.158e-02 | 1.01 | 0.30 |
MPI_Waitall | 640 | 832314 | 186.003 | 0.000e+00 | 1.745e-02 | 0.95 | 0.28 |
MPI_Send | 327680 | 80288 | 150.452 | 5.889e-04 | 5.592e-02 | 0.77 | 0.22 |
MPI_Allgather | 4 | 32832 | 146.807 | 4.520e-04 | 1.574e-02 | 0.75 | 0.22 |
MPI_Allreduce | 4 | 24192 | 146.597 | 2.551e-04 | 3.702e-02 | 0.75 | 0.22 |
MPI_Recv | 4 | 1696 | 129.968 | 1.554e-01 | 1.623e-01 | 0.67 | 0.19 |
MPI_Send | 98304 | 214536 | 105.276 | 4.411e-05 | 5.389e-02 | 0.54 | 0.16 |
MPI_Allreduce | 24 | 32832 | 105.272 | 1.471e-03 | 1.495e-02 | 0.54 | 0.16 |
MPI_Send | 12288 | 373120 | 99.822 | 1.383e-05 | 3.449e-02 | 0.51 | 0.15 |
MPI_Send | 2048 | 14891088 | 93.115 | 0.000e+00 | 4.969e-03 | 0.48 | 0.14 |
MPI_Send | 40960 | 123672 | 85.308 | 7.200e-05 | 2.370e-02 | 0.44 | 0.13 |
MPI_Send | 8 | 11135904 | 81.815 | 0.000e+00 | 1.287e-02 | 0.42 | 0.12 |
MPI_Send | 256 | 10814080 | 81.300 | 0.000e+00 | 1.009e-02 | 0.42 | 0.12 |
MPI_Waitall | 3584 | 149149 | 73.855 | 1.693e-05 | 2.904e-02 | 0.38 | 0.11 |
MPI_Waitall | 8192 | 205152 | 67.052 | 0.000e+00 | 1.680e-02 | 0.34 | 0.10 |
MPI_Send | 512 | 15853220 | 63.989 | 0.000e+00 | 9.093e-03 | 0.33 | 0.10 |
MPI_Send | 128 | 16924280 | 59.621 | 0.000e+00 | 1.217e-02 | 0.31 | 0.09 |
MPI_Waitall | 1024 | 149149 | 59.595 | 2.146e-06 | 1.883e-02 | 0.31 | 0.09 |
MPI_Waitall | 2048 | 194892 | 56.425 | 0.000e+00 | 4.558e-02 | 0.29 | 0.08 |
MPI_Send | 81920 | 114952 | 55.126 | 4.792e-05 | 1.764e-02 | 0.28 | 0.08 |
MPI_Waitall | 448 | 151533 | 48.462 | 4.053e-06 | 3.931e-03 | 0.25 | 0.07 |
MPI_Send | 64 | 3319062 | 48.439 | 0.000e+00 | 6.273e-02 | 0.25 | 0.07 |
MPI_Waitall | 512 | 182196 | 38.937 | 0.000e+00 | 1.854e-02 | 0.20 | 0.06 |
MPI_Waitall | 896 | 90294 | 34.453 | 0.000e+00 | 1.802e-02 | 0.18 | 0.05 |
MPI_Irecv | 8192 | 28973544 | 33.641 | 0.000e+00 | 5.764e-03 | 0.17 | 0.05 |
MPI_Send | 20480 | 64344 | 32.909 | 6.795e-05 | 5.279e-02 | 0.17 | 0.05 |
MPI_Send | 32 | 5420332 | 27.147 | 0.000e+00 | 2.045e-02 | 0.14 | 0.04 |
MPI_Send | 16 | 4023648 | 24.480 | 0.000e+00 | 8.624e-03 | 0.13 | 0.04 |
MPI_Waitall | 320 | 90294 | 24.142 | 3.099e-06 | 3.639e-03 | 0.12 | 0.04 |
MPI_Send | 2560 | 1534168 | 13.607 | 9.537e-07 | 5.040e-03 | 0.07 | 0.02 |
MPI_Bcast | 8 | 10368 | 12.908 | 6.709e-04 | 2.967e-03 | 0.07 | 0.02 |
MPI_Barrier | 0 | 6912 | 10.448 | 1.027e-03 | 3.766e-03 | 0.05 | 0.02 |
MPI_Gatherv | 0 | 7560 | 10.051 | 6.430e-04 | 2.545e-03 | 0.05 | 0.02 |
MPI_Reduce | 8 | 53568 | 9.958 | 0.000e+00 | 2.173e-02 | 0.05 | 0.01 |
MPI_Waitall | 4096 | 3874 | 9.907 | 3.815e-06 | 6.643e-02 | 0.05 | 0.01 |
MPI_Irecv | 2048 | 14891088 | 8.258 | 0.000e+00 | 1.223e-03 | 0.04 | 0.01 |
MPI_Waitall | 16384 | 62952 | 8.205 | 0.000e+00 | 8.803e-03 | 0.04 | 0.01 |
MPI_Allreduce | 81920 | 1728 | 6.956 | 3.955e-03 | 4.849e-03 | 0.04 | 0.01 |
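
The %MPI and %Wall columns can be cross-checked against Total Time. The sketch below assumes (this is not stated in the report) that Total Time is summed over all tasks and that %MPI and %Wall are that row's share of the aggregate MPI time and aggregate wall time respectively. Under that assumption each row implies the same two totals, up to rounding of the percentages. The helper `implied_total` is hypothetical.

```python
from statistics import mean

# (call, buffer size, total time, %MPI, %Wall) copied from the first
# three rows of the Communication Event Statistics table.
rows = [
    ("MPI_Send", 8192, 3859.437, 19.79, 5.77),
    ("MPI_Waitsome", 0, 2191.003, 11.23, 3.27),
    ("MPI_Allreduce", 8, 1892.287, 9.70, 2.83),
]


def implied_total(time_value, percent):
    """Total that time_value would be `percent` percent of (assumed model)."""
    return time_value / (percent / 100.0)


mpi_totals = [implied_total(t, p_mpi) for _, _, t, p_mpi, _ in rows]
wall_totals = [implied_total(t, p_wall) for _, _, t, _, p_wall in rows]

mean_mpi = mean(mpi_totals)
mean_wall = mean(wall_totals)
mpi_fraction = mean_mpi / mean_wall  # share of wall time spent in MPI

print(f"implied aggregate MPI time:  {mean_mpi:.0f}")
print(f"implied aggregate wall time: {mean_wall:.0f}")
print(f"MPI share of wall time:      {mpi_fraction:.1%}")
```

The three rows agree to within rounding, and together they suggest that MPI accounts for a little under 30% of aggregate wall time.
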

[Figure: Load balance by task: HPM counters (by MPI rank, by MPI time)]

[Figure: Load balance by task: memory, flops, timings (by MPI rank, by MPI time)]

[Figure: Communication balance by task, sorted by MPI time (by MPI rank; time detail by MPI time; time detail by rank; call list)]

[Figure: Message Buffer Size Distributions: time]

[Figure: Message Buffer Size Distributions: Ncalls]

[Figure: Communication Topology: point to point data flow]

[Figure: Switch Traffic (volume by node)]

[Figure: Memory usage by node]