v2212: New and improved parallel operation

New multiple master processor support for GAMG agglomeration

The GAMG linear solver supports processor agglomeration at the coarsest level using the masterCoarsest processor agglomerator. This removes communication and increases implicitness, but it also increases the size of the coarsest level, which will degrade performance at higher core counts.

Extend masterCoarsest for multiple master processors

The following table shows timing results obtained when varying the number of master processors. Here, we start with a single master, responsible for 1728 processors, and increase the number of masters to 48, each responsible for 36 processors.

Coarsest-level masters (processors per master)    run 1 (s)    run 2 (s)
 1 (1728)                                         193          219
 2 (864)                                          141          156
 4 (432)                                          167          126
 8 (216)                                          135          114
16 (108)                                          140          126
48 (36)                                           247          -

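Run 2 shows a strong benefit when using more than one master processor: the time drops from 219 s with a single master to 114 s with 8 masters. In run 1, however, 48 masters (247 s) was slower than a single master (193 s). Note that results are sensitive to the cluster configuration and to the cost balance between local computation and communication/explicitness.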
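The gain relative to the single-master configuration can be worked out directly from the timings in the table. The following is a small illustrative Python sketch; the names timings and speedups are hypothetical helpers and not part of OpenFOAM, and the missing run 2 value for 48 masters is skipped.

```python
# Timings (s) copied from the table: number of masters -> (run 1, run 2).
# None marks the run 2 entry that is missing for 48 masters.
timings = {
    1: (193, 219),
    2: (141, 156),
    4: (167, 126),
    8: (135, 114),
    16: (140, 126),
    48: (247, None),
}


def speedups(timings, run):
    """Speed-up of each configuration relative to the single-master time.

    `run` is 0 for run 1 and 1 for run 2. Missing timings are skipped.
    """
    base = timings[1][run]
    return {
        masters: round(base / times[run], 2)
        for masters, times in timings.items()
        if times[run] is not None
    }


run1 = speedups(timings, 0)
run2 = speedups(timings, 1)

for masters in timings:
    print(masters, run1.get(masters), run2.get(masters))
```

A value above 1 means the configuration was faster than the single-master run; a value below 1 means it was slower.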

The test above was performed using complicated dictionary scripting to agglomerate processors manually. In v2212, this has been integrated into the masterCoarsest processor agglomerator via the new nMasters and nProcessorsPerMaster keywords:

{
    solver          GAMG;
    ..
    processorAgglomerator   masterCoarsest;
    nCellsInCoarsestLevel   1;
    nMasters                2;
}

With the following debug switches set in system/controlDict

DebugSwitches
{
    // Print number of processors per master
    masterCoarsest      1;
    // Print agglomeration
    GAMGAgglomeration   1;
}

we can see the effect of using processor agglomeration on a simple case decomposed onto 17 processors:

masterCoarsest : agglomerating
    master  procs
    0       9 (1 2 3 4 5 6 7 8)
    9       8 (10 11 12 13 14 15 16)
GAMGAgglomeration:
    local agglomerator     : faceAreaPair
    processor agglomerator : masterCoarsest

                              nCells       nFaces/nCells         nInterfaces    nIntFaces/nCells     profile
   Level  nProcs         avg     max         avg     max         avg     max         avg     max         avg
   -----  ------         ---     ---         ---     ---         ---     ---         ---     ---         ---
       0      17         719     725       1.922   1.926       3.529       5      0.1093  0.1335   1.797e+04
       1      17         359     362       1.966   2.109       3.529       5      0.1864  0.2632        7291
       2      17         176     181       2.352   2.769       3.529       5       0.272   0.436        2810
       3      17          86      90       2.344   2.593       3.529       5      0.4415  0.7738       930.4
       4      17          42      44       2.291   2.442       3.529       5      0.6747    1.22       320.9
       5      17          20      22       2.094   2.286       3.529       5       0.967   1.895       98.41
       6      17           9      11       1.741       2       3.529       5        1.35   2.444          29
       7      17           4       5       1.149     1.6       3.529       5       2.082     3.5           6
       8       2          15      18       1.585   1.615           1       1      0.5962  0.6923          40
       9       2           8       9       1.417     1.5           1       1      0.7083    0.75          15
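The grouping reported by masterCoarsest above is consistent with splitting the ranks into contiguous blocks and taking the first rank of each block as its master. The Python sketch below reproduces the grouping under that assumption. It is only an illustration, it is not taken from the OpenFOAM source, and the function name group_ranks is hypothetical.

```python
import math


def group_ranks(n_procs, n_masters):
    """Map each master rank to the list of other ranks it collects.

    Assumption (not confirmed against the OpenFOAM implementation):
    ranks are split into contiguous blocks of ceil(n_procs / n_masters)
    and the first rank of each block acts as the master.
    """
    per_master = math.ceil(n_procs / n_masters)
    groups = {}
    for start in range(0, n_procs, per_master):
        end = min(start + per_master, n_procs)
        groups[start] = list(range(start + 1, end))
    return groups


# Same layout as the debug output: master, block size (master included),
# followed by the remaining ranks in the block.
for master, others in group_ranks(17, 2).items():
    print(master, len(others) + 1, others)
```

For 17 processors and 2 masters this gives master 0 with 9 processors and master 9 with 8, matching the debug output. For the 1728-processor case with 48 masters, the same rule yields 48 blocks of 36 processors.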

Improved finite area mesh creation

Parallel creation of finite area meshes using the makeFaMesh utility is now more robust. Changes include:

  • checking for the presence of the volume faceProcAddressing prior to creating an equivalent finite area mapping;
  • support for decomposition of fields with distributed file roots.