v2212: New and improved parallel operation
The GAMG linear solver supports processor agglomeration at the coarsest level using the masterCoarsest processor agglomerator method. This removes communication and increases implicitness, but it also increases the size of the coarsest level, which will degrade performance at higher core counts.
Extend masterCoarsest for multiple master processors
The following table shows timing results obtained when varying the number of master processors. Here, we start with a single master, responsible for 1728 processors, and increase the number of masters to 48, each responsible for 36 processors.
Masters (processors per master) | Run 1 (s) | Run 2 (s) |
---|---|---|
1 (1728) | 193 | 219 |
2 (864) | 141 | 156 |
4 (432) | 167 | 126 |
8 (216) | 135 | 114 |
16 (108) | 140 | 126 |
48 (36) | 247 | - |
Run 2 shows a strong benefit from using more than one master processor. Note that results are sensitive to the cluster configuration and to the balance between the cost of local computation and the cost of communication/explicitness.
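The relative gains in the table can be checked with a few lines of arithmetic. The Python sketch below copies the timings from the table and computes the speed-up of each configuration relative to the single-master run. The names `timings`, `procs_per_master` and `speedup` are illustrative only and are not part of OpenFOAM.

```python
# Timings copied from the table above: number of masters -> (run 1 s, run 2 s).
# None marks the configuration for which no run 2 result is reported.
timings = {
    1: (193, 219),
    2: (141, 156),
    4: (167, 126),
    8: (135, 114),
    16: (140, 126),
    48: (247, None),
}

TOTAL_PROCS = 1728  # total number of processors used in the test


def procs_per_master(n_masters, total=TOTAL_PROCS):
    """Processors handled by each master, assuming an even split."""
    if total % n_masters != 0:
        raise ValueError("total is not a multiple of n_masters")
    return total // n_masters


def speedup(run_index):
    """Speed-up of each configuration relative to the single-master run.

    run_index is 0 for run 1 and 1 for run 2; missing results are skipped.
    """
    base = timings[1][run_index]
    return {
        n: round(base / t[run_index], 2)
        for n, t in timings.items()
        if t[run_index] is not None
    }


if __name__ == "__main__":
    for n in timings:
        print(f"{n} masters -> {procs_per_master(n)} processors per master")
    print("run 1 speed-up:", speedup(0))
    print("run 2 speed-up:", speedup(1))
```

For these figures, the best Run 2 result (8 masters) is roughly 1.9 times faster than the single-master run, whereas the 48-master Run 1 result is slower than the single-master baseline.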
The test above was performed using complicated dictionary scripting to manually agglomerate processors. In v2212, this has been integrated into the masterCoarsest processor agglomeration using the new nMasters or nProcessorsPerMaster keywords:
{
solver GAMG;
..
processorAgglomerator masterCoarsest;
nCellsInCoarsestLevel 1;
nMasters 2;
}
With debug switches in the system/controlDict
DebugSwitches
{
// Print number of processors per master
masterCoarsest 1;
// Print agglomeration
GAMGAgglomeration 1;
}
we can see the effect of using processor agglomeration on a simple case decomposed onto 17 processors:
masterCoarsest : agglomerating
master procs
0 9 (1 2 3 4 5 6 7 8)
9 8 (10 11 12 13 14 15 16)
GAMGAgglomeration:
local agglomerator : faceAreaPair
processor agglomerator : masterCoarsest
               nCells        nFaces/nCells   nInterfaces   nIntFaces/nCells  profile
Level  nProcs  avg    max    avg     max     avg    max    avg      max      avg
-----  ------  ---    ---    ---     ---     ---    ---    ---      ---      ---
0      17      719    725    1.922   1.926   3.529  5      0.1093   0.1335   1.797e+04
1      17      359    362    1.966   2.109   3.529  5      0.1864   0.2632   7291
2      17      176    181    2.352   2.769   3.529  5      0.272    0.436    2810
3      17      86     90     2.344   2.593   3.529  5      0.4415   0.7738   930.4
4      17      42     44     2.291   2.442   3.529  5      0.6747   1.22     320.9
5      17      20     22     2.094   2.286   3.529  5      0.967    1.895    98.41
6      17      9      11     1.741   2       3.529  5      1.35     2.444    29
7      17      4      5      1.149   1.6     3.529  5      2.082    3.5      6
8      2       15     18     1.585   1.615   1      1      0.5962   0.6923   40
9      2       8      9      1.417   1.5     1      1      0.7083   0.75     15
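The grouping reported in the masterCoarsest output above is consistent with splitting the ranks into contiguous blocks of ceil(nProcs/nMasters) processors, with the first rank of each block acting as master. The Python sketch below illustrates that rule only. It is an assumption inferred from the log, not the OpenFOAM implementation, and the function names `group_by_master` and `format_log` are hypothetical.

```python
def group_by_master(n_procs, n_masters):
    """Split ranks 0..n_procs-1 into contiguous blocks, one per master.

    Assumption: each block holds ceil(n_procs / n_masters) ranks (the last
    block may be smaller) and the first rank of a block is its master.
    Returns a dict mapping master rank -> list of the other ranks in its block.
    """
    if n_procs <= 0 or n_masters <= 0:
        raise ValueError("n_procs and n_masters must be positive")

    per_master = -(-n_procs // n_masters)  # ceiling division

    groups = {}
    for start in range(0, n_procs, per_master):
        block = list(range(start, min(start + per_master, n_procs)))
        groups[block[0]] = block[1:]
    return groups


def format_log(groups):
    """Format groups as 'master blockSize (other ranks)', one line per master."""
    lines = []
    for master, others in groups.items():
        ranks = " ".join(str(r) for r in others)
        lines.append(f"{master} {len(others) + 1} ({ranks})")
    return lines


if __name__ == "__main__":
    # The 17-processor, nMasters 2 case from the log above
    for line in format_log(group_by_master(17, 2)):
        print(line)
```

Under this rule the blocks are equal in size only when the processor count is a multiple of the number of masters. In the 17-processor case the two masters are responsible for 9 and 8 processors respectively, as in the log.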
Tutorials
- $FOAM_TUTORIALS/compressible/rhoSimpleFoam/squareBendLiq (use of masterCoarsest)
Source code
Parallel creation of finite area meshes using the makeFaMesh utility is now more robust. Changes include:
- checking for the presence of volume faceProcAddressing prior to creating an equivalent finite area mapping;
- support for decomposition of fields with distributed file roots.