v2412: New and improved parallel operation
OpenFOAM-v2306 introduced an experimental 'non-blocking consensus exchange' (NBX) option (see configuration switches).
It can have a dramatic effect at extreme core counts with moving meshes. For a case with multiple cyclicAMI patches on 8192 cores, setting the following entry (in etc/controlDict or the local system/controlDict):
OptimisationSwitches
{
    // Additional PstreamBuffers tuning parameters (experimental)
    //   0 : (legacy PEX)
    //       * all-to-all for buffer sizes [legacy approach]
    //       * point-to-point for contents
    //   1 : (hybrid PEX)
    //       * NBX for buffer sizes [new approach]
    //       * point-to-point for contents
    pbufs.tuning    1;
}
gave the following timings:
| NBX | Time (s) |
|-----|----------|
| No  | 28k      |
| Yes | 18k      |
Note that NBX does not guarantee that messages are received and consumed in the order in which they were sent, so truncation errors may accumulate differently. The effect is more pronounced for longer messages, e.g. geometry in cyclicAMI, and care must be taken to use a unique 'tag' (not the default Pstream::msgType()).
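The sensitivity to consumption order comes from floating-point addition not being associative. The following is a minimal Python sketch of that effect only; it does not use OpenFOAM or MPI, and the contribution values and the two orderings are made up for illustration.

```python
import math


def accumulate(contributions, order):
    """Sum the contributions left to right in the given consumption order."""
    total = 0.0
    for idx in order:
        total += contributions[idx]
    return total


# Hypothetical per-rank contributions with widely varying magnitudes
contributions = [1.0e17, 1.0, -1.0e17, 1.0]

send_order = [0, 1, 2, 3]     # consumed in the order sent
arrival_order = [0, 2, 1, 3]  # one possible out-of-order arrival

sum_send = accumulate(contributions, send_order)
sum_arrival = accumulate(contributions, arrival_order)
sum_exact = math.fsum(contributions)  # correctly rounded reference

print(sum_send, sum_arrival, sum_exact)
```

The same four numbers give different totals depending only on the order in which they are consumed, so small run-to-run differences are to be expected whenever the receive order is not fixed.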
Source code
Merge request
Several finite-area framework routines exhibited inconsistent behaviour in parallel operation, affecting both planar and non-planar finite-area meshes, regardless of skewness or non-orthogonality. These inconsistencies mainly stemmed from different algorithms being applied to internal edges and processor edges.
For example, when a corner edge of a non-planar finite-area mesh is shared between two processors, it can introduce a subtle positive/negative perturbation in the tangential direction. Over time, this perturbation may propagate and lead to unexpected flow behaviour, such as film separation.
A series of carefully tested commits improves parallel consistency, mainly in the following core routines:
- makeLPN
- makeWeights
- makeDeltaCoeffs
- makeCorrectionVectors
- makeSkewCorrectionVectors
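One possible way for two code paths to disagree can be illustrated with a small Python sketch. The two weight formulas below are hypothetical stand-ins chosen for illustration, not the expressions used in the routines listed above, and the points P, E, N (owner centre, edge centre, neighbour centre) are made-up values. The sketch only shows that two formulas which agree when the three points are collinear can diverge once the surface folds at the edge.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mag(a):
    return math.sqrt(dot(a, a))


def weight_via_edge_centre(P, E, N):
    """Owner weight from the two segment lengths P->E and E->N."""
    l_pe = mag(sub(E, P))
    l_en = mag(sub(N, E))
    return l_en / (l_pe + l_en)


def weight_via_chord(P, E, N):
    """Owner weight from projecting E->N onto the straight chord P->N."""
    d = sub(N, P)
    return dot(d, sub(N, E)) / dot(d, d)


P, N = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0)
E_flat = (1.0, 0.0, 0.0)  # edge centre on the line P-N
E_fold = (1.0, 0.0, 0.5)  # edge centre lifted out of line (folded surface)

w_flat_a = weight_via_edge_centre(P, E_flat, N)
w_flat_b = weight_via_chord(P, E_flat, N)
w_fold_a = weight_via_edge_centre(P, E_fold, N)
w_fold_b = weight_via_chord(P, E_fold, N)

print(w_flat_a, w_flat_b)  # the two formulas agree
print(w_fold_a, w_fold_b)  # the two formulas differ
```

If one formula were used on internal edges and the other on processor edges, the result would depend on where the decomposition happens to cut the mesh, which is the kind of inconsistency that using a single algorithm on both edge types avoids.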
Source code
- $FOAM_SRC/finiteArea/faMesh/faMeshDemandDrivenData.C
- $FOAM_SRC/finiteArea/faMesh/faPatches/constraint/processor/processorFaPatch.C
Merge request
The reconstructParMesh utility can now reconstruct finite-area ('faMesh') meshes, which is useful when the finite-area mesh has been generated in parallel.
Source code
Merge request
The distributedTriSurfaceMesh can now use a different decomposition method from the rest of the simulation, by specifying the decomposition method in the relevant dictionary, e.g. system/snappyHexMeshDict:
box
{
    file    "box.obj";
    type    distributedTriSurfaceMesh;

    // Override the decomposition method
    numberOfSubdomains  8;
    method  hierarchical;
    n       (2 2 2);
}
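In the entry above the three directional counts multiply to the number of subdomains (2 x 2 x 2 = 8). The following Python sketch shows the general idea of a hierarchical split: sort along x and cut into equal chunks, then repeat along y and z within each chunk. It is a simplified illustration under that assumption, not OpenFOAM's implementation, and the function names and the 4 x 4 x 4 point cloud are hypothetical.

```python
from math import prod


def split_sorted(indices, points, axis, n_parts):
    """Sort indices along one axis and cut them into n_parts nearly equal chunks."""
    ordered = sorted(indices, key=lambda i: points[i][axis])
    size, extra = divmod(len(ordered), n_parts)
    chunks, start = [], 0
    for p in range(n_parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(ordered[start:end])
        start = end
    return chunks


def hierarchical_decompose(points, n):
    """Return a subdomain id per point, splitting in x, then y, then z."""
    groups = [list(range(len(points)))]
    for axis, n_parts in enumerate(n):
        groups = [
            chunk
            for group in groups
            for chunk in split_sorted(group, points, axis, n_parts)
        ]
    owner = [None] * len(points)
    for domain, group in enumerate(groups):
        for i in group:
            owner[i] = domain
    return owner


n = (2, 2, 2)
number_of_subdomains = prod(n)  # matches numberOfSubdomains 8 in the entry above

# Hypothetical data: a regular 4 x 4 x 4 cloud of points
points = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
owner = hierarchical_decompose(points, n)
counts = [owner.count(d) for d in range(number_of_subdomains)]

print(number_of_subdomains, counts)
```

Each of the eight subdomains receives the same number of points here because the cloud is regular; on real surfaces the chunks are only approximately balanced.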
It can optionally run without duplicating triangles by specifying:
decomposeUsingBbs false;
This avoids excessive memory usage. It also stores a per-vertex normal, so it can be extended to support smooth normals.
With both changes we now recover identical results from running non-parallel or parallel on a simple mesh. In the figure below a slice shows the non-parallel mesh as a blue surface with the parallel mesh as red lines.
This is an intermediate result, but it already reduces memory usage.
Tutorials
Source code
Merge request

