v2412: New and improved parallel operation

Improved Non-Blocking Exchange (NBX)

OpenFOAM-v2306 introduced an experimental 'non-blocking consensus exchange' (NBX) option; see the configuration switches.

It can have a dramatic effect at extreme core counts with moving meshes. For example, a case with multiple cyclicAMI patches was run on 8192 cores with the following entry (in etc/controlDict or the local system/controlDict):

OptimisationSwitches
{
    // Additional PstreamBuffers tuning parameters (experimental)
    //    0 : (legacy PEX)
    //        * all-to-all for buffer sizes [legacy approach]
    //        * point-to-point for contents
    //    1 : (hybrid PEX)
    //        * NBX for buffer sizes [new approach]
    //        * point-to-point for contents
    pbufs.tuning    1;
}

This gave the following timings, with the run time roughly a third lower when NBX is enabled:

NBX     Time (s)
No      28k
Yes     18k

Note that NBX does not guarantee that messages are received and consumed in the same order in which they were sent, so truncation errors may accumulate differently. Longer messages, e.g. geometry in cyclicAMI, are more sensitive to this, and care must be taken to use a unique 'tag' (not the default Pstream::msgType()).
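Why the consume order matters for round-off can be illustrated with a minimal, self-contained C++ sketch. It does not use OpenFOAM or MPI; the helper name sumInOrder and the sample values are hypothetical and chosen only to make the effect visible. The same contributions are accumulated in two different orders, standing in for messages that arrive in a different sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenFOAM): accumulate 'values' in the
// sequence given by 'order', a list of indices into 'values'.
// This mimics consuming received contributions in arrival order.
double sumInOrder
(
    const std::vector<double>& values,
    const std::vector<std::size_t>& order
)
{
    double total = 0.0;

    for (const std::size_t i : order)
    {
        assert(i < values.size());  // guard against a bad index
        total += values[i];         // each addition is rounded to double
    }

    return total;
}
```

With the contributions {1e20, 1.0, -1e20}, consuming them in the order (0 1 2) loses the 1.0 when it is added to 1e20, whereas the order (0 2 1) cancels the large values first and retains it. Real cases show far smaller differences, but the mechanism is the same: floating-point addition is not associative, so a changed arrival order can change the accumulated result.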

Source code

Merge request

Improved finite-area framework

Several finite-area framework routines exhibited inconsistent behaviour under parallel operation, affecting both planar and non-planar finite-area meshes, regardless of skewness or non-orthogonality. These inconsistencies mainly stemmed from different algorithms being applied to internal edges and processor edges.

For example, when a corner edge of a non-planar finite-area mesh is shared between two processors, it can introduce a subtle positive/negative perturbation in the tangential direction. Over time, this perturbation may propagate and lead to unexpected flow behaviour, such as film separation.

A series of carefully tested commits improves parallel consistency, mainly in the following core routines:

makeLPN
makeWeights
makeDeltaCoeffs
makeCorrectionVectors
makeSkewCorrectionVectors

Source code

Merge request

Improved reconstructParMesh

The reconstructParMesh utility can now reconstruct finite-area ('faMesh') meshes, which is useful when the finite-area mesh is generated in parallel.

Source code

Merge request

Improved distributed tri-surfaces

The distributedTriSurfaceMesh can now use a different decomposition method from the rest of the simulation, by specifying a decomposition method in the relevant dictionary, e.g. system/snappyHexMeshDict:

box
{
    file "box.obj";
    type distributedTriSurfaceMesh;

    // Override the decomposition method
    numberOfSubdomains  8;
    method              hierarchical;
    n                   (2 2 2);
}

It can optionally run without duplicating triangles by specifying:

decomposeUsingBbs   false;

This avoids excessive memory usage. It also stores a per-vertex normal, so it can be extended to provide smooth normals.

With both changes, non-parallel and parallel runs on a simple mesh now give identical results. In the figure below a slice shows the non-parallel mesh as a blue surface with the parallel mesh as red lines.

This is an intermediate result that reduces memory usage.

Tutorials

Source code

Merge request