v2112: New and improved parallel operation

New decomposition constraints for AMI patches

TOP

The decomposePar utility offers many options to fine-tune decomposition strategies, controlled using the decomposeParDict file. The preservePatches constraint can be used to keep owner and neighbour cells of coupled patches on the same processor. It previously only handled one-to-one coupling, e.g. for cyclic and processor patches, but has been extended to handle many-to-one coupling, e.g. cyclicAMI and cyclicACMI, for example:

// Optional decomposition constraints
constraints
{
    preservePatches
    {
        type            preservePatches;
        patches         (cyclic0 cyclic1 ami0 ami1);
    }
}

Decomposition without synchronising across cyclicAMI patches shows the destination of each cell on both sides of the cyclicAMI (thick green line):

Decomposition with synchronisation across cyclicAMI patches:

In this last case the coupled cells are on the same processor. For most non-collocated AMI patches the rule will cause all cells on either side to be on a single processor.

Tutorials

Source code

Improved finite area parallel workflows

TOP

Parallel creation of finiteArea meshes

The parallel creation of finiteArea meshes has been reworked to be more robust and use a simpler matching algorithm that addresses processor edge-edge connectivity without the need for any additional ad-hoc handling.

The calculation and exchange of neighbour-neighbour face areas/normals and point corrections from neighbouring patches is now handled centrally as a general halo layer calculation, which properly handles on-processor or off-processor faces irrespective of any explicit processor patches.

Addtional makeFaMesh options

The makeFaMesh utility now has additional -dry-run and -write-vtk options to aid with setup, debugging and diagnosis.

When running makeFaMesh in parallel, it not only creates the finiteArea mesh on top of the existing parallel volume mesh, it also creates procAddressing and decomposes the finiteArea fields in a fashion similar to decomposePar. This is normally necessary since the finiteArea fields would otherwise not be present in parallel. For some workflows it can be preferable to suppress some of these operations.

The additional option -no-decompose option suppress both procAddressing creation and field decomposition, whereas the option -no-fields simply suppresses field decomposition.

Improvements to checkFaMesh

Running checkFaMesh in parallel now provides the expected information. It also has new -write-vtk option, which allows it be used to quickly generate a VTK output file for display and diagnosis of an existing finiteArea mesh.

Threaded writing

TOP

The collated file format was introduced in OpenFOAM v1712 to combine per time step, per field, the output of all processors into a single file. It has the benefit of drastically reducing the number of files, but can also cause the node that performs the writing to become a potential bottleneck. To circumvent the issue the writing can be restricted to a thread such that the main code does not have to wait for the I/O to finish.

To guarantee that the I/O can finish, the thread itself might do MPI communication in case the provided thread buffer size is not large enough. This introduces the problem that the MPI needs to be thread aware. In this version users can specify a negative buffer size to indicate that the provided buffer is large enough to cover any writing. This causes the MPI to be initialised with only the main thread requiring MPI support.

For a 'normal' buffer size specification the log file will e.g. show:

I/O    : collated [threaded] (maxThreadFileBufferSize = 1e+09).
         Requires buffer large enough to collect all data or thread support
         enabled in MPI. If MPI thread support cannot be enabled, deactivate
         threading by setting maxThreadFileBufferSize to 0 in
         OpenFOAM etc/controlDict
Case   : tutorials/incompressible/icoFoam/cavity/cavity

For the negative buffer size specification:

I/O    : collated [threaded] (maxThreadFileBufferSize = -1e+09).

Note that these values can be changed in the etc/controlDict (or the site/user version) or supplied through the command line:

mpirun -np 2 icoFoam -parallel -fileHandler collated -opt-switch maxThreadFileBufferSize=-5E9

These settings cannot be supplied in the case-specific system/controlDict since this is read and interpreted only after parallel setup. In addition, some bugs were addressed that allow the writing to complete after the main code has finished.

Tutorials

Source code