OpenFPM_pdata  1.1.0
Project that contains the implementation of distributed structures
Vector 7 SPH Dam break simulation with Dynamic load balancing (Optimized version)

SPH with Dynamic Load Balancing

This is a rework of the SPH Dam break simulation, optimized to get better performance. We will focus on the optimizations and on the differences with the previous example.

See Also
Vector 7 SPH Dam break simulation with Dynamic load balancing
Simulation video 1
Simulation video 2
Simulation dynamic load balancing video 1
Simulation dynamic load balancing video 2
Simulation contour perspective 1
Simulation contour perspective 2
Simulation contour perspective 3

Using a Verlet list with skin

The first optimization that we apply is the use of a Verlet list with skin. The Verlet list is reconstructed only when the maximum particle displacement exceeds half of the skin. Because the maximum displacement has to be tracked, the Verlet and Euler integration functions have been modified: both accept a reference to max_disp, which they fill with the maximum displacement measured during the step.

void verlet_int(particles & vd, double dt, double & max_disp)
void euler_int(particles & vd, double dt, double & max_disp)

The variable is reset inside the Verlet and Euler time integration functions

max_disp = 0;

While iterating over the particles, the maximum squared displacement is tracked in the variable max_disp (the square root is taken later, after the reduction across processors)

double d2 = dx*dx + dy*dy + dz*dz;
max_disp = (max_disp > d2)?max_disp:d2;

We also have to be careful: if we are removing particles we must reconstruct the Verlet list, so in that case we set max_disp to a really big number.
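As a sketch, inside the particle loop of verlet_int and euler_int this can look like the following (the out-of-domain test is abbreviated here and the exact condition is an assumption; to_remove collects the particles to delete):

// sketch: p_out stands for whatever out-of-domain test the simulation uses
if (p_out == true)
{
    // mark the particle for removal ...
    to_remove.add(a.getKey());

    // ... and record a huge displacement, so that the skin check certainly
    // fails and the Verlet list is reconstructed
    max_disp = 100.0;
}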

Because the maximum displacement has to be calculated across processors, we use the function max of Vcluster to reduce max_disp across processors.

Vcluster & v_cl = create_vcluster();
v_cl.max(max_disp);
v_cl.execute();
max_disp = sqrt(max_disp);

We also calculate the skin and the ghost extended by the skin: the ghost must cover the interaction radius plus the skin, giving r_gskin.

double skin = 0.25 * 2*H;
double r_gskin = 2*H + skin;
// extended boundary around the domain, and the processor domain
// by the support of the cubic kernel
Ghost<3,double> g(r_gskin);

As explained before, we update the Verlet list only if the particles have moved more than the skin. In that case we first do a map to redistribute the particles, and in the meanwhile we check whether it is a good moment to rebalance. We combine these two steps because, in case we rebalance, we have to reconstruct the Verlet list anyway. Then we calculate the pressure for all the particles, refresh the ghost, update the Verlet list, and reset the total displacement. In case the total displacement does not overshoot the skin, we just calculate the pressure for all the particles and refresh the ghost; in this case we must use the option SKIP_LABELLING.

it_reb++;
if (2*tot_disp >= skin)
{
    vd.remove(to_remove);
    vd.map();

    if (it_reb > 200)
    {
        vd.getDecomposition().redecompose(200);
        vd.map();

        it_reb = 0;

        if (v_cl.getProcessUnitID() == 0)
            std::cout << "REBALANCED " << std::endl;
    }

    // Calculate pressure from the density
    EqState(vd);

    vd.ghost_get<type,rho,Pressure,velocity>();

    vd.updateVerlet(NN,r_gskin,VL_CRS_SYMMETRIC);

    tot_disp = 0.0;

    if (v_cl.getProcessUnitID() == 0)
        std::cout << "RECONSTRUCT Verlet " << std::endl;
}
else
{
    // Calculate pressure from the density
    EqState(vd);

    vd.ghost_get<type,rho,Pressure,velocity>(SKIP_LABELLING);
}

We pass the max_disp variable to the verlet_int and euler_int functions, and we add the maximum displacement of each iteration to the total displacement.

// VerletStep or euler step
it++;
if (it < 40)
    verlet_int(vd,dt,max_disp);
else
{
    euler_int(vd,dt,max_disp);
    it = 0;
}

tot_disp += max_disp;

Symmetric interaction (Crossing scheme)

Symmetric interaction gives the possibility to reduce the computation by half and speed up your simulation. To use it we have to make some changes to the force calculation function. Symmetric interaction requires writing on the ghost area, so at the beginning of the function we reset the ghost part. At the same time, because the external gravity force acts per particle, we set it while resetting the domain particles.

Warning
The requirement to set the per-particle external forces outside the interaction loop comes from the symmetric scheme. Suppose we have, in pseudocode:

for each particle p
    reset the force for p
    for each neighborhood particle q of p
        calculate the force p-q
        add the contribution to p
        add the contribution to q

Suppose we are on particle p = 0 and calculate the force with q = 10: we add the contribution to both p and q. Unfortunately, following this cycle, when the outer loop reaches p = 10 we reset its force, destroying the contribution we previously accumulated. So we have to write instead:

for each particle p
    reset the force for p
for each particle p
    for each neighborhood particle q of p
        calculate the force p-q
        add the contribution to p
        add the contribution to q

With this code we set the per-particle external force to gravity and reset the density derivative for the domain particles

// Reset the force (to gravity) and the density derivative for the domain particles
auto itg = vd.getDomainIterator();
while (itg.isNext())
{
    auto p = itg.get();

    // Reset the force counter (- gravity on the z direction)
    vd.template getProp<force>(p)[0] = 0.0;
    vd.template getProp<force>(p)[1] = 0.0;
    vd.template getProp<force>(p)[2] = -gravity;

    vd.template getProp<drho>(p) = 0.0;

    ++itg;
}

With this code we reset the force and the density derivative for the particles on the ghost part

// Reset the force and the density derivative for the ghost particles
auto itg2 = vd.getGhostIterator();
while (itg2.isNext())
{
    auto p = itg2.get();

    // Reset the force counter (no gravity here: the ghost only accumulates
    // interaction contributions that are later merged back)
    vd.template getProp<force>(p)[0] = 0.0;
    vd.template getProp<force>(p)[1] = 0.0;
    vd.template getProp<force>(p)[2] = 0.0;

    vd.template getProp<drho>(p) = 0.0;

    ++itg2;
}

Small changes must also be made to the way we iterate over the neighborhood particles: with the CRS scheme we have to skip the self interaction explicitly, as sketched below.
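A minimal sketch of the modified loop, assuming the CRS particle iterator getParticleIteratorCRS together with the Verlet neighborhood iterator (treat the exact calls as assumptions and refer to the full example source):

// iterate the particles following the CRS scheme
auto part = vd.getParticleIteratorCRS(NN);
while (part.isNext())
{
    auto a = part.get();

    // iterate the neighborhood particles of a from the CRS Verlet list
    auto Np = NN.getNNIterator(a);
    while (Np.isNext())
    {
        auto b = Np.get();

        // skip the self interaction
        if (a == b) {++Np; continue;}

        // ... calculate the symmetric interaction a-b here ...

        ++Np;
    }

    ++part;
}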

This is instead an important change (and honestly it took quite some hours of debugging to discover the problem). In case we are on a boundary particle (p = boundary particle) calculating an interaction with a fluid particle q, we have to remember that we also have to calculate the force for q (not only drho); see the sketch below.
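A hedged sketch of that boundary branch, reusing the names of the fluid snippet shown further below (factor, scal, massa, massb, DW and the interaction terms are defined as there; the exact expressions are an assumption modeled on the fluid case):

// p (here a) is a boundary particle: it is never integrated, so no force
// is accumulated on a, but with the symmetric scheme the reaction on the
// fluid neighbor b must still be computed
double factor = - ((Pa + Pb) / (rhoa * rhob) + Tensile(r,rhoa,rhob,Pa,Pb) + Pi(dr,r2,v_rel,rhoa,rhob,massb,visc));

// Bound - Bound does not produce any change
factor = (typeb == BOUNDARY)?0.0f:factor;

vd.getProp<force>(b)[0] -= massa * factor * DW.get(0);
vd.getProp<force>(b)[1] -= massa * factor * DW.get(1);
vd.getProp<force>(b)[2] -= massa * factor * DW.get(2);

// the density derivative is instead updated on both sides
double scal = (v_rel.get(0)*DW.get(0)+v_rel.get(1)*DW.get(1)+v_rel.get(2)*DW.get(2));
scal = (typeb == BOUNDARY)?0.0f:scal;

vd.getProp<drho>(a) += massb*scal;
vd.getProp<drho>(b) += massa*scal;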

For a fluid particle p, instead, we calculate the p-q interaction and add the contribution to both p and q. Because we do not integrate over the boundary particles, we can also avoid checking whether q is a boundary particle

double factor = - ((Pa + Pb) / (rhoa * rhob) + Tensile(r,rhoa,rhob,Pa,Pb) + Pi(dr,r2,v_rel,rhoa,rhob,massb,visc));
// Bound - Bound does not produce any change
factor = (typea == BOUNDARY && typeb == BOUNDARY)?0.0f:factor;
vd.getProp<force>(a)[0] += massb * factor * DW.get(0);
vd.getProp<force>(a)[1] += massb * factor * DW.get(1);
vd.getProp<force>(a)[2] += massb * factor * DW.get(2);
vd.getProp<force>(b)[0] -= massa * factor * DW.get(0);
vd.getProp<force>(b)[1] -= massa * factor * DW.get(1);
vd.getProp<force>(b)[2] -= massa * factor * DW.get(2);
double scal = (v_rel.get(0)*DW.get(0)+v_rel.get(1)*DW.get(1)+v_rel.get(2)*DW.get(2));
// Bound - Bound does not produce any change
scal = (typea == BOUNDARY && typeb == BOUNDARY)?0.0f:scal;
vd.getProp<drho>(a) += massb*scal;
vd.getProp<drho>(b) += massa*scal;

After the calculation cycle we have to merge the forces and the density derivative calculated on the ghost into the real particles.

vd.template ghost_put<add_,drho,force>();

It is important, when we construct our vector of particles, to pass the option BIND_DEC_TO_GHOST: to use the symmetric calculation in a parallel environment, the decomposition must be consistent with the cell decomposition of the space.

particles vd(0,domain,bc,g,BIND_DEC_TO_GHOST);

To construct a Verlet list using the CRS scheme we use the following function

auto NN = vd.getVerletCrs(r_gskin);

while to update the Verlet list we use the following

vd.updateVerlet(NN,r_gskin,VL_CRS_SYMMETRIC);

Using re-decompose instead of decompose

Using redecompose instead of decompose produces fewer jumps in the decomposition during the simulation: the domain decomposition evolves gradually instead of changing abruptly at each rebalance.
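For reference, this is the call already used in the rebalancing branch of the time loop; the argument matches the rebalancing interval (200 iterations in this example):

// refine the current decomposition instead of computing a new one from
// scratch, so consecutive decompositions stay close to each other
vd.getDecomposition().redecompose(200);

// redistribute the particles according to the refined decomposition
vd.map();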