OpenFPM_pdata  4.1.0
Project that contains the implementation of distributed structures
 
Vector 7 SPH Dam break simulation with Dynamic load balancing (Optimized version)

SPH with Dynamic load Balancing

This is a rework of the SPH Dam break simulation, optimized to get better performance. We focus on the optimizations and on the differences with the previous example.

See also
Vector 7 SPH Dam break simulation with Dynamic load balancing

Simulation video 1

Simulation video 2
Simulation dynamic load balancing video 1
Simulation dynamic load balancing video 2
Simulation contour perspective 1
Simulation contour perspective 2
Simulation contour perspective 3

Using Verlet list with skin

The first optimization is the use of Verlet lists with a skin. The Verlet lists are reconstructed only when the maximum displacement since the last reconstruction is bigger than half the skin. Because we have to calculate the maximum displacement, the Verlet and Euler integration functions have been modified to compute it. Both functions accept a reference to max_disp, which they fill with the maximum displacement calculated during the step.

void verlet_int(particles & vd, double dt, double & max_disp)
void euler_int(particles & vd, double dt, double & max_disp)

The variable is reset inside the Verlet and Euler time integration functions

max_disp = 0;

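While iterating across the particles, the maximum displacement is saved in the variable max_disp. At this stage the value stored is the squared displacement; the square root is taken later, after the reduction across processors.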

double d2 = dx*dx + dy*dy + dz*dz;
max_disp = (max_disp > d2)?max_disp:d2;
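The bookkeeping described above can be sketched outside OpenFPM with plain containers. This is a minimal sketch under stated assumptions: ToyParticle and toy_euler_int are hypothetical names introduced only for illustration, the update is a bare explicit Euler step on the positions, and the square root is taken at the end of the function because there is a single process (in the example it is taken after the reduction across processors).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the particle set; not an OpenFPM type.
struct ToyParticle {
    std::array<double, 3> x;  // position
    std::array<double, 3> v;  // velocity
};

// Explicit Euler position update that also reports the largest
// displacement of any particle during this step.
void toy_euler_int(std::vector<ToyParticle>& ps, double dt, double& max_disp)
{
    max_disp = 0.0;  // reset at the start of every step
    for (ToyParticle& p : ps) {
        double d2 = 0.0;
        for (int k = 0; k < 3; ++k) {
            const double dx = p.v[k] * dt;
            p.x[k] += dx;
            d2 += dx * dx;
        }
        max_disp = std::max(max_disp, d2);  // compare squared lengths
    }
    max_disp = std::sqrt(max_disp);  // one square root for the whole step
}
```

Comparing squared lengths inside the loop avoids one square root per particle; since the square root is monotonic, taking it once on the maximum gives the same result.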

We also have to be careful: if particles are removed, the Verlet list has to be reconstructed, so in that case we set max_disp to a very large number

Because the maximum displacement has to be calculated across processors, we use the function max in Vcluster to reduce it over all of them.

Vcluster<> & v_cl = create_vcluster();
v_cl.max(max_disp);
v_cl.execute();
max_disp = sqrt(max_disp);

We also calculate the skin and the ghost size. Note that the ghost must be extended from the kernel support 2*H to the support plus the skin, r_gskin

double skin = 0.25 * 2*H;
double r_gskin = 2*H + skin;
// extended boundary around the domain, and the processor domain
// by the support of the cubic kernel
Ghost<3,double> g(r_gskin);
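The choice of a threshold at half the skin follows from the usual argument for Verlet lists with a skin, worked out here as a short derivation. The symbols t_0 (time of the last list reconstruction) and d_max (an upper bound on the displacement of any single particle since t_0) are introduced only for this illustration. Consider a pair (p, q) that was not in the list at t_0, so its distance at that time was larger than r_gskin:

```latex
\begin{aligned}
\text{skin} &= 0.25 \cdot 2H = 0.5\,H \\
r_{gskin} &= 2H + \text{skin} = 2.5\,H \\
|\mathbf{x}_p(t) - \mathbf{x}_q(t)|
  &\ge |\mathbf{x}_p(t_0) - \mathbf{x}_q(t_0)| - 2\,d_{\max}
  > 2H + \text{skin} - 2\,d_{\max}
\end{aligned}
```

As long as 2 d_max < skin, every pair missing from the list is still farther apart than the kernel support 2H, so no interaction is lost. The lists therefore need to be rebuilt only once twice the accumulated displacement reaches the skin.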

As explained before, we update the Verlet list only when the particles have moved more than half the skin (the test 2*tot_disp >= skin). In that case we first remove the particles marked for removal and do a map to redistribute the particles, and at the same point we check whether it is a good moment to rebalance. We combine these two steps because after a rebalance we have to reconstruct the Verlet list anyway. Then we calculate the pressure for all the particles, refresh the ghost, update the Verlet list and reset the total displacement. If the total displacement does not exceed the threshold, we just calculate the pressure for all the particles and refresh the ghost. In this second case we must pass the option SKIP_LABELLING to ghost_get, so that the ghost labelling computed at the last reconstruction is reused.

it_reb++;
if (2*tot_disp >= skin)
{
    vd.remove(to_remove);
    vd.map();
    if (it_reb > 200)
    {
        vd.getDecomposition().redecompose(200);
        vd.map();
        it_reb = 0;
        if (v_cl.getProcessUnitID() == 0)
            std::cout << "REBALANCED " << std::endl;
    }
    // Calculate pressure from the density
    EqState(vd);
    vd.ghost_get<type,rho,Pressure,velocity>();
    vd.updateVerlet(NN,r_gskin,VL_CRS_SYMMETRIC);
    tot_disp = 0.0;
    if (v_cl.getProcessUnitID() == 0)
        std::cout << "RECONSTRUCT Verlet " << std::endl;
}
else
{
    // Calculate pressure from the density
    EqState(vd);
    vd.ghost_get<type,rho,Pressure,velocity>(SKIP_LABELLING);
}

We pass the max_disp variable to the verlet_int and euler_int functions. We also add the maximum displacement of each iteration to the total maximum displacement tot_disp

// Verlet step or Euler step
it++;
if (it < 40)
    verlet_int(vd,dt,max_disp);
else
{
    euler_int(vd,dt,max_disp);
    it = 0;
}
tot_disp += max_disp;
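The interplay between the per-step max_disp, the accumulated tot_disp and the skin can be isolated in a few lines. The following is a minimal sketch, not part of OpenFPM: SkinTracker and count_rebuilds are hypothetical helpers, and for brevity the accumulation and the threshold test are folded into a single call, whereas the example tests at the start of the iteration and accumulates after the integration step.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of OpenFPM: it only tracks when a
// neighbor list built with cutoff r_cut + skin has to be rebuilt.
struct SkinTracker {
    double skin;      // extra radius added to the cutoff
    double tot_disp;  // sum of the per-step maximum displacements

    explicit SkinTracker(double s) : skin(s), tot_disp(0.0) {}

    // Record one time step; return true when the list must be rebuilt.
    bool step(double max_disp)
    {
        tot_disp += max_disp;
        if (2.0 * tot_disp >= skin) {
            tot_disp = 0.0;  // the caller rebuilds, so start counting again
            return true;
        }
        return false;
    }
};

// Count how many rebuilds a sequence of per-step displacements triggers.
int count_rebuilds(double skin, const std::vector<double>& disps)
{
    SkinTracker t(skin);
    int rebuilds = 0;
    for (double d : disps)
        if (t.step(d)) ++rebuilds;
    return rebuilds;
}
```

Summing the per-step maxima overestimates the displacement of any single particle, so the criterion is conservative: it may rebuild earlier than strictly necessary, but never too late.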

Symmetric interaction (Crossing scheme)

Symmetric interactions make it possible to halve the force computation and speed up the simulation. To use them we have to change the function that calculates the forces. Symmetric interactions require writing on the ghost area, so at the beginning of the function we reset the ghost part. At the same point, because the external force (gravity) acts per particle, we also set this force at the beginning.

Warning
The requirement to set per-particle external forces outside the particle loop comes from the symmetric scheme. Suppose we have this pseudocode:
for each particle p
    reset the force for p
    for each neighborhood particle q of p
        calculate the force p-q
        add the contribution to p
        add the contribution to q
Suppose we are on particle p = 0 and calculate the force with q = 10: we add the contribution to both p and q. Unfortunately, according to this cycle, when the outer loop reaches particle p = 10 we reset what we previously calculated. So we have to write:
for each particle p
    reset the force for p
for each particle p
    for each neighborhood particle q of p
        calculate the force p-q
        add the contribution to p
        add the contribution to q
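The two loop structures in the warning can be compared on a toy problem. This is a minimal sketch under stated assumptions: toy_pair_force is a made-up antisymmetric 1D force (not the SPH kernel), every particle is treated as a neighbor of every other, and g is a constant external force. The symmetric version visits each pair once and produces the same result as the full version only because the reset happens in its own loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy 1D pair force used only for illustration (not the SPH kernel):
// the force on p due to q, antisymmetric under swapping p and q.
double toy_pair_force(double xp, double xq)
{
    return xq - xp;
}

// Full scheme: every particle visits all of its neighbors.
std::vector<double> forces_full(const std::vector<double>& x, double g)
{
    std::vector<double> f(x.size());
    for (std::size_t p = 0; p < x.size(); ++p) {
        f[p] = -g;  // resetting inside the loop is safe here
        for (std::size_t q = 0; q < x.size(); ++q)
            if (q != p) f[p] += toy_pair_force(x[p], x[q]);
    }
    return f;
}

// Symmetric scheme: each pair is visited once and updates both particles,
// so the reset must happen in a separate loop before any pair is processed.
std::vector<double> forces_symmetric(const std::vector<double>& x, double g)
{
    std::vector<double> f(x.size());
    for (std::size_t p = 0; p < x.size(); ++p)
        f[p] = -g;  // first loop: reset and apply the external force
    for (std::size_t p = 0; p < x.size(); ++p) {
        for (std::size_t q = p + 1; q < x.size(); ++q) {
            const double fpq = toy_pair_force(x[p], x[q]);
            f[p] += fpq;  // contribution to p
            f[q] -= fpq;  // equal and opposite contribution to q
        }
    }
    return f;
}
```

Moving the line f[p] = -g into the second outer loop of forces_symmetric would erase the contributions that earlier particles already added to p, which is exactly the bug the warning describes.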

With this code we set the per-particle external force to gravity and reset the derivative of the density for the domain particles

// Reset the domain particles
auto itg = vd.getDomainIterator();
while (itg.isNext())
{
    auto p = itg.get();
    // Reset the force and set the external force
    // (- gravity in the z direction)
    vd.template getProp<force>(p)[0] = 0.0;
    vd.template getProp<force>(p)[1] = 0.0;
    vd.template getProp<force>(p)[2] = -gravity;
    // Reset the derivative of the density
    vd.template getProp<drho>(p) = 0.0;
    ++itg;
}

With this code we reset the force and the derivative of the density of the ghost particles

// Reset the ghost particles
auto itg2 = vd.getGhostIterator();
while (itg2.isNext())
{
    auto p = itg2.get();
    // Reset the force (no gravity on the ghost: it is applied
    // once, on the domain particles)
    vd.template getProp<force>(p)[0] = 0.0;
    vd.template getProp<force>(p)[1] = 0.0;
    vd.template getProp<force>(p)[2] = 0.0;
    // Reset the derivative of the density
    vd.template getProp<drho>(p) = 0.0;
    ++itg2;
}

Small changes are needed to iterate over the neighborhood particles

and to skip the self interaction

This is instead an important change (and honestly it took quite a few hours of debugging to discover the problem). When we are on a boundary particle (p = boundary particle) and calculate an interaction with a fluid particle (q = fluid particle), we have to remember that we also have to calculate the force for q (not only drho)

For a fluid particle we instead calculate the p-q interaction and add the contribution to both p and q. Because we do not integrate the boundary particles, we do not need to check whether q is a boundary particle

double factor = - ((Pa + Pb) / (rhoa * rhob) + Tensile(r,rhoa,rhob,Pa,Pb) + Pi(dr,r2,v_rel,rhoa,rhob,massb,visc));
// Bound - Bound does not produce any change
factor = (typea == BOUNDARY && typeb == BOUNDARY)?0.0f:factor;
vd.getProp<force>(a)[0] += massb * factor * DW.get(0);
vd.getProp<force>(a)[1] += massb * factor * DW.get(1);
vd.getProp<force>(a)[2] += massb * factor * DW.get(2);
vd.getProp<force>(b)[0] -= massa * factor * DW.get(0);
vd.getProp<force>(b)[1] -= massa * factor * DW.get(1);
vd.getProp<force>(b)[2] -= massa * factor * DW.get(2);
double scal = (v_rel.get(0)*DW.get(0)+v_rel.get(1)*DW.get(1)+v_rel.get(2)*DW.get(2));
// Bound - Bound does not produce any change
scal = (typea == BOUNDARY && typeb == BOUNDARY)?0.0f:scal;
vd.getProp<drho>(a) += massb*scal;
vd.getProp<drho>(b) += massa*scal;

After the calculation cycle we have to merge the forces and the density derivatives accumulated on the ghost particles into the real particles.

vd.template ghost_put<add_,drho,force>();
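Conceptually, the merge adds each value accumulated on a ghost copy to the real particle that the copy mirrors. The following single-process sketch only illustrates that idea: ToyGhost and toy_ghost_put_add are hypothetical names, the owner index is an assumption made for the illustration, and in the real library the ghost copy and its owner generally live on different processors, so the merge involves communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for a ghost copy: which real particle it mirrors
// and the value accumulated on the copy during the symmetric loop.
struct ToyGhost {
    std::size_t owner;  // index of the real particle
    double value;       // contribution written on the ghost copy
};

// Add every ghost contribution onto the particle that owns it.
void toy_ghost_put_add(std::vector<double>& owned,
                       const std::vector<ToyGhost>& ghosts)
{
    for (const ToyGhost& g : ghosts)
        owned[g.owner] += g.value;  // "add" merge: sum, do not overwrite
}
```

An additive merge is what the symmetric scheme needs: a real particle may have several ghost copies, and each copy carries a partial sum that has to be combined with the part computed locally.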

When we construct our vector of particles it is important to pass the option BIND_DEC_TO_GHOST. To use the symmetric calculation in a parallel environment, the decomposition must be consistent with the cell decomposition of the space.

particles vd(0,domain,bc,g,BIND_DEC_TO_GHOST);

To construct a Verlet-list using the CRS scheme we use the following function

auto NN = vd.getVerletCrs(r_gskin);

To update the Verlet list we use the following

vd.updateVerlet(NN,r_gskin,VL_CRS_SYMMETRIC);

Using re-decompose instead of decompose

Using redecompose instead of decompose produces a decomposition that changes less abruptly during the simulation