Skip to content

Optimizing n-body simulation

An answer to this question on Stack Overflow.

Question

I'm trying to optimize the n-body algorithm, I have seen that the most expensive function is this:

real3 bodyBodyInteraction(real iPosx, real iPosy, real iPosz, 
                          real jPosx, real jPosy, real jPosz, real jMass)
{
  real rx, ry, rz;
  rx = jPosx - iPosx;
  ry = jPosy - iPosy;
  rz = jPosz - iPosz;
  real distSqr = rx*rx+ry*ry+rz*rz;
  distSqr += SOFTENING_SQUARED;
  
   real s = jMass / POW(distSqr,3.0/2.0); //very expensive
   
  real3 f;
  f.x = rx * s;
  f.y = ry * s;
  f.z = rz * s;
  return f;
}

Using perf record I can see the division is the most expensive instruction and this one have a O(n^2) complexity, but I don't really know how to optimize it.

Answer

Convert

for(int i=0;i<N;i++)
for(int j=0;j<N;j++)

into

for(int i=0;  i<N;i++)
for(int j=i+1;j<N;j++)

Restructure to take advantage of SIMD operators, this can quadruple your throughput.

Use OpenMP to parallelize the loops either across your CPU or by offloading to your GPU (OpenMP 4.5+).

Learn about the Barnes-Hut algorithm, which groups particles to achieve O(N log N) complexity (down from your O(N^2)).