How to store a TB size array in C++ on a cluster

An answer to this question on the Scientific Computing Stack Exchange.

Question

I want to run a huge simulation that requires ~1 TB of data to describe a bunch of interacting particles (each has different interactions). Is it possible to store this data in an array in C++ so that I can access it during the simulation?

I have access to a 60-node cluster. Each node has 2 CPUs, each with 6×16 GB of DDR4, so 192 GB per node, or 11,520 GB ≈ 11.5 TB across the whole cluster. How would you dynamically allocate a 1 TB array on this cluster?

Answer

You could try UPC++, a Partitioned Global Address Space (PGAS) library: each rank contributes memory to a single global address space that spans the nodes, and any rank can read or write remote portions of it through global pointers, without explicit message passing at each access.

A more standard approach would be to learn MPI and partition the array explicitly across ranks. In particular, MPI's one-sided (RMA) operations let each rank expose its slab in a shared window, giving you put/get access to remote memory much like a distributed array.