Prabhat (PI), Suren Byna, Oliver Rubel, John Wu, LBNL
Objectives
- Ability to analyze very large datasets quickly to enhance scientific understanding and discovery
- Enhance I/O on HDF5, a popular file format used by many application domains
- Demonstrate these capabilities on Trillion particle plasma physics simulation
Accomplishments
- Trillion particle plasma physics simulation conducted on 120,000 cores @NERSC
- Enhanced Parallel HDF5 obtained peak 35GB/s, and 80% sustained I/O rate
- FastBit was used to index 30TB timestep in 10 minutes and query in 3 seconds
Impact
- Software enabled scientists to search and gain insights from the trillion particle dataset for the first time:
- Confinement of energetic particles by the flux ropes
- Asymmetric distribution of particles near the reconnection hot-spot
Magnetic reconnection from a plasma physics simulation (Left). Scientists were able to query and find an asymmetric distribution of particles near the reconnection event (Right) using our software tools.
Surendra Byna et al., "Parallel I/O, Analysis, and Visualization of a Trillion Particle Simulation". Supercomputing conference, SC’12, November 2012.
Notes:
The slide highlights recent accomplishments from the ExaHDF5 project, carried out in collaboration with SDAV staff.
1) Parallel I/O with HDF5
We ran a trillion particle simulation on 120K cores on Hopper. The code produced 30 TB of particle data *per timestep*.
- To the best of our knowledge, this is the first time that anyone has demonstrated writes to a single, shared 30 TB HDF5 file
- We hit peak I/O rates on Hopper (~35 GB/s) for brief intervals during the run and averaged ~23 GB/s, which is a new record for parallel HDF5 performance
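To put these rates in perspective, the short sketch below estimates how long it takes to write one 30 TB timestep at the peak and average rates quoted above. It assumes decimal units (1 TB = 1000 GB), which these notes do not specify, and the function name is illustrative only.

```python
def write_time_seconds(size_tb: float, rate_gb_per_s: float) -> float:
    """Time to write size_tb terabytes at rate_gb_per_s.

    Assumption: decimal units, 1 TB = 1000 GB.
    """
    return size_tb * 1000.0 / rate_gb_per_s


TIMESTEP_TB = 30.0  # particle data per timestep (from the notes)
PEAK_GBPS = 35.0    # approximate peak rate on Hopper (from the notes)
AVG_GBPS = 23.0     # approximate average rate (from the notes)

t_peak = write_time_seconds(TIMESTEP_TB, PEAK_GBPS)
t_avg = write_time_seconds(TIMESTEP_TB, AVG_GBPS)

print(f"at peak rate:    {t_peak / 60:.1f} min per timestep")
print(f"at average rate: {t_avg / 60:.1f} min per timestep")
```

Under these assumptions a single timestep takes roughly 14 minutes to write at the peak rate and roughly 22 minutes at the average rate.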
2) FastBit-based analysis
- We developed a novel hybrid parallel version of FastBit to do the indexing/querying on the dataset
- This was the first time that we used FastBit and FastQuery to index and query a dataset with Trillion entries
- We were able to index the dataset in 10 minutes and query the dataset in 3 seconds
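The idea behind bitmap-index queries of this kind can be sketched in a few lines. The example below is not FastBit's or FastQuery's API: it is a minimal, uncompressed, single-process illustration (FastBit itself uses compressed bitmaps), and the column names, bin edges, and values are made up for the example. Each column gets one bitmap per value bin, and a multi-column range query becomes bitwise OR and AND operations on those bitmaps rather than a scan of the raw data.

```python
from bisect import bisect_right


def build_bitmap_index(values, bin_edges):
    """Return one bitmap (a Python int used as a bitset) per bin.

    Bin i holds values v with bin_edges[i] <= v < bin_edges[i + 1].
    Bit r of a bitmap is set when row r falls in that bin.
    Values outside the edges are ignored.
    """
    bitmaps = [0] * (len(bin_edges) - 1)
    for row, v in enumerate(values):
        b = bisect_right(bin_edges, v) - 1
        if 0 <= b < len(bitmaps):
            bitmaps[b] |= 1 << row
    return bitmaps


def query_at_least(bitmaps, bin_edges, threshold):
    """Bitset of rows with value >= threshold (threshold must be a bin edge)."""
    first_bin = bin_edges.index(threshold)
    result = 0
    for bm in bitmaps[first_bin:]:
        result |= bm
    return result


def rows_of(bitmap):
    """Decode a bitset into a sorted list of row numbers."""
    return [r for r in range(bitmap.bit_length()) if (bitmap >> r) & 1]


# Hypothetical particle columns (illustrative values only).
energy = [0.2, 1.7, 0.9, 2.4, 1.1, 0.1]
px = [-0.5, 0.3, 0.8, -0.2, 0.6, 0.9]
energy_edges = [0.0, 1.0, 2.0, 3.0]
px_edges = [-1.0, 0.0, 1.0]

energy_index = build_bitmap_index(energy, energy_edges)
px_index = build_bitmap_index(px, px_edges)

# "energy >= 1.0 AND px >= 0.0" answered with bitwise operations only.
hits = (query_at_least(energy_index, energy_edges, 1.0)
        & query_at_least(px_index, px_edges, 0.0))
print(rows_of(hits))  # [1, 4]
```

Because the query threshold coincides with a bin edge here, the bitmaps alone answer it exactly; a threshold falling inside a bin would additionally require checking the raw values of that one bin.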
3) Scientific insights
- This is the first time that our science collaborators have been able to examine the trillion particle dataset. Previously, they had either largely ignored the particle data or looked only at a coarse-grained version
- Our collaborators had made a number of conjectures and hypotheses regarding the interplay between particles and the magnetic fields, and the multi-dimensional phase-space distribution of particles. Using these new tools, they were able to confirm these hypotheses quantitatively. More specifically, the scientists found:
- a preferential acceleration of particles in a direction parallel to the magnetic field
- energetic particles carrying a significant amount of current, even at early timesteps in the simulation
- predominant distribution of energetic particles in the current sheet, suggesting that flux ropes can confine these particles
- agyrotropic (asymmetric) distribution of particles near the magnetic reconnection event
DOE researchers: Prabhat (PI), Suren Byna, Oliver Rubel and John Wu (LBNL)
Scientific collaborators: Homa Karimabadi (UCSD), Vadim Roytershteyn (UCSD) and Bill Daughton (LANL)
Simulation code used in the study is VPIC, developed at LANL.