Investigators: Suren Byna (LBNL), P. Carns (ANL)
Program Manager: Lucy Nowell
Applying Big Data analytics algorithms to the tens of terabytes produced by trillion-particle-scale cosmology and plasma physics simulations requires high-performance data read/write (I/O) functions that keep the time spent in I/O to a minimum. LBNL researchers developed I/O methods and optimizations that drive this I/O time down.
Clusters identified in a 1.4-trillion-particle space weather simulation: spatial distribution of the clusters identified by the clustering algorithm developed in this work. The high-density clusters are mainly localized within the current sheet and appear as narrow structures elongated along the direction of the local magnetic field. Scientists interpret this to mean that the particles comprising the clusters were accelerated in a process in which they gain a fixed amount of energy in a relatively narrow region of space. This observation helped the scientists understand principles of plasma interactions in space weather. (Image Credit: V. Roytershteyn, LANL/SSI)
Researchers achieved near-peak I/O performance on NERSC file systems, enabling the DBSCAN and k-nearest neighbor algorithms to scale to 100,000 cores and leading to first-of-a-kind data analysis and visualizations that helped scientists understand principles of space weather.