Simulations are generating unprecedented amounts of data, driven by the rapidly increasing computational capabilities of leading compute resources. This presents two significant challenges. The first lies in hardware trends: the enormous increases in compute power are not being matched by corresponding increases in bandwidth to storage, and cost and power constrain the feasibility of dramatically larger storage deployments. The second lies in extracting knowledge from these volumes of data. Research in data management infrastructure has created capabilities that can assist in this process, but the available tools are not widely deployed or used. These are not just future challenges: they are already causing bottlenecks that substantially reduce the quality and productivity of scientific research performed on HPC machines.
Leaders (names unavailable), by affiliation: ORNL, ANL
Team Members (names unavailable), by affiliation: ANL, Rutgers, ORNL, NCSU, GA Tech, LBNL, ANL, GA Tech, LBNL
Projects: I/O Frameworks; In Situ Processing and Code Coupling; Indexing; In Situ Data Compression; Parallel I/O and File Formats
FastBit - Efficient Search Technology for Data Driven Science. John Wu, Arie Shoshani (LBNL). Problem: Quickly find records satisfying user-specified conditions in a large, complex data set. Example: in high-energy physics data, find among billions of events the collision events with a given energy level and a specified number of tracks. Solution: Developed new indexing techniques and a new compression method for the…
FastQuery: providing database capabilities for scientific files. K. Wu, S. Byna, A. Shoshani, in collaboration with the LBNL Vis group. Key ideas: Provide uniform array interface for scientific data in…
Scaling Parallel I/O and Analysis to a Trillion Particles. Prabhat (PI), Suren Byna, Oliver Rubel, John Wu (LBNL). Objectives: Ability to analyze very large datasets quickly to enhance scientific…
ADIOS Visualization Schema for VisIt. Dave Pugmire, Gary Liu, Scott Klasky (ORNL). VisIt and ADIOS. VisIt: large-data parallel visualization tool; rich set of visualization and…
15% More Accuracy in Seasonal Hurricane Forecasts through Comparative Climate Networks Analytics. Nagiza Samatova (NCSU/ORNL), Fred Semazzi (NCSU). Objectives: Develop predictive forecasting methodology for climate extremes (e.g., hurricanes,…
Accelerating Science Input/Output on Leadership Platforms. Rob Ross (ANL). Objectives: Standards-based Input/Output (I/O) interfaces are a cornerstone of DOE science codes. The ROMIO MPI-IO…
Visualization for Geo Sciences. Dave Pugmire (ORNL). VisIt and ADIOS. VisIt: large-data parallel visualization tool; rich set of visualization and analysis plots and…
Visualization Support for Fusion. Dave Pugmire (ORNL). VisIt and ADIOS. VisIt: large-data parallel visualization tool; rich set of visualization and analysis plots and…
DIY: Enabling large-scale data-parallel analysis. Tom Peterka (ANL). Main ideas and objectives: Decouple analysis technique (user) from data-intensive parallelism (DIY); enable large-scale…
Scalable In-Memory Data Indexing and Querying for Scientific Simulation Workflows. Manish Parashar (Rutgers). Target application: S3D combustion simulation (Jacqueline Chen and Hemanth Kolla, Sandia National Laboratories)…
In situ code coupling and analysis - an essential capability for advanced large scale simulations. Manish Parashar (Rutgers). Objectives: Provide tools for online and in situ data analytics, e.g. visualization and feature tracking. Enable…
Facilitating In-Situ Analytics for Complex AMR-based Simulation Workflows. Manish Parashar (Rutgers). Objective: Manage dynamic data processing requirements at extreme scales using coordinated algorithm, middleware…
Darshan: Improving I/O performance for scientific applications. Robert Ross (ANL). Application: Darshan collects concise I/O access pattern information from large-scale applications. Goal: Users: improve the…
Big Data Means Big Issues for Exascale Visualization. DOE ASCR DISCOVERY – New Faces. Posted August 8, 2012. When exascale computers begin calculating at a billion billion operations each second, gaining…
I/O bottlenecks and analysis challenges faced by applications running on leadership systems. Visualization of a Type Ia supernova explosion FLASH simulation. FLASH is a multi-scale, multi-physics code used in domains including…
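The FastBit highlight above describes finding events that match a given energy level and a given number of tracks. The general idea behind bitmap indexing, the family of techniques FastBit belongs to, can be sketched as follows. This is only an illustration under stated assumptions: it is not FastBit's API, it omits compression entirely (plain Python integers stand in for bitmaps), and the event table, the column names `energy_bin` and `num_tracks`, and both helper functions are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when values[i] equals it."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def rows_from_bitmap(bitmap):
    """Return the ascending row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical event table: one entry per collision event.
energy_bin = [3, 7, 7, 2, 7, 3]   # binned energy level per event
num_tracks = [4, 5, 4, 4, 5, 5]   # number of tracks per event

energy_index = build_bitmap_index(energy_bin)
tracks_index = build_bitmap_index(num_tracks)

# "energy_bin == 7 AND num_tracks == 5" becomes one bitwise AND of two bitmaps.
hits = energy_index.get(7, 0) & tracks_index.get(5, 0)
matching_rows = rows_from_bitmap(hits)
print(matching_rows)  # [1, 4]
```

Each condition is answered by looking up a precomputed bitmap, and combining conditions costs only a bitwise operation, so the query never scans the raw records. A production index would additionally compress the bitmaps, which this sketch does not attempt.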