As simulations approach petascale and beyond, there is a clear need to analyze and reduce the data they produce before writing it to storage. Automatic analysis enables this data triage and reduction, preparing the data for subsequent I/O and for in-depth analysis and visualization. Building on existing data mining and analysis techniques, we research, develop, and deploy application-driven, architecture-aware techniques for in situ data analysis, filtering, and reduction that minimize and optimize downstream I/O while still enabling flexible post-processing analysis. The result is a collection of scalable libraries intended for adoption by the community at large. In our previous SciDAC work, we successfully transitioned data mining and analysis techniques into tools such as VisIt and ParaView and parallelized several key operations. In SciDAC-3 we continue to deploy new capabilities using a flexible software infrastructure consisting of several lightweight libraries. Our design and implementation exploit advanced multi-core and many-core architectures. Overall, this constitutes an in situ data analysis framework that adapts to a wide variety of data representations, such as regular grids, block-regular structures, AMR models, and unstructured meshes, and that ultimately responds directly to the needs of users.


Leaders:

[name unavailable] (Utah), [name unavailable] (UCD)


Team Members:

[name unavailable], LBNL

[name unavailable], U. Utah

[name unavailable], LBNL

[name unavailable], NWU

[name unavailable], U. Utah

[name unavailable], UC Davis

[name unavailable], ANL

[name unavailable], NCSU

[name unavailable], ORNL

[name unavailable], LBNL


Projects:

Statistical and Data Mining Techniques

Importance-Driven Analysis Techniques

Topological Methods

Vector Field Analysis

Feature-Driven Data Exploration

DIY Block-Parallel Data Analysis

Dmitriy Morozov (LBNL), Tom Peterka (ANL). Objectives: DIY is a programming model and runtime for block-parallel analytics on DOE leadership machines. Its main abstraction is block parallelism: all parallel operations and communications are expressed in terms of blocks, not processors. This enables the same program to run in- and out-of-core with single or…
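
The block-parallel idea described above can be illustrated with a minimal Python sketch. This is not DIY's API: the names `Block`, `decompose`, `assign`, `foreach_block`, and `total` are hypothetical, and the "workers" are simulated in a single process. The sketch only shows that when work is expressed per block rather than per processor, the result does not depend on how many workers the blocks are spread across.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Block:
    gid: int             # global block id, independent of the owning worker
    values: List[float]  # this block's share of the data


def decompose(data: List[float], nblocks: int) -> List[Block]:
    """Split data into nblocks contiguous blocks (hypothetical helper)."""
    n = len(data)
    bounds = [n * i // nblocks for i in range(nblocks + 1)]
    return [Block(gid, data[bounds[gid]:bounds[gid + 1]])
            for gid in range(nblocks)]


def assign(blocks: List[Block], nworkers: int) -> Dict[int, List[Block]]:
    """Round-robin assignment of blocks to simulated workers."""
    owners: Dict[int, List[Block]] = {w: [] for w in range(nworkers)}
    for block in blocks:
        owners[block.gid % nworkers].append(block)
    return owners


def foreach_block(owners: Dict[int, List[Block]],
                  fn: Callable[[Block], float]) -> Dict[int, float]:
    """Apply fn to every block; results are keyed by block id, not worker."""
    results: Dict[int, float] = {}
    for worker_blocks in owners.values():
        for block in worker_blocks:
            results[block.gid] = fn(block)
    return results


def total(data: List[float], nblocks: int, nworkers: int) -> float:
    """Sum the data block by block, then combine the per-block results."""
    owners = assign(decompose(data, nblocks), nworkers)
    partial = foreach_block(owners, lambda b: sum(b.values))
    return sum(partial[gid] for gid in sorted(partial))


if __name__ == "__main__":
    data = list(range(100))
    # Same blocks, different worker counts: the program itself is unchanged.
    print(total(data, nblocks=8, nworkers=1))
    print(total(data, nblocks=8, nworkers=4))
```

Because the per-block function never refers to a worker, the mapping of blocks to workers (or, in a real runtime, to memory versus disk) can change without touching the analysis code.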

Bitmap representation of cassette properties, and finding matching combinations

Gene Context Analysis Now Performed in Seconds

A. Romosan, A. Shoshani, K. Wu, V. Markowitz, K. Mavrommatis. The figure shows the bitmap representation of cassette properties, and finding…

Multivariate combustion data exploration

Multivariate Data Analysis Made Easy

Ayan Biswas, Sumya Dutta, Han-Wei Shen (OSU), Jonathan Woodring (LANL). Objective: Analyze large-scale multivariate scientific data sets…

Cloud rendering on a virtual globe

Analysis and Visualization of Madden-Julian Oscillation (MJO)

Han-Wei Shen, OSU. Application: Climate modeling by Dr. Ruby Leung and Dr. Samson Hagos at PNNL. Simulation Goal: Understand MJO. A complex…

Pixie3D I/O Processing Pipeline

Flexible In-Situ Analytics

Karsten Schwan, Greg Eisenhauer, Matt Wolf, and Ph.D. students, Georgia Institute of Technology, CERCS Research Center. Applications: GTC,…

Topological and Statistical Analytics of Turbulent Combustion

Valerio Pascucci, U-Utah. Application: Turbulent Combustion by Dr. Jackie Chen at SNL, and Dr. John Bell at LBNL. Goal: understanding…

(Left) Presentation of a feature selected in 3D. (Right) Corresponding tracking graph. The color selection (red) used on the feature is used to highlight its time evolution on the graph.

Time-Varying Data Analysis with Time Activity Curves

Valerio Pascucci, U-Utah. Technology: Robust analysis based on topological definitions. Fast parallel evaluation of dependent statistics…

VPIC Strong Scaling with and without In-Situ Visualization

Significant Efficiency Increase in Scientific Workflow Through In Situ Analysis with ParaView Catalyst

Chris Sewell, Jim Ahrens, LANL; Berk Geveci, Patrick O’Leary, Kitware Inc. Objectives: Disk I/O has become a significant bottleneck for large…

Meshing the Universe: In Situ Voronoi and Delaunay Tessellation

Tom Peterka, Juliana Kwan, Adrian Pope, Hal Finkel, Katrin Heitmann, Salman Habib, ANL; George Zagaris, Berk Geveci, Kitware; Wei-keng Liao,…

Predictions of North Atlantic hurricane paths

Refining hurricane forecasts by connecting the dots

ASCR DISCOVERY – At the Universities. Posted June 20, 2012. By graphing points in Earth's roiling atmosphere with the aid of the latest petascale-power…

Parallel Distance Field Computing

Kwan-Liu Ma, UC Davis. Depiction of a distance field computed from a feature surface in data generated by a combustion simulation.…

Glean

GLEAN is a flexible and extensible framework to facilitate simulation-time data analysis and I/O acceleration. Features include:…