AMASE: Architecture and Management for Autonomic Science Ecosystems
ANL: Pete Beckman, Raj Kettimuthu, Rajesh Sankaran, Zhengchun Liu
LBNL: Alex Sim, John Wu
Northwestern: Alok Choudhary, Wei-Keng Liao, Ankit Agrawal, Qiao Kang
Goals
- Landscape: Science Internet of Things
- Improve access, performance, and utilization of scientific resources
- Reduce complexity and autonomically tune and optimize
- Smart, self-aware scientific resources
- Design a scalable architecture for smart science ecosystems.
- Embed intelligence in relevant sub-systems via light-weight machine learning.
- Explore methods for distributed and autonomous management of the systems.
Benefits
- Scientists using DOE computing infrastructure will be able to run workflows on automatically selected resources that are dynamically configured and tuned for their application.
- Facility and network operators will have the ability to predict and diagnose problems before they cause downtime.
Challenges
- Streaming data analysis
- continuous and increasing volume of data, often with complex representations and heterogeneity
- variability in multivariate features
- high frequency of data collection
- Streaming data, including network traffic monitoring measurements, exhibit non-linear properties, which makes dimensionality reduction a challenge.
- Deep learning models would be essential to find spatio-temporal variations in data streams with multiple features.
Summary
- Novel analytical approaches towards autonomous systems
- Understanding of the variability in multivariate features
- Handling the high frequency of streaming data collection
- For analysis, event classification and prediction.
- Newly developed analysis methods
- Learn spatio-temporal relationships in streaming data
- Reveal transitions in network operation states
- Offer a new way to classify and predict anomalous states
- Understand variations in patterns for data with multiple features
- Traditionally, individual features are analyzed independently
- Machine learning-based clustering method
- Density-based grid structure and joint distribution methods
- New similarity measures to estimate temporal variations
- Degree of change: based on moving distance of clusters
- Common occupancy rate: similarity based on the Jaccard index applied to the grid structure.
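
The two similarity measures above can be illustrated with a minimal sketch. The poster does not give exact definitions, so everything here is an assumption made for illustration: the function names (`occupied_cells`, `common_occupancy_rate`, `degree_of_change`), the uniform `cell_size`, the `min_points` density threshold, and the simplification that each time window holds a single cluster whose movement is measured by the Euclidean distance between centroids. The actual method may track several clusters and use different thresholds.

```python
import math
from collections import Counter


def occupied_cells(points, cell_size, min_points=1):
    """Map points onto a uniform grid and return the set of dense cells.

    A cell counts as occupied when it holds at least `min_points` samples
    (hypothetical density threshold).
    """
    counts = Counter(
        tuple(math.floor(x / cell_size) for x in point) for point in points
    )
    return {cell for cell, n in counts.items() if n >= min_points}


def common_occupancy_rate(cells_a, cells_b):
    """Jaccard index of two sets of occupied grid cells."""
    union = cells_a | cells_b
    if not union:
        return 1.0  # two empty windows are treated as identical
    return len(cells_a & cells_b) / len(union)


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def degree_of_change(points_a, points_b):
    """Distance the (single, assumed) cluster centroid moved between windows."""
    return math.dist(centroid(points_a), centroid(points_b))


# Two consecutive time windows of two-feature measurements (made-up values).
window_a = [(0.1, 0.1), (0.2, 0.3), (1.5, 1.5)]
window_b = [(0.4, 0.4), (1.2, 1.7), (2.5, 2.5)]

cells_a = occupied_cells(window_a, cell_size=1.0)
cells_b = occupied_cells(window_b, cell_size=1.0)
rate = common_occupancy_rate(cells_a, cells_b)
shift = degree_of_change(window_a, window_b)
```

In this sketch, a common occupancy rate near 1 means the two windows occupy nearly the same grid cells, while a large degree of change means the cluster has moved far in feature space. Either signal could flag a transition in network operation state.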