Kickoff Meeting Agenda
Scientific Data Management Center (SDM-ISIC)
July 10-11, 2001

Tuesday, July 10, 2001

8:30 – 9:00 Continental Breakfast (provided)
9:00 – 9:10 Introductory remarks, Arie Shoshani
9:10 – 9:30 DoE perspective, Fred Johnson
9:30 – 10:15 Agent technology: Enabling communication among tools and data (ORNL, NCSU)
                      Tom Potok, Mladen Vouk
10:15 – 10:30 Break
10:30 – 12:30 Area 1: Storage and retrieval of very large datasets
                   1.1 Optimization of low-level data storage, retrieval and transport (ORNL)
                          Dan Million (for Randy Burris)
                   1.2 Parallel I/O: improving parallel access from clusters (ANL, NWU)
                          Rob Ross
                   1.3 MPI I/O: implementation based on file-level hints (ANL, NWU)
                          Bill Gropp / Alok Choudhary
                   1.4 Optimizing shared access to tertiary storage (LBNL, ORNL)
                          Arie Shoshani (with Alex Sim)
12:30 – 1:30 Lunch (provided)
1:30 – 3:30 Area 3: Data mining and discovery of access patterns
                   3.1 Adaptive file caching in a distributed system (LBNL)
                          Ekow Otoo (with Frank Olken)
                   3.2 Dimension reduction and sampling (LLNL)
                          Chandrika Kamath
                   3.3 Multi-agent high-dimensional cluster analysis (ORNL)
                          Nagiza Samatova / George Ostrouchov
                   3.4 Analysis of application level query patterns (LLNL, NWU)
                          Ghaleb Abdulla / Alok Choudhary
3:30 – 3:45 Break
3:45 – 4:45 Area 2: Access optimization of distributed data
                   2.1 Low level API for Grid I/O (ANL)
                          Rob Ross
                   2.2 High-dimensional indexing techniques (LBNL)
                          John Wu
4:45 – 6:00 Area 4: Access to distributed, heterogeneous data
                   4.1 Multi-tier metadata system for querying heterogeneous data sources (LLNL, Georgia Tech)
                          Terence Critchlow / Calton Pu
                   4.2 Knowledge-based federation of heterogeneous databases (SDSC)
                          Amarnath Gupta / Bertram Ludäscher
7:00 Dinner at a selected restaurant (by popular vote)

Wednesday, July 11, 2001

8:30 – 9:00 Continental Breakfast (provided)
9:00 – 9:15 Organizational introduction, Arie Shoshani
9:15 – 12:30 4 separate area discussions (includes flexible 15 min break around 10:30)
                   Area 1 Leader: Bill Gropp
                              Members: Dan Million, Alex Sim
                   Area 2 Leader: Alok Choudhary
                              Members: Rob Ross, John Wu, Wei-Keng Liao
                   Area 3 Leader: Nagiza Samatova
                              Members: Chandrika Kamath, Ekow Otoo, Ghaleb Abdulla
                   Area 4 Leader: Terence Critchlow
                              Members: Amarnath Gupta, Bertram Ludäscher, Frank Olken
                  (Tom Potok and Mladen Vouk will join groups they expect to work with.
                   Arie Shoshani will join various groups to check on progress)
12:30 – 1:30 Lunch (provided)
1:00 – 1:30 Oakland Scientific Facility Tour (optional)
1:30 – 3:30 Reports from area leaders + discussions (up to 30 min each)
            Reports should highlight:
                 Task cooperation and integration
                 People involved
                 Application area to be targeted in first year
                 Application contact people and organizations
                 Estimated schedule
3:30 – 4:00 Break
4:00 – 5:30 General discussion and planning
           Topics:
                 Monitoring and reporting progress in each area
                 Coordination between areas
                 Set up advisory committee
                 Intellectual property
                 Web site
                 CVS repositories
                 Future meetings
5:30 Adjourn

Area discussion reports in PPT files

Area 1: Storage and retrieval of very large datasets
Area 2: Access optimization of distributed data
Area 3: Data mining and discovery of access patterns
Area 4: Access to distributed, heterogeneous data