Kickoff Meeting Agenda
Scientific Data Management Center (SDM-ISIC)
July 10-11, 2001

 

Tuesday, July 10, 2001

8:30 – 9:00  Continental Breakfast (provided)
9:00 – 9:10  Introductory remarks, Arie Shoshani
9:10 – 9:30  DoE perspective, Fred Johnson
9:30 – 10:15  Agent technology: Enabling communication among tools and data (ORNL, NCSU)
                      Tom Potok, Mladen Vouk
10:15 – 10:30  Break
10:30 – 12:30  Area 1: Storage and retrieval of very large datasets
                   1.1  Optimization of low-level data storage, retrieval and transport (ORNL)
                          Dan Million (for Randy Burris)
                   1.2  Parallel I/O: improving parallel access from clusters (ANL, NWU)
                          Rob Ross

                   1.3  MPI I/O: implementation based on file-level hints (ANL, NWU)

                          Bill Gropp / Alok Choudhary
                   1.4  Optimizing shared access to tertiary storage (LBNL, ORNL)
                          Arie Shoshani (with Alex Sim)

12:30 – 1:30 Lunch (provided)

1:30 – 3:30  Area 3: Data mining and discovery of access patterns
                   3.1  Adaptive file caching in a distributed system (LBNL)
                          Ekow Otoo (with Frank Olken)

                   3.2  Dimension reduction and sampling (LLNL)
                          Chandrika Kamath

                   3.3  Multi-agent high-dimensional cluster analysis (ORNL)
                          Nagiza Samatova / George Ostrouchov

                   3.4  Analysis of application level query patterns (LLNL, NWU)
                          Ghaleb Abdulla / Alok Choudhary
3:30 – 3:45  Break
3:45 – 4:45  Area 2: Access optimization of distributed data
                   2.1  Low level API for Grid I/O (ANL)
                          Rob Ross

                   2.2  High-dimensional indexing techniques (LBNL)
                          John Wu

4:45 – 6:00  Area 4: Access to distributed, heterogeneous data
                   4.1  Multi-tier metadata system for querying heterogeneous data sources (LLNL, Georgia Tech)
                          Terence Critchlow / Calton Pu
                   4.2  Knowledge-based federation of heterogeneous databases (SDSC)
                          Amarnath Gupta / Bertram Ludäscher

7:00  Dinner at a selected restaurant (by popular vote)

Wednesday, July 11, 2001

8:30 – 9:00  Continental Breakfast (provided)
9:00 – 9:15  Organizational introduction, Arie Shoshani
9:15 – 12:30  4 separate area discussions (includes flexible 15 min break around 10:30)
                   Area 1 Leader: Bill Gropp
                              Members: Dan Million, Alex Sim
                   Area 2 Leader: Alok Choudhary
                              Members: Rob Ross, John Wu, Wei-Keng Liao

                   Area 3 Leader: Nagiza Samatova
                              Members: Chandrika Kamath, Ekow Otoo, Ghaleb Abdulla

                   Area 4 Leader: Terence Critchlow
                              Members: Amarnath Gupta, Bertram Ludäscher, Frank Olken

                  (Tom Potok and Mladen Vouk will join the groups they expect to work with.
                   Arie Shoshani will join various groups to check on progress.)

12:30 – 1:30  Lunch (provided)
1:00 – 1:30  Oakland Scientific Facility Tour (optional)
1:30 – 3:30  Reports from area leaders + discussions (up to 30 min each)
            Reports should highlight:
                 Task cooperation and integration
                 People involved
                 Application area to be targeted in first year
                 Application contact people and organizations
                 Estimated schedule
3:30 – 4:00  Break
4:00 – 5:30  General discussion and planning
           Topics:
                 Monitoring and reporting progress in each area
                 Coordination between areas
                 Set up advisory committee
                 Intellectual property
                 Web site
                 CVS repositories
                 Future meetings
5:30  Adjourn

Area discussion reports in PPT files

Area 1: Storage and retrieval of very large datasets
Area 2: Access optimization of distributed data
Area 3: Data mining and discovery of access patterns
Area 4: Access to distributed, heterogeneous data