DOE-funded data-intensive applications have created a need for novel data transfer technologies and automated tools that can effectively utilize the available raw network bandwidth and intelligently assist scientists in replicating huge volumes of data to any desired location in a timely manner. We propose to design and develop an integrated end-to-end resource provisioning and management system for a high-performance data transfer framework that leverages heterogeneous network protocols and storage types in a federated computing environment.
Our proposed data transfer framework will provide predictable yet efficient delivery of data at terabit-per-second throughput for data-intensive applications. The framework will be based on a layered architecture comprising a data plane, a control plane, and a management plane. These layers incorporate functional modules for resource co-scheduling, storage resource management, network resource provisioning, data transfer, security, and monitoring and problem diagnosis, each of which can be flexibly added and customized in a “plug-and-play” fashion only when needed. We plan to leverage the existing Storage Resource Manager (SRM) to manage heterogeneous storage types, including disk, tape, and peer-to-peer data storage, and to enable it to interact with the existing DOE TeraPaths and OSCARS projects so that it can utilize the high-performance dynamic circuit capability provided by DOE’s Science Data Network (SDN) and terabit LAN, with the goal of supporting predictable end-to-end data transfer with quality of service.
The research challenge is to develop analytical tools and efficient approaches for the joint allocation and co-scheduling of well-balanced network and storage resources involved in data transfer. The proposed product and research outcomes will play a transformative role in bridging advanced end-to-end storage and network technologies with science applications in a transparent way.