Challenges (1)
Managing storage resources in a large, unreliable, heterogeneous distributed system
Long-lasting, data-intensive transactions
- Can’t afford to restart jobs
- Can’t afford to lose data, especially from experiments
Types of failures
- Storage system failures
  - Mass Storage System (MSS)
  - Disk system
- Server failures
- Network failures