The Storage Resource Manager
Interface Specification

Version 2.2

 

 

24 May 2008

 

 

Collaboration Web:

http://sdm.lbl.gov/srm-wg

Document Location:

http://sdm.lbl.gov/srm-wg/doc/SRM.v2.2.html

 

Editors:

Alex Sim

Lawrence Berkeley National Laboratory

Arie Shoshani

Lawrence Berkeley National Laboratory

 

Authors/Contributors:

Paolo Badino

Olof Barring

Jean-Philippe Baud

Flavia Donno

Maarten Litmaath

European Organization for Nuclear Research (CERN), Switzerland

Timur Perelmutov

Don Petravick

Fermi National Accelerator Laboratory (FNAL), USA

Ezio Corso

Luca Magnoni

International Centre for Theoretical Physics (ICTP), Italy

Istituto Nazionale di Fisica Nucleare (INFN), Italy

Junmin Gu

Lawrence Berkeley National Laboratory (LBNL), USA

Shaun De Witt

Jens Jensen

Rutherford Appleton Laboratory (RAL), England

Michael Haddox-Schatz

Bryan Hess

Andy Kowalski
Chip Watson

Thomas Jefferson National Accelerator Facility (TJNAF), USA



This document is modified from OGF.129 to make the functional interfaces more readable. The OGF.129 document fixes many typos and includes some clarifications to the previous version (2 April 2007).

 

 


Table of Contents

 

Abstract

Introduction

1. Common Type Definitions

1.1. Meaning of terms

1.2. File Storage Type

1.3. File Type

1.4. Retention Policy

1.5. Access Latency

1.6. Permission Mode

1.7. Permission Type

1.8. Request Type

1.9. Overwrite Mode

1.10. File Locality

1.11. Access Pattern

1.12. Connection Type

1.13. Status Codes

1.14. Retention Policy Info

1.15. Request Token

1.16. User Permission

1.17. Group Permission

1.18. Size in Bytes

1.19. UTC Time

1.20. Time in Seconds (Lifetime and RequestTime)

1.21. SURL

1.22. TURL

1.23. Return Status

1.24. Return Status for SURL

1.25. File MetaData

1.26. Space MetaData

1.27. Directory Option

1.28. Extra Info

1.29. Transfer Parameters

1.30. File Request for srmPrepareToGet

1.31. File Request for srmPrepareToPut

1.32. File Request for srmCopy

1.33. Return File Status for srmPrepareToGet

1.34. Return File Status for srmBringOnline

1.35. Return File Status for srmPrepareToPut

1.36. Return File Status for srmCopy

1.37. Request Summary

1.38. Return Status for SURL

1.39. Return File Permissions

1.40. Return Permissions on SURL

1.41. Return Request Tokens

1.42. Supported File Transfer Protocol

2. Space Management Functions

2.1. srmReserveSpace

2.2. srmStatusOfReserveSpaceRequest

2.3. srmReleaseSpace

2.4. srmUpdateSpace

2.5. srmStatusOfUpdateSpaceRequest

2.6. srmGetSpaceMetaData

2.7. srmChangeSpaceForFiles

2.8. srmStatusOfChangeSpaceForFilesRequest

2.9. srmExtendFileLifeTimeInSpace

2.10. srmPurgeFromSpace

2.11. srmGetSpaceTokens

3. Permission Functions

3.1. srmSetPermission

3.2. srmCheckPermission

3.3. srmGetPermission

4. Directory Functions

4.1. srmMkdir

4.2. srmRmdir

4.3. srmRm

4.4. srmLs

4.5. srmStatusOfLsRequest

4.6. srmMv

5. Data Transfer Functions

5.1. srmPrepareToGet

5.2. srmStatusOfGetRequest

5.3. srmBringOnline

5.4. srmStatusOfBringOnlineRequest

5.5. srmPrepareToPut

5.6. srmStatusOfPutRequest

5.7. srmCopy

5.8. srmStatusOfCopyRequest

5.9. srmReleaseFiles

5.10. srmPutDone

5.11. srmAbortRequest

5.12. srmAbortFiles

5.13. srmSuspendRequest

5.14. srmResumeRequest

5.15. srmGetRequestSummary

5.16. srmExtendFileLifeTime

5.17. srmGetRequestTokens

6. Discovery Functions

6.1. srmGetTransferProtocols

6.2. srmPing

7. Storage Resource Managers Concepts

7.1. Summary

7.2. Overview

7.3. The Basic Concepts

7.4. Additional concepts introduced with v2.2

7.5. SRM Implementations

8. Appendix I : Current SRM Implementations Based on v2.2 specification

8.1. BeStMan – Berkeley Storage Manager

8.2. Castor-SRM

8.3. dCache-SRM

8.4. DPM – Disk Pool Manager

8.5. StoRM - Storage Resource Manager

9. Appendix II : WLCG use case

Introduction

9.1. Storage classes

9.2. Removal policies

9.3. Protocol negotiation

9.4. Information discovery

9.5. srmReserveSpace

9.6. srmChangeSpaceForFiles

9.7. srmPurgeFromSpace

9.8. srmRm

9.9. srmLs

9.10. srmPrepareToGet

9.11. srmBringOnline

9.12. srmPrepareToPut

9.13. srmCopy

10. Contributors

11. Acknowledgement

12. Copyright Notice

13. OGF Disclaimer

14. OGF Intellectual Property Statement

15. OGF Copyright Notice

16. References

 

 


Abstract

 

Storage management is one of the most important enabling technologies for large-scale scientific investigations.  Having to deal with multiple heterogeneous storage and file systems is one of the major bottlenecks in managing, replicating, and accessing files in distributed environments.  Storage Resource Managers (SRMs), named after their web services protocol, provide the technology needed to manage the rapidly growing distributed data volumes produced by faster and larger computational facilities.  SRMs are Grid storage services providing interfaces to storage resources, as well as advanced functionality such as dynamic space allocation and file management on shared storage systems.  They call on transport services to bring files into their space transparently and provide effective sharing of files.  SRMs are based on a common specification that emerged over time and evolved into an international collaboration.  This approach of an open specification that various institutions can adapt to their own storage systems has proven to be a remarkable success – the challenge has been to provide a consistent homogeneous interface to the Grid, while allowing sites to have diverse infrastructures.  In particular, one of the main goals of the SRM web service is to support optional features while preserving interoperability.

 

Introduction

 

This document contains the concepts and interface specification of SRM 2.2.  It incorporates the functionality of SRM 2.0 and SRM 2.1, but is much expanded to include additional functionality, especially in the area of dynamic storage space reservation and directory functionality in client-acquired storage spaces.

 

This document reflects the discussions and conclusions of a 2-day meeting in May 2006 at Fermilab, which was followed by a 3-day meeting in September 2006 at CERN.  Since that time several smaller meetings have taken place, along with email correspondence and conference calls.  The purpose of this activity is to agree on the functionality and standardize the interface of Storage Resource Managers (SRMs) – a Grid middleware component.

This document reflects the current status of the specification, which has been frozen in order to allow multiple implementations to proceed.

 

The document is organized in seven sections.  The first describes the main concepts of SRMs as a standard middleware specification for various storage systems.  It is intended to support the same interface to simple file systems, as well as sophisticated storage systems that include multiple disk caches, robotic tape systems, and parallel file systems.  The second, called “Common Type Definitions”, contains all the type definitions used to define the functions (or methods).  The next five sections contain the specification of “Space Management Functions”, “Permission Functions”, “Directory Functions”, “Data Transfer Functions” and “Discovery Functions”.  All the “Discovery Functions” are newly added functions.

 

Appendix I lists several implementations of SRM v2.2 around the world, and their deployment in various sites.

 

As can be expected, when a large collaboration decides to use the SRM specification, it may choose to restrict some of the functionality according to its common project requirements.  For example, a collaboration may choose to restrict space reservations to administrators only, and not permit dynamic reservations by other users.  Similarly, the collaboration may choose to support only permanent storage files, rather than allow the SRM to automatically remove files whose lifetime has expired.

 

An interesting and influential collaboration is described in Appendix II.  The collaboration is in the High Energy Physics domain, and its purpose is to develop the tools to manage the petabytes of data expected from the Large Hadron Collider (LHC).  The collaboration, called the Worldwide LHC Computing Grid (WLCG) project, involves implementing Storage Resource Managers on top of various storage systems based on the SRM v2.2 specification described here.  Appendix II describes the restrictions and behaviors the WLCG project has chosen in order to achieve interoperability of all SRM implementations under a tight time schedule.  It is important to note that the WLCG collaboration also added enhancements in terms of functionality and clarity of the specification, an invaluable contribution based on practical requirements.

 

For people not familiar with SRM concepts, it is advisable to read the first chapter.  For people familiar with previous versions of SRM specifications, it is advisable to read the document SRM.v2.2.changes.doc posted at http://sdm.lbl.gov/srm-wg before reading this specification.  Another recently published SRM-related activity provides a formal conceptual model of the SRM behavior [ISGC2007].

 



1. Common Type Definitions

 

Namespace SRM

 

1.1. Meaning of terms

 

a)      Underlined attributes are REQUIRED.  The required attributes must be parsed correctly and must give proper error messages when not supported.

b)     By “https” we mean http protocol with GSI authentication. It may be represented as “httpg”. At this time, any implementation of http with GSI authentication could be used. It is advisable that the implementation is compatible with Globus Toolkit 3.2 or later versions.

c)      Primitive types used below are consistent with the XML built-in schema types, i.e.

o        long is 64 bit: (+/-) 9223372036854775807

o        int is 32 bit: (+/-) 2147483647

o        short is 16 bit: (+/-) 32767

o        unsignedLong ranges (inclusive): 0 to 18446744073709551615

o        unsignedInt ranges (inclusive): 0 to 4294967295

o        unsignedShort ranges (inclusive): 0 to 65535

d)     The definition of the type “anyURI” used below is compliant with the XML standard. See http://www.w3.org/TR/xmlschema-2/#anyURI.   It is defined as: "The lexical space of anyURI is finite-length character sequences which, when the algorithm defined in Section 5.4 of [XML Linking Language] is applied to them, result in strings which are legal URIs according to [RFC 2396], as amended by [RFC 2732]".

e)      By “localSURL”, we mean a SURL that is local to the SRM that is processing the request.

f)       authorizationID : from the SASL RFC 2222
During the authentication protocol exchange, the mechanism performs authentication, transmits an authorization identity (frequently known as a userid) from the client to server…. The transmitted authorization identity may be different than the identity in the client’s authentication credentials. This permits agents such as proxy servers to authenticate using their own credentials, yet request the access privileges of the identity for which they are proxying. With any mechanism, transmitting an authorization identity of the empty string directs the server to derive an authorization identity from the client’s authentication credentials.

g)      File sharing by the SRM is a local implementation decision.  An SRM can choose to share files by providing multiple users access to the same physical file, or by copying a file into another user’s space.  Either way, if an SRM chooses to share a file (that is, to avoid reading a file over again from the source site) the SRM should check with the source site whether the user has read/write permission. The file can be shared only if permission is granted.

h)     The word “pinning” is limited to the “copies” or “states” of SURLs and the Transfer URLs (TURLs).

i)       For each function, status codes are defined with basic meanings for the function. Only those status codes are valid for the function. Specific cases are not stated for each status code. If other status codes need to be defined for a specific function, send an email to the collaboration to discuss the usage.
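The integer ranges listed in item (c) can be checked mechanically. The following is a minimal sketch in Python; the names `RANGES` and `fits` are illustrative and are not part of the SRM specification. It uses the full XML Schema ranges, in which the lowest value of each signed type is one greater in magnitude than the symmetric figure quoted above.

```python
# Inclusive value ranges of the XML Schema built-in integer types
# referred to in item (c). RANGES and fits are illustrative names only.
RANGES = {
    "long": (-2**63, 2**63 - 1),
    "int": (-2**31, 2**31 - 1),
    "short": (-2**15, 2**15 - 1),
    "unsignedLong": (0, 2**64 - 1),
    "unsignedInt": (0, 2**32 - 1),
    "unsignedShort": (0, 2**16 - 1),
}


def fits(type_name: str, value: int) -> bool:
    """Return True if value lies within the inclusive range of type_name."""
    low, high = RANGES[type_name]
    return low <= value <= high
```

A client could call `fits("unsignedLong", size)` before placing a byte count in a request, for example.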

 

 

1.2. File Storage Type

enum                    TFileStorageType      {VOLATILE, DURABLE, PERMANENT}

 

o        A volatile file has a lifetime, and the storage may delete all traces of the file when it expires.

o        A permanent file has no expiration time.

o        A durable file has an expiration time, but the storage is not permitted to delete the file and should raise an error condition instead.
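The three storage types differ only in what happens when a lifetime runs out. The sketch below restates the rules above as a single decision function. It is an illustration under stated assumptions: the function name `on_lifetime_check`, the action strings, and the use of plain numeric timestamps are hypothetical and are not defined by the specification.

```python
from enum import Enum


class TFileStorageType(Enum):
    VOLATILE = "VOLATILE"
    DURABLE = "DURABLE"
    PERMANENT = "PERMANENT"


def on_lifetime_check(storage_type, expires_at, now):
    """Return the action a storage system might take for one file.

    expires_at and now are seconds on any common clock; expires_at is
    ignored for PERMANENT files. The action strings are hypothetical.
    """
    if storage_type is TFileStorageType.PERMANENT:
        return "keep"          # a permanent file has no expiration time
    if now < expires_at:
        return "keep"          # the lifetime has not run out yet
    if storage_type is TFileStorageType.VOLATILE:
        return "may_delete"    # the storage may remove all traces
    return "raise_error"       # DURABLE: keep the file, signal an error
```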

 

1.3. File Type

enum                    TFileType                        {FILE, DIRECTORY, LINK}

 

1.4. Retention Policy

enum                    TRetentionPolicy       { REPLICA , OUTPUT ,  CUSTODIAL }

 

o        Quality of Retention (Storage Class) is a kind of Quality of Service. It refers to the probability that the storage system loses a file. Numeric probabilities are self-assigned.

·         Replica quality has the highest probability of loss, but is appropriate for data that can be replaced because other copies can be accessed in a timely fashion.

·         Output quality is an intermediate level and refers to data which can be replaced only by lengthy or laborious processes.

·         Custodial quality provides low probability of loss.

o        The type is used to describe the retention policy assigned to the files in the storage system at the moment when the files are written into the desired destination in the storage system. It is used as a property of space allocated through the space reservation function. Once the retention policy is assigned to a space, the files put in the reserved space will automatically be assigned the retention policy of the space. The assigned retention policy on the file can be found through the TMetaDataPathDetail structure returned by the srmLs function.

 

1.5. Access Latency

enum                    TAccessLatency   { ONLINE,  NEARLINE }

 

o        These terms describe how the latency of access to a file can be improved. A storage system improves latency by replicating a file so that its access latency becomes online.

·         The ONLINE cache of a storage system is the part of the storage system which provides files with online latencies.

·         ONLINE has the lowest latency possible. No further latency improvements are applied to online files.

·         NEARLINE files can have their latency improved to online latency automatically by staging the file to the online cache.

·         For completeness, we also describe OFFLINE here.

·         OFFLINE files need a human to be involved to achieve online latency.

o        The type is used to describe a space property: an access latency can be requested at the time of space reservation. The files contained in the space may have the same or a “lesser” access latency than the space.

o        For the SRM, ONLINE and NEARLINE are specified, and files may be ONLINE and/or NEARLINE.

 

1.6. Permission Mode

enum                    TPermissionMode     {NONE, X, W, WX, R, RX, RW, RWX}
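The eight permission modes are listed in the same order as the familiar three-bit read/write/execute combinations. The sketch below makes that reading explicit. The numeric values are an assumption made for illustration only, since the specification defines the names and not any numbers; `allows` and `combine` are hypothetical helpers.

```python
from enum import IntEnum


class TPermissionMode(IntEnum):
    # Assumed encoding: X = 1, W = 2, R = 4, as in POSIX mode bits.
    NONE = 0
    X = 1
    W = 2
    WX = 3
    R = 4
    RX = 5
    RW = 6
    RWX = 7


def allows(granted, needed):
    """Return True if every right in needed is also present in granted."""
    return (granted & needed) == needed


def combine(a, b):
    """Return the mode that carries the rights of both a and b."""
    return TPermissionMode(a | b)
```

For example, `combine(TPermissionMode.R, TPermissionMode.W)` yields `TPermissionMode.RW` under this assumed encoding.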

 

1.7. Permission Type

enum                    TPermissionType       {ADD, REMOVE, CHANGE}

 

1.8. Request Type

enum                    TRequestType              {
                                PREPARE_TO_GET,
                                PREPARE_TO_PUT,
                                COPY,
                                BRING_ONLINE,
                                RESERVE_SPACE,
                                UPDATE_SPACE,
                                CHANGE_SPACE_FOR_FILES,
                                LS
                        }

 

1.9. Overwrite Mode

enum                    TOverwriteMode        {
                                NEVER,
                                ALWAYS,
                                WHEN_FILES_ARE_DIFFERENT
                        }

 

o        A use case for WHEN_FILES_ARE_DIFFERENT is that files are considered different when the declared size for an SURL differs from the actual size, or when the declared checksum of an SURL differs from the actual checksum.

o        Overwrite mode on a file is considered higher priority than pinning a file. Where applicable, it allows a valid Transfer URL to be marked invalid when the owner of the SURL issues an overwrite request.
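The overwrite decision described above can be sketched as a small function. This is an illustration, not the behavior of any particular SRM: the function name `may_write`, its parameters, and the rule that a write to a non-existent target always proceeds are assumptions made for this example.

```python
from enum import Enum


class TOverwriteMode(Enum):
    NEVER = "NEVER"
    ALWAYS = "ALWAYS"
    WHEN_FILES_ARE_DIFFERENT = "WHEN_FILES_ARE_DIFFERENT"


def may_write(mode, target_exists,
              declared_size=None, actual_size=None,
              declared_checksum=None, actual_checksum=None):
    """Decide whether a write to an SURL may go ahead (hypothetical helper)."""
    if not target_exists:
        return True   # assumption: nothing exists, so nothing is overwritten
    if mode is TOverwriteMode.ALWAYS:
        return True
    if mode is TOverwriteMode.NEVER:
        return False
    # WHEN_FILES_ARE_DIFFERENT: compare whichever attributes are known.
    size_differs = (declared_size is not None and actual_size is not None
                    and declared_size != actual_size)
    checksum_differs = (declared_checksum is not None
                        and actual_checksum is not None
                        and declared_checksum != actual_checksum)
    return size_differs or checksum_differs
```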

 

1.10. File Locality

enum                    TFileLocality         {
                                ONLINE,
                                NEARLINE,
                                ONLINE_AND_NEARLINE,
                                LOST,
                                NONE,
                                UNAVAILABLE
                        }

 

o        Files may be located online, nearline or both. The locality indicates whether the file is online, whether it has reached nearline storage, and whether both online and nearline copies of the file exist.

·         ONLINE indicates that a copy of the file is in the online cache of the storage system and may be accessed with online latencies.

·         NEARLINE indicates that the file is located on a nearline storage system and may be accessed with nearline latencies.

·         ONLINE_AND_NEARLINE indicates that the file is located in the online cache of a storage system as well as on a nearline storage system.

·         LOST indicates that the file is lost because of a permanent hardware failure.

·         The NONE value shall be used if the file is empty (zero size).

·         UNAVAILABLE indicates that the file is unavailable because of a temporary hardware failure.

o        The type is used to describe a file property that indicates the file's current location or status in the storage system.
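As an illustration of how a server might derive the locality value from what it knows about a file, consider the sketch below. The function `locality`, its flag parameters, and in particular the order in which the conditions are tested are assumptions made for this example; the specification does not prescribe a precedence among the values.

```python
from enum import Enum


class TFileLocality(Enum):
    ONLINE = "ONLINE"
    NEARLINE = "NEARLINE"
    ONLINE_AND_NEARLINE = "ONLINE_AND_NEARLINE"
    LOST = "LOST"
    NONE = "NONE"
    UNAVAILABLE = "UNAVAILABLE"


def locality(size, online_copy, nearline_copy,
             permanent_failure=False, temporary_failure=False):
    """Map a file's state to a TFileLocality value (hypothetical helper)."""
    if permanent_failure:
        return TFileLocality.LOST          # permanent hardware failure
    if temporary_failure:
        return TFileLocality.UNAVAILABLE   # temporary hardware failure
    if size == 0:
        return TFileLocality.NONE          # empty file
    if online_copy and nearline_copy:
        return TFileLocality.ONLINE_AND_NEARLINE
    if online_copy:
        return TFileLocality.ONLINE
    if nearline_copy:
        return TFileLocality.NEARLINE
    raise ValueError("a non-empty file must have at least one copy")
```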

 

1.11. Access Pattern

enum                    TAccessPattern   { TRANSFER_MODE,  PROCESSING_MODE }

 

o        TAccessPattern may be passed as an input parameter to the srmPrepareToGet and srmBringOnline functions. It provides a hint from the client to the SRM about how the Transfer URL (TURL) produced by the SRM is going to be used. If the parameter value is PROCESSING_MODE, the system may expect that the client application will perform some processing of the partially read data, followed by more partial reads and frequent use of the protocol-specific “seek” operation. This allows optimizations by allocating files on disks with small buffer sizes. If the value is TRANSFER_MODE, the file will be read at the highest speed allowed by the connection between the server and the client.

 

1.12. Connection Type

enum                    TConnectionType   { WAN,  LAN }

 

o        TConnectionType indicates whether the client is connected through a local or wide area network. The SRM may optimize the access parameters to achieve maximum throughput for the connection type. This input parameter may be passed to the srmPrepareToGet, srmPrepareToPut and srmBringOnline functions.

 

 

1.13. Status Codes

enum                    TStatusCode           {
                                SRM_SUCCESS,
                                SRM_FAILURE,
                                SRM_AUTHENTICATION_FAILURE,
                                SRM_AUTHORIZATION_FAILURE,
                                SRM_INVALID_REQUEST,
                                SRM_INVALID_PATH,
                                SRM_FILE_LIFETIME_EXPIRED,
                                SRM_SPACE_LIFETIME_EXPIRED,
                                SRM_EXCEED_ALLOCATION,
                                SRM_NO_USER_SPACE,
                                SRM_NO_FREE_SPACE,
                                SRM_DUPLICATION_ERROR,
                                SRM_NON_EMPTY_DIRECTORY,
                                SRM_TOO_MANY_RESULTS,
                                SRM_INTERNAL_ERROR,
                                SRM_FATAL_INTERNAL_ERROR,
                                SRM_NOT_SUPPORTED,
                                SRM_REQUEST_QUEUED,
                                SRM_REQUEST_INPROGRESS,
                                SRM_REQUEST_SUSPENDED,
                                SRM_ABORTED,
                                SRM_RELEASED,
                                SRM_FILE_PINNED,
                                SRM_FILE_IN_CACHE,
                                SRM_SPACE_AVAILABLE,
                                SRM_LOWER_SPACE_GRANTED,
                                SRM_DONE,
                                SRM_PARTIAL_SUCCESS,
                                SRM_REQUEST_TIMED_OUT,
                                SRM_LAST_COPY,
                                SRM_FILE_BUSY,
                                SRM_FILE_LOST,
                                SRM_FILE_UNAVAILABLE,
                                SRM_CUSTOM_STATUS
                        }

 

o        SRM_NOT_SUPPORTED must be used, in general

·         If a server does not support a method

·         If a server does not support particular optional input parameters
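The two cases for SRM_NOT_SUPPORTED can be sketched as a check performed before a request is processed. This is a minimal illustration: the function `check_support` and the shape of the `supported` table are hypothetical, and the parameter names used in the usage note are placeholders and not a statement of which parameters are optional.

```python
def check_support(method, optional_params, supported):
    """Return "SRM_NOT_SUPPORTED" if the request cannot be served, else None.

    supported maps each method name the server implements to the set of
    optional input parameter names it accepts (a hypothetical structure).
    """
    if method not in supported:
        return "SRM_NOT_SUPPORTED"   # the server does not support the method
    if not set(optional_params) <= supported[method]:
        return "SRM_NOT_SUPPORTED"   # an optional parameter is not supported
    return None                      # supported: normal processing continues
```

With `supported = {"srmPing": {"authorizationID"}}`, a call for `srmCopy`, or a call for `srmPing` carrying some other optional parameter, would both yield `"SRM_NOT_SUPPORTED"`.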

 

 

1.14. Retention Policy Info

typedef                 struct {
                                TRetentionPolicy            retentionPolicy,
                                TAccessLatency              accessLatency
                        } TRetentionPolicyInfo

 

o        TRetentionPolicyInfo is a combined structure to indicate how the file needs to be stored.

o        When both retention policy and access latency are provided, their combination needs to match what the SRM supports. Otherwise the request must be rejected.
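The combination rule can be illustrated with a short validation sketch. The set `SUPPORTED` below is a made-up site configuration, `is_acceptable` is a hypothetical helper, and the handling of a request that omits the access latency is an assumption, since the text above only covers the case where both values are provided.

```python
from enum import Enum
from typing import NamedTuple, Optional


class TRetentionPolicy(Enum):
    REPLICA = "REPLICA"
    OUTPUT = "OUTPUT"
    CUSTODIAL = "CUSTODIAL"


class TAccessLatency(Enum):
    ONLINE = "ONLINE"
    NEARLINE = "NEARLINE"


class TRetentionPolicyInfo(NamedTuple):
    retentionPolicy: TRetentionPolicy
    accessLatency: Optional[TAccessLatency] = None


# Hypothetical combinations offered by one site.
SUPPORTED = {
    (TRetentionPolicy.REPLICA, TAccessLatency.ONLINE),
    (TRetentionPolicy.CUSTODIAL, TAccessLatency.NEARLINE),
    (TRetentionPolicy.CUSTODIAL, TAccessLatency.ONLINE),
}


def is_acceptable(info, supported=SUPPORTED):
    """Return False when the request must be rejected."""
    if info.accessLatency is None:
        # Assumption: with no latency given, any supported combination
        # that uses the requested retention policy is good enough.
        return any(policy is info.retentionPolicy for policy, _ in supported)
    return (info.retentionPolicy, info.accessLatency) in supported
```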

 

1.15. Request Token

 

o        The Request Token assigned by the SRM is unique and immutable (non-reusable).  For example, including the date and time in the request token is one way to make it non-reusable.

o        Request tokens are case-sensitive.

o        A request token is valid until the request is completed. However, an SRM may choose to keep request tokens for a short period of time after the request is completed; the length of that period depends on the SRM.
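One way to obtain tokens with these properties is to combine a timestamp with a random component. The sketch below is only an illustration; the token layout and the function name `new_request_token` are assumptions, and an SRM is free to construct its tokens differently.

```python
import uuid
from datetime import datetime, timezone


def new_request_token(now=None):
    """Build a token from the UTC date and time plus a random UUID.

    The timestamp keeps tokens from different moments distinct, and the
    random part separates tokens issued within the same second.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return f"{now.strftime('%Y%m%dT%H%M%S')}-{uuid.uuid4().hex}"
```

Because request tokens are case-sensitive, a server using such tokens would compare them as exact strings, without case folding.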

 

1.16. User Permission

typedef                struct { string                                    userID,

                                                TPermissionMode          mode

} TUs