
Instructions to download and run DataMover-Lite (DML) v2.2.0

Instructions

Go to the Download page and click on “download DML” to save the file to any directory or to the desktop. The download is a Gzip’ed file.

Unzip the file and extract its contents into any directory or onto the desktop. This creates a directory named “DML-gt4”.
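On a machine without a graphical unzip tool, the extraction step can be scripted. The following is a minimal sketch, assuming the download is a gzip-compressed tar archive; the function name and any archive file name you pass in are hypothetical, since the instructions do not state the actual file name.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive and list what landed in dest_dir.

    Assumption: the DML download is a .tar.gz style archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Newer Python versions support a safer "data" extraction filter;
    # use it when available and fall back to the default otherwise.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    with tarfile.open(archive_path, mode="r:gz") as archive:
        archive.extractall(dest, **kwargs)

    # Return the top-level names so the caller can confirm that
    # the expected directory (e.g. "DML-gt4") was created.
    return sorted(p.name for p in dest.iterdir())
```

After a successful run, the returned list should contain “DML-gt4” if the archive is laid out as described above.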

If you have a Grid Certificate, wish to use GridFTP, and are running on a Windows machine, follow the instructions in Appendix A below; otherwise skip this step.

Locate the java directory (usually under Program Files on Windows). Check that jre1.5.0 or above exists; if not, you need to download it from java.sun.com.
[To download jre1.5.0 on Windows: click on SE under popular downloads; click on “previous release” in the top menu bar; click on “J2SE 5.0 Downloads”; find “Java Runtime Environment (JRE) 5.0 Update 11”; click on “download”; click on “accept”; click on “Windows Offline Installation, Multi-language”; the installer usually downloads to the desktop; click on it to execute; it will install Java under C:\Program Files\Java.]
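The version check can also be done from a command prompt with `java -version`. The sketch below shows one way to interpret that output; the function names are hypothetical, and it assumes the banner contains a quoted version string such as `"1.5.0_11"` (older releases) or `"17.0.2"` (newer releases).

```python
import re
import subprocess


def parse_java_version(text):
    """Return (major, minor) from `java -version` output, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def is_java_5_or_newer(text):
    """True if the banner reports version 1.5 or anything later."""
    version = parse_java_version(text)
    # Tuple comparison: (1, 5) and (17, 0) pass, (1, 4) does not.
    return version is not None and version >= (1, 5)


def installed_java_banner(java_cmd="java"):
    """Run `java -version`; the banner is written to stderr, not stdout.

    Raises FileNotFoundError if java_cmd is not on the PATH.
    """
    result = subprocess.run([java_cmd, "-version"],
                            capture_output=True, text=True)
    return result.stderr
```

For example, `is_java_5_or_newer(installed_java_banner())` reports whether the Java found on the PATH is recent enough for DML.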

For Windows: in the DML-gt4 directory, edit the file gui.bat. Open it with Notepad and add C:\Program Files\Java\jre1.5.0_11\bin\ in front of the first word, java…, then put double quotes (") before and after C:\Program Files\Java\jre1.5.0_11\bin\java. The quotes are needed because the path contains a space. (For Unix, skip this step.)

For Windows: double click gui.bat; a DML window will pop up.
For Unix: double click gui.sh; a DML window will pop up.

If you use GridFTP, set up the Grid certificate and proxy by following the steps in Appendix B below. This needs to be done on both Windows and Unix.

On the main DML screen, click on “browse” to choose a target directory for the files you will download. It can be any directory.

You need an input-xml file containing a list of files in the format explained in Appendix C below. An example that uses three different transfer protocols is the file mixedrequest.xml under DML-gt4/samples.

Now, on the main DML screen, click on file -> import and choose your input-xml file. Select the file and click “open”. The DML screen will show the information about the file. (To try a single transfer and verify that the installation is OK, select the file “sample.http.xml” from the DML-gt4/samples directory.)

On the main DML screen, click on “transfer”. The screen will show the progress of the file downloads into your target directory.

Advanced parameters can be set to run file transfers concurrently, which can raise the total transfer rate, and to tune the GridFTP window buffer size and number of parallel streams. See Appendix E.

Appendices

*A.* Grid certificate setup in DML for Windows

*B.* Configuration setup of Grid Certificate in DML

*C.* The format of the input-xml file
The input-xml file is formatted to contain a list of files and their sizes in an XML format. For example:


      http://www.lbl.gov/index.html 
      24576 
  

Note that the file name is a URL with the protocol being “http” in this case.
Note also that the size is in bytes.
Multiple file lists have to be wrapped in labels as shown in mixedrequest.xml for example.
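Before writing entries into an input-xml file, it can help to check that each file name is a full URL with a protocol and that each size is a whole number of bytes. The following is a minimal sketch of such a check; `check_entry` is a hypothetical helper and not part of DML, and it deliberately does not produce the XML itself.

```python
from urllib.parse import urlparse


def check_entry(url, size):
    """Return (protocol, size_in_bytes) for one file entry, or raise ValueError.

    Hypothetical helper: it only checks the two properties described
    above (the name is a URL with a protocol, the size is in bytes).
    """
    parsed = urlparse(url.strip())
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not a full URL with a protocol: {url!r}")

    # int() tolerates surrounding whitespace and rejects non-numeric text.
    size_bytes = int(size)
    if size_bytes < 0:
        raise ValueError(f"size must not be negative: {size!r}")

    return parsed.scheme, size_bytes
```

For the example entry above, `check_entry("http://www.lbl.gov/index.html", "24576")` reports the protocol `http` and a size of 24576 bytes.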

*D.* Setting up ESG credentials

This applies only to Earth System Grid (ESG) users. You need to set up the LAHFS properties location. Click on “tools -> config”, select “lahfs-properties-location”, click on the entry under “Browse a file”, and go to the install directory “DML-gt4”. Click on the file “lahfs.properties”, and then click “open”. The path will now appear in the LAHFS entry under “Browse a file”. Click on the “save” button at the bottom left.

*E.* Speeding up the file transfers

DML can transfer multiple files concurrently in order to achieve a better overall transfer rate. For example, if the transfer protocol limits the rate of a single file to 1 MB/s, 5 concurrent file transfers should provide 5 MB/s, if the network connection permits. However, some machines (such as regular laptops) cannot support many concurrent transfers, since each one requires its own buffer space.
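The arithmetic in this example can be written down as a simple model. This is only a sketch under the assumption that every transfer runs at the same per-file rate and that the network link is the only other limit; the function and parameter names are illustrative.

```python
def aggregate_rate(per_file_rate_mb_s, concurrency, link_capacity_mb_s):
    """Estimate the total transfer rate in MB/s.

    Simplified model: each concurrent transfer contributes the same
    per-file rate until the link capacity is reached. Memory limits on
    the receiving machine are not modeled.
    """
    return min(per_file_rate_mb_s * concurrency, link_capacity_mb_s)
```

With a per-file rate of 1 MB/s and 5 concurrent transfers, a 100 MB/s link gives an estimate of 5 MB/s, while a 3 MB/s link caps the estimate at 3 MB/s.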

The default “concurrency” is set to 1. To set a higher concurrency, click on “options -> concurrent transfers” and change to the desired level. Start with a small number to see whether your machine can handle it, and increase until no further benefit is achieved or the operation slows down.

If you are using GridFTP, two parameters can be set. One is “parallel streams”, which instructs GridFTP to send multiple streams for the same file transfer. This is useful if the files transferred are very large, on the order of many GBs. It is similar to concurrent transfers but applies to a single file transfer. Again, adding too many streams seems to give diminishing returns. It is generally advisable to go no higher than 6-7 parallel streams.

The default “parallelism” level is set to 1. To set a higher parallelism, click on “options -> parallelism” and change to the desired level. Start with a small number (2-3) to see whether your machine can handle it, and increase until no further benefit is achieved or the operation slows down.
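The "increase until no benefit is achieved" procedure applies to both the concurrency and the parallelism setting, and it can be sketched as a small selection rule. The code below assumes you have measured the transfer rate yourself at several levels; the function name and the 5% minimum-gain threshold are illustrative choices, not DML settings.

```python
def pick_level(measured_rates, min_gain=0.05):
    """Choose a level from {level: measured MB/s}.

    Walk the levels in increasing order and stop at the first one that
    does not improve on the current best by at least min_gain (a
    fraction, so 0.05 means 5%).
    """
    if not measured_rates:
        raise ValueError("need at least one measurement")

    levels = sorted(measured_rates)
    best = levels[0]
    for level in levels[1:]:
        if measured_rates[level] < measured_rates[best] * (1 + min_gain):
            break  # no meaningful benefit, or a slowdown
        best = level
    return best
```

For example, with measurements of 1.0, 1.9, 2.7, 2.75 and 2.5 MB/s at levels 1 through 5, the rule settles on level 3, because level 4 adds less than 5%.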

The second parameter that can be set for GridFTP is the “window buffer size”. This tells the GridFTP transfer software to move data in chunks of a certain size. Larger “window sizes” are better when large files are transferred.

The default “window buffer size” is set to about 1 MB. To set a higher or lower buffer size, click on “options -> buffer size” and change to the desired value. 1 MB is the commonly recommended value, but if the receiving end can handle larger buffers, increasing to 2 MB or more can speed up transfers.
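A common rule of thumb for choosing a TCP buffer size is the bandwidth-delay product: the link bandwidth multiplied by the round-trip time. The sketch below computes it, assuming that DML's “window buffer size” corresponds to a TCP buffer of this kind; that correspondence is an assumption, and the function name is illustrative.

```python
def bandwidth_delay_product_bytes(bandwidth_mbit_s, rtt_ms):
    """Return the bandwidth-delay product in bytes.

    bandwidth_mbit_s: link bandwidth in megabits per second
    rtt_ms:           round-trip time to the remote host in milliseconds
    """
    bytes_per_second = bandwidth_mbit_s * 1_000_000 / 8
    return round(bytes_per_second * rtt_ms / 1000)
```

For a 100 Mbit/s path with an 80 ms round-trip time this gives 1,000,000 bytes, which is close to the 1 MB default; faster or longer paths give larger values, consistent with the advice to try 2 MB or more where the receiving end allows it.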