This document describes our experience installing an ESG data node at NERSC.
Please note that there is an installation script provided at the distribution site in order to automate the process (visit http://rainbow.llnl.gov/dist). The esg-node script prepares environment variables, downloads required software, installs and configures all necessary components, and also checks for updates automatically. However, it requires root privileges, and some of the installation directories and path information are embedded inside the script.
To complete the installation without root privileges, we edited the installation scripts (see esg-node-nerc and esg-globus-nersc). Please take a look at them; you will need to change the installation directory, log files, user passwords, script directory, and so on. You might also need to start and stop some services, such as PostgreSQL, manually. Here is the diff output for those modified scripts:
The following is a step-by-step guide to install ESG Data node components based on the information given in the installation script (version 0.2.9).
Before starting the installation, you need the components listed below. You also need an account on an ESG gateway portal (from your associated gateway, e.g. http://pcmdi3.llnl.gov/esgcet) with the publishing role. You will need this account to authenticate with the MyProxy client.
Note: Make sure your account has been added as a data publisher!
We assume that you already have the necessary components installed (PostgreSQL, Apache Ant, Java, git, and curl) and that PATH and LD_LIBRARY_PATH are set properly. If not, see NecessaryComponents first.
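Before going further, it can help to confirm that the assumed tools are actually reachable on PATH. The following is a minimal sketch; the function name missing_tools is our own (it is not part of the ESG scripts), and the tool list in the usage note is simply the set of commands this guide calls later.

```shell
# Print the name of every listed command that cannot be found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, running "missing_tools psql initdb pg_ctl ant java git curl" should print nothing if the environment is ready.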
Choose an installation directory and store it in an environment variable, INSTALL_HOME, which the following steps refer to.
Create data and log directories for PostgreSQL, and don't forget to set proper ownership and permissions for those directories.
export PGDATA=$INSTALL_HOME/pgsql/data
mkdir -p $PGDATA
mkdir -p $INSTALL_HOME/pgsql/log
chmod 700 $PGDATA
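PostgreSQL refuses to start when the data directory is accessible to group or others, so it is worth checking the mode before running initdb. A small sketch follows; is_private_dir is a hypothetical helper name, and the check relies on the mode string printed by "ls -ld".

```shell
# Succeed only if $1 is a directory whose mode is exactly rwx------ (700).
is_private_dir() {
    [ -d "$1" ] || return 1
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "drwx------" ]
}
```

Typical use would be: is_private_dir "$PGDATA" || chmod 700 "$PGDATA"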
Initialize the database by running initdb:
initdb -D $PGDATA
By default, "trust" authentication is enabled for local connections, which means that any local user can connect to the database. To change this, edit "pg_hba.conf" or pass the -A option to initdb. We recommend "md5", which requires a password and sends it in hashed form.
Note: I recommend switching to "md5" only after you have run esgsetup --db (after the CDAT installation); we had a problem with esgsetup connecting to the database when "md5" was already enabled.
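The switch from "trust" to "md5" can be scripted once esgsetup --db has completed. This is only a sketch under the assumption that the authentication method is the last word on each active line of pg_hba.conf; switch_trust_to_md5 is a hypothetical helper, and it keeps a .bak copy of the original file.

```shell
# Replace a trailing "trust" with "md5" on every non-comment line of the
# given pg_hba.conf, keeping the previous version as <file>.bak.
switch_trust_to_md5() {
    hba_file=$1
    cp "$hba_file" "$hba_file.bak" || return 1
    sed '/^[[:space:]]*#/!s/trust[[:space:]]*$/md5/' "$hba_file.bak" > "$hba_file"
}
```

After running switch_trust_to_md5 "$PGDATA/pg_hba.conf", reload the server configuration with "pg_ctl -D $PGDATA reload".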
Start the database:
pg_ctl -D $PGDATA start
Set environment variables:
export PGUSER=dbsuper
export PGPORT=5432
export PGHOST=localhost
Then create a database user ("dbsuper") and set a password; this will be needed when setting up ESGCET later.
createuser -P -s -e dbsuper
Edit "$PGDATA/postgresql.conf" to change the port number (default is 5432) and other parameters such as logging options (log directory and log filename).
Verify your installation by running:
psql -U dbsuper postgres
Set an installation directory for CDAT (Climate Data Analysis Tool):
export CDAT_HOME=$INSTALL_HOME/cdat
We will be using version eb8b668. Download the package and compile it:
git clone http://esg-repo.llnl.gov/git/cdat.git
cd cdat
git checkout eb8b668
If you already have Python installed, specify its path. Note that Python must have Tk/Tcl support (install Tkinter).
./configure --prefix=$CDAT_HOME --with-python=/usr/bin/python --enable-esg
make
Alternatively, if you don't give the Python path, the CDAT installer will download and install Python itself.
./configure --prefix=$CDAT_HOME --enable-esg
make
Note: On Ubuntu, first install the Tkinter packages and then specify the Python path when running "configure". The Python installed by CDAT (the default) did not work for us; it reported missing Tk/Tcl support.
Also update path information (make sure cdat/bin is before /usr/bin in your path):
export PATH=$CDAT_HOME/bin:$CDAT_HOME/Externals/bin:$PATH
export LD_LIBRARY_PATH=$CDAT_HOME/lib:$CDAT_HOME/Externals/lib:$LD_LIBRARY_PATH
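To confirm that cdat/bin really comes before /usr/bin, the order of the PATH entries can be checked directly. A minimal sketch; path_comes_before is a hypothetical helper, and it assumes a POSIX-style shell such as sh or bash.

```shell
# Succeed if directory $1 appears before directory $2 in the colon-separated
# list $3 (defaults to the current PATH). Fails if $1 is not in the list.
path_comes_before() {
    first=$1
    second=$2
    list=${3:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $list; do
        if [ "$dir" = "$first" ]; then
            IFS=$old_ifs
            return 0
        fi
        if [ "$dir" = "$second" ]; then
            IFS=$old_ifs
            return 1
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example: path_comes_before "$CDAT_HOME/bin" /usr/bin || echo "cdat/bin is not ahead of /usr/bin"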
Download the ESGCET package (it provides the scripts and packages required for publishing):
wget http://rainbow.llnl.gov/dist/externals/esgcet-2.4-py2.6.egg
chmod 755 esgcet-2.4-py2.6.egg
easy_install esgcet-2.4-py2.6.egg
Complete the setup by giving an organization ID (rootid in the following):
bin/esgsetup --config --rootid nersc
This creates $HOME/.esgcet/esg.ini and saves the initial configuration in it.
Before proceeding further, please make sure that PostgreSQL is up and running. Run esgsetup to create the database entries. It will ask for the database admin user (dbsuper) and will create the esgcet database with owner esgcet (you also need to set a password for the esgcet database user).
$CDAT_HOME/bin/esgsetup --db
Update the environment by adding the organization ID as follows (advised):
export ESG_ROOT_ID=nersc
Note that you might need to edit ~/.esgcet/esg.ini to set the password for esgcet database user.
There is already a "test" project defined in esg.ini.
Download a sample data file, scan this sample dataset, and publish it (ESG_ROOT_ID is nersc). Note that the sample file must be inside the Thredds root catalog directory. Also, you need to specify the full path of the directory when running esgscan_directory.
mkdir -p $INSTALL_HOME/data/testdir
cd $INSTALL_HOME/data/testdir
wget http://rainbow.llnl.gov/dist/externals/sftlf.nc
cd ..
esgscan_directory --dataset pcdmi.nersc.test --project test $INSTALL_HOME/data/testdir > scan.out
esgpublish --map scan.out --project test
Download and install Tomcat:
wget http://download.filehat.com/apache/tomcat/tomcat-6/v6.0.26/bin/apache-tomcat-6.0.26.tar.gz
tar xvf apache-tomcat-6.0.26.tar.gz -C $INSTALL_HOME
cd $INSTALL_HOME
ln -s apache-tomcat-6.0.26 tomcat
Set the TOMCAT_HOME environment variable and build jsvc:
export TOMCAT_HOME=$INSTALL_HOME/tomcat
cd $TOMCAT_HOME/bin
tar xvzf jsvc.tar.gz
cd jsvc-src
autoconf
chmod 755 configure
./configure --with-java=$JAVA_HOME
make
cp jsvc $TOMCAT_HOME/bin
The next step is to configure Tomcat by editing $TOMCAT_HOME/conf/server.xml.
Make sure server.xml has appropriate permissions (chmod 600 server.xml). By default, ports 8080 and 8443 will be used. If you want to use 80 and 443 instead, edit the Connector port numbers in server.xml; note that binding to ports below 1024 normally requires root privileges.
You may want to look at Tomcat documentation for Servlet/JSP and SSL configuration.
Now set up the keystore:
$JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA -keystore $TOMCAT_HOME/conf/keystore-tomcat -validity 365
It will ask for a keystore password and a key password for tomcat (default is "changeit").
Go to the conf directory and download the truststore file:
cd $TOMCAT_HOME/conf
wget http://rainbow.llnl.gov/dist/externals/jssecacerts
Open server.xml and edit the paths for the keystore and truststore. You can search for keystore_file and truststore_file in this preconfigured sample server.xml file.
It is beneficial to create a "tomcat" user and start Tomcat with this user's privileges. In that case, change the ownership of the Tomcat directory (chown -R tomcat $TOMCAT_HOME).
It is useful to set the CATALINA_HOME environment variable. You can start and stop Tomcat using the catalina.sh script.
export CATALINA_HOME=$TOMCAT_HOME
$CATALINA_HOME/bin/catalina.sh stop
$CATALINA_HOME/bin/catalina.sh start
To use jsvc, start Tomcat (make sure JAVA_HOME is set) by running the following command (preparing a startup script will be helpful):
cd $TOMCAT_HOME
./bin/jsvc -Djava.endorsed.dirs=./endorsed -pidfile /tmp/tomcat-jsvc.pid \
    -cp $TOMCAT_HOME/bin/bootstrap.jar:$TOMCAT_HOME/bin/tomcat-juli.jar:$TOMCAT_HOME/bin/commons-daemon.jar \
    -outfile ./logs/catalina.out -errfile ./logs/catalina.err -Xmx2048m -Xms1024m \
    -Dsun.security.ssl.allowUnsafeRenegotiation=true org.apache.catalina.startup.Bootstrap
Stop Tomcat by running the following (jsvc):
cd $TOMCAT_HOME
./bin/jsvc -pidfile /tmp/tomcat-jsvc.pid -stop org.apache.catalina.startup.Bootstrap
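Since a startup script is helpful here, the two jsvc invocations above can be wrapped in a small start/stop script. This is a sketch rather than the script we actually used: the function names and the PIDFILE variable are our own, the jsvc arguments are copied from the commands above, and TOMCAT_HOME and JAVA_HOME are assumed to be set in the environment.

```shell
#!/bin/sh
# Sketch of a start/stop wrapper around the jsvc commands shown above.
PIDFILE=${PIDFILE:-/tmp/tomcat-jsvc.pid}
BOOTSTRAP=org.apache.catalina.startup.Bootstrap

# Succeed if the pid file exists and names a live process.
tomcat_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

start_tomcat() {
    cd "$TOMCAT_HOME" || return 1
    ./bin/jsvc -Djava.endorsed.dirs=./endorsed -pidfile "$PIDFILE" \
        -cp "$TOMCAT_HOME/bin/bootstrap.jar:$TOMCAT_HOME/bin/tomcat-juli.jar:$TOMCAT_HOME/bin/commons-daemon.jar" \
        -outfile ./logs/catalina.out -errfile ./logs/catalina.err \
        -Xmx2048m -Xms1024m \
        -Dsun.security.ssl.allowUnsafeRenegotiation=true "$BOOTSTRAP"
}

stop_tomcat() {
    cd "$TOMCAT_HOME" || return 1
    ./bin/jsvc -pidfile "$PIDFILE" -stop "$BOOTSTRAP"
}

# Dispatch on the first argument; do nothing when none is given.
case "${1:-}" in
    start)  tomcat_running || start_tomcat ;;
    stop)   tomcat_running && stop_tomcat ;;
    status) tomcat_running && echo "running" || echo "stopped" ;;
esac
```

Saved as, say, tomcat-ctl.sh (a hypothetical name), it would be called as "sh tomcat-ctl.sh start", "sh tomcat-ctl.sh stop", or "sh tomcat-ctl.sh status".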
Download the Thredds war file and put it into the Tomcat "webapps" directory:
cd $TOMCAT_HOME/webapps
wget http://rainbow.llnl.gov/dist/thredds/4.1.6/thredds.war
Restart Tomcat (after the restart, the war file will be extracted under the webapps directory).
Edit $TOMCAT_HOME/conf/tomcat-users.xml. Search for the user entries in tomcat-users.xml and add a user (dnode_user) with administrative privileges. The file should look like:
<tomcat-users>
  <role rolename="tdsConfig"/>
  <role rolename="manager"/>
  <role rolename="tdrAdmin"/>
  <user username="dnode_user" password="digest_password_here" roles="tdrAdmin,tdsConfig"/>
</tomcat-users>
To fill in the password, first generate a password hash by running:
$TOMCAT_HOME/bin/digest.sh -a SHA <password for dnode_user>
Use this password hash in the user line of tomcat-users.xml, as shown below. Then restart Tomcat.
<user username="dnode_user" password="<PASSWORD_HASH_HERE>" roles="tdrAdmin,tdsConfig"/>
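Producing the user line from the digest output can be scripted. The sketch below assumes that digest.sh prints one line of the form "password:hash" (this is what we believe Tomcat 6 does; please check the output of your own version). Both helper names are hypothetical.

```shell
# Keep only the text after the last ':' of each input line.
digest_only() {
    sed 's/^.*://'
}

# Print a <user> line for user $1 with password hash $2.
user_entry() {
    printf '<user username="%s" password="%s" roles="tdrAdmin,tdsConfig"/>\n' "$1" "$2"
}
```

A possible use: hash=$($TOMCAT_HOME/bin/digest.sh -a SHA 'your password' | digest_only), followed by user_entry dnode_user "$hash". The printed line still has to be pasted into the users file by hand.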
Configure Tomcat for digest authentication. Create the directory $TOMCAT_HOME/conf/Catalina/localhost if it does not exist. Add or edit the thredds.xml file in $TOMCAT_HOME/conf/Catalina/localhost. It should look like:
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/thredds">
  <Realm className="org.apache.catalina.realm.MemoryRealm" digest="SHA" />
</Context>
A sample web.xml file is given here. Make sure SSL is enabled (the ESG publisher uses it to re-initialize the Thredds Data Server and check logs). It should look like:
<user-data-constraint>
  <transport-guarantee>CONFIDENTIAL</transport-guarantee>
</user-data-constraint>
Note: esg.ini is important. Make sure "thredds_url", "thredds_reinit_error_url", and "thredds_reinit_url" are correct (they should point to the full host name, not localhost).
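A quick way to catch a leftover localhost in these three settings is to grep esg.ini for it. This is a sketch only; thredds_urls_on_localhost is a hypothetical helper, and it assumes each setting sits on one line in the form "name = url".

```shell
# Print every thredds_url / thredds_reinit_url / thredds_reinit_error_url
# line of the given esg.ini that still points at localhost or 127.0.0.1.
# The exit status is non-zero when no such line is found.
thredds_urls_on_localhost() {
    grep -E '^[[:space:]]*thredds_(url|reinit_url|reinit_error_url)[[:space:]]*=' "$1" \
        | grep -E '//(localhost|127\.0\.0\.1)([:/]|$)'
}
```

For example: thredds_urls_on_localhost ~/.esgcet/esg.ini && echo "fix the URLs listed above"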
Here, we are using the gateway node ESG-PCMDI (pcmdi3.llnl.gov/esgcet) as the MyProxy end-point (default MyProxy port 2119).
Download the necessary classes into a temporary directory (the second URL is quoted so that the shell does not expand $SavingTrustManager):
cd $TOMCAT_HOME/temp
wget http://rainbow.llnl.gov/dist/utils/InstallCert.class
wget 'http://rainbow.llnl.gov/dist/utils/InstallCert$SavingTrustManager.class'
The end-point is pcmdi3.llnl.gov, the SSL port is 443, and the default password for the SSL end-point is "changeit":
cd $TOMCAT_HOME/conf
cp jssecacerts jssecacerts.bak
$JAVA_HOME/bin/java -classpath .:$TOMCAT_HOME/temp InstallCert pcmdi3.llnl.gov:443 <password>
This will add certificate to keystore "jssecacerts". Change owner and permission of that file (chmod 644 jssecacerts; chown tomcat jssecacerts).
Note: The installation script also copies jssecacerts into $JAVA_HOME/jre/lib/security. This is probably not necessary, because its path has already been specified in server.xml.
cp -p $TOMCAT_HOME/conf/jssecacerts $JAVA_HOME/jre/lib/security
Add the following to the environment (optional):
export ESG_GATEWAY_NAME=ESG-PCMDI
export ESG_GATEWAY_SVC_ROOT=pcmdi3.llnl.gov/esgcet
export MYPROXY_SERVER=pcmdi3.llnl.gov
Download the ESG token validator filters:
cd $TOMCAT_HOME/webapps/thredds/WEB-INF/lib
wget http://rainbow.llnl.gov/dist/filters/eske.jar
wget http://rainbow.llnl.gov/dist/filters/hessian-3.0.20.jar
Now edit $TOMCAT_HOME/webapps/thredds/WEB-INF/web.xml and add the ESG security token filter and servlet entries.
A sample web.xml file is given here (send an email).
More information about the ESG token validation filter can be found in the ESG data node documentation.
Restart Tomcat. Make sure PostgreSQL is running.
$CDAT_HOME/bin/esgsetup --thredds --publish --gateway pcmdi3.llnl.gov
In this step, you need to specify the Thredds content directory and the ESG data path root directory. If they don't exist, create the root directory (and the replica directory).
mkdir $INSTALL_HOME/data
mkdir $INSTALL_HOME/data.replica
You may also need to edit the esg.ini file and change the path for the content directory. It should look like this (give the full path):
thredds_dataset_roots = esg_dataroot | /project/projectdirs/esg/datanode/data
Make sure thredds_username and thredds_password are set correctly.
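To double-check the thredds_dataset_roots entry, the path for a given root name can be pulled out of esg.ini and tested for existence. A sketch under the assumption that each root is written as "name | path" on one line, as in the example above; dataset_root_path is a hypothetical helper.

```shell
# Print the path configured for dataset root $2 in the esg.ini file $1.
# Expects entries of the form "name | path", optionally preceded on the
# same line by "thredds_dataset_roots =".
dataset_root_path() {
    awk -F'|' -v name="$2" '
        { sub(/^[ \t]*thredds_dataset_roots[ \t]*=/, "") }
        NF == 2 {
            key = $1
            path = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", path)
            if (key == name) print path
        }
    ' "$1"
}
```

For example: [ -d "$(dataset_root_path ~/.esgcet/esg.ini esg_dataroot)" ] || echo "data root directory is missing"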
Verify that everything is configured properly (don't forget to restart Tomcat) by creating the Thredds catalog for the dataset we scanned before (ESG_ROOT_ID is nersc).
esgpublish --use-existing pcdmi.nersc.test --noscan --thredds
This step might take some time. It will reinitialize the Thredds Data Server, so make sure the URLs are set correctly in ~/.esgcet/esg.ini.
Download and unpack the esg-node package:
cd $TOMCAT_HOME/temp/
wget http://rainbow.llnl.gov/dist/esg-node/esg-node.0.0.2.tar.gz
tar xzf esg-node.0.0.2.tar.gz
cd esg-node.0.0.2
Go to the Tomcat webapps directory, extract the war file, and replace the tokens in node.properties:
mkdir -p $TOMCAT_HOME/webapps/esg-node
cd $TOMCAT_HOME/webapps/esg-node
jar xvf $TOMCAT_HOME/temp/esg-node.0.0.2/esg-node.war
cd WEB-INF/classes
Edit node.properties (in webapps/esg-node/WEB-INF/classes) and change the following options:
db.driver -> org.postgresql.Driver
db.protocol -> jdbc:postgresql
db.host -> localhost
db.port -> 5432
db.database -> esgcet
db.user -> dbsuper
db.password -> <dbsuper password>
mail.smtp.host -> <mail.admin.address>
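These edits can also be applied from the shell. The helper below is a sketch, not part of the ESG distribution: set_property is a hypothetical name, and it assumes node.properties uses plain "key=value" lines with no spaces around "=". Values containing backslashes would need extra care, because awk interprets escape sequences in -v assignments.

```shell
# Set key $2 to value $3 in the properties file $1: replace the existing
# "key=..." line if there is one, otherwise append a new line.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$file.tmp.$$
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "="; done = 0 }
        $1 == k { print k "=" v; done = 1; next }
        { print }
        END { if (!done) print k "=" v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example: set_property node.properties db.port 5432, then set_property node.properties db.database esgcet, and so on for the remaining options.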
Create esgcet database if not created yet
createdb esgcet
Configure PostgreSQL by running:
cd $TOMCAT_HOME/temp/esg-node.0.0.2/db
ant -buildfile database-tasks.ant.xml \
    -Dnode.property.file=$TOMCAT_HOME/webapps/esg-node/WEB-INF/classes/node.properties \
    -Dsql.jdbc.base.url=jdbc:postgresql://localhost:5432/ \
    -Dsql.jdbc.database.name=esgcet \
    -Dsql.jdbc.database.user=dbsuper \
    -Dsql.jdbc.database.password=<dbsuper_password> \
    -Dsql.jdbc.driver.jar=$TOMCAT_HOME/webapps/esg-node/WEB-INF/lib/postgresql-8.3-603.jdbc3.jar \
    make_node_db
Restart Tomcat.
Set the installation directory (INSTALL_HOME) and create an environment file, "$INSTALL_HOME/env.sh", that can be sourced to set up the environment. The first line records the current value of INSTALL_HOME; the heredoc delimiter is quoted so that the remaining variables are expanded when the file is sourced, not when it is written.
echo "export INSTALL_HOME=$INSTALL_HOME" > $INSTALL_HOME/env.sh
cat >> $INSTALL_HOME/env.sh << 'EOF'
export CDAT_HOME=$INSTALL_HOME/cdat
export TOMCAT_HOME=$INSTALL_HOME/tomcat
export CATALINA_HOME=$TOMCAT_HOME
export GLOBUS_HOME=$INSTALL_HOME/globus
export LD_LIBRARY_PATH=$CDAT_HOME/lib:$CDAT_HOME/Externals/lib:$GLOBUS_HOME/lib:$LD_LIBRARY_PATH
export PATH=$CDAT_HOME/bin:$CDAT_HOME/Externals/bin:$TOMCAT_HOME/bin:$GLOBUS_HOME/bin:$PATH
export PGDATA=$INSTALL_HOME/pgsql/data
export PGUSER=dbsuper
export PGPORT=5432
export PGHOST=localhost
export ESG_ROOT_ID=nersc
export ESG_GATEWAY_NAME=ESG-PCMDI
export ESG_GATEWAY_SVC_ROOT=pcmdi3.llnl.gov/esgcet
export MYPROXY_SERVER=pcmdi3.llnl.gov
export X509_CERT_DIR=~/.globus/certificates
EOF
Obtain a proxy certificate from the gateway, then list, publish, and unpublish the test dataset scanned earlier:
myproxy-logon -s pcmdi3.llnl.gov -l <username_of_your_account_from gateway> -p 2119 -o ~/.globus/certificate-file -T
esglist_files pcdmi.nersc.test
esgpublish --use-existing pcdmi.nersc.test --noscan --publish
esgunpublish --skip-thredds pcdmi.nersc.test
export CURL_HOME=$INSTALL_HOME/curl
wget http://curl.haxx.se/download/curl-7.20.1.tar.gz
tar xvzf curl-7.20.1.tar.gz
cd curl-7.20.1
./configure --prefix=$CURL_HOME
make all
make install
$CURL_HOME/bin/curl --version
export PATH=$CURL_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CURL_HOME/lib:$LD_LIBRARY_PATH
export GIT_HOME=$INSTALL_HOME/git
wget http://kernel.org/pub/software/scm/git/git-1.7.1.tar.gz
tar xvzf git-1.7.1.tar.gz
cd git-1.7.1
./configure --prefix=$GIT_HOME
make all
make install
export PATH=$GIT_HOME/bin:$PATH
export LD_LIBRARY_PATH=$GIT_HOME/lib:$LD_LIBRARY_PATH
export JAVA_HOME=$INSTALL_HOME/java
wget http://rainbow.llnl.gov/dist/java/1.6.0_20/jdk1.6.0_20-32.tar.gz
tar xvfz jdk1.6.0_20-32.tar.gz -C $INSTALL_HOME
ln -s $INSTALL_HOME/jdk1.6.0_20 $JAVA_HOME
$JAVA_HOME/bin/java -version
export PATH=$JAVA_HOME/bin:$PATH
export ANT_HOME=$INSTALL_HOME/ant
wget http://www.trieuvan.com/apache/ant/binaries/apache-ant-1.8.1-bin.tar.gz
tar xvfz apache-ant-1.8.1-bin.tar.gz -C $INSTALL_HOME
ln -s $INSTALL_HOME/apache-ant-1.8.1 $ANT_HOME
$ANT_HOME/bin/ant -version
export PATH=$ANT_HOME/bin:$PATH
export PGHOME=$INSTALL_HOME/pgsql
wget http://ftp9.us.postgresql.org/pub/mirrors/postgresql/source/v8.4.3/postgresql-8.4.3.tar.gz
tar xvzf postgresql-8.4.3.tar.gz
cd postgresql-8.4.3
./configure --prefix=$PGHOME --enable-thread-safety
make
make install
cd contrib/tablefunc
make
make install
export PATH=$PGHOME/bin:$PATH
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH