We have found it useful to stress-test a release candidate by giving it to Igor Sfiligoi to run in glideinWMS. This exercises a lot of Condor features that are important to OSG-type users: glidein, Condor-G, the CE schedd, CCB. Igor suggested that it would be a good idea if we could run this test in-house, and we agree, so we have set up the machinery to do so. This document describes roughly how we installed it.

High-level View

glidein factory setup

Go to c157.chtc.wisc.edu. Install the factory in /data1/itb-stress. Where an unprivileged user account is needed, we are using zmiller.

Documentation for the glideinWMS factory may be found here: http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.v2/install/factory_install.html

cd /data1/itb-stress
cvs -d :pserver:anonymous@cdcvs.fnal.gov:/cvs/cd_read_only co -r snapshot_091214 glideinWMS

perl-Time-HiRes appears to be already installed as part of the main perl install
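
A quick way to confirm that (prints the module's version):
perl -MTime::HiRes -e 'print "$Time::HiRes::VERSION\n"'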

httpd was already installed, so we just enabled and started it:
/sbin/chkconfig --level 3 httpd on
/etc/init.d/httpd start

wget http://atlas.bu.edu/~youssef/pacman/sample_cache/tarballs/pacman-3.28.tar.gz
tar xvzf pacman-3.28.tar.gz
cd pacman-3.28
. ./setup.sh
cd ..

mkdir osg_1_2_4
cd osg_1_2_4/

export VDTSETUP_CONDOR_CONFIG=/data1/itb-stress/personal-condor/etc/condor_config
export VDTSETUP_CONDOR_BIN=/data1/itb-stress/personal-condor/bin

pacman -get http://software.grid.iu.edu/osg-1.2:client
. ./setup.sh

ln -s /etc/grid-security/certificates globus/TRUSTED_CA

cd ..

yum install rrdtool-python


M2Crypto python module appears to be already installed
  m2crypto-0.16-6.el5.3.x86_64
  This is older than the 0.17 Igor calls for, but I am trying it to see if it works.
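
To double-check which M2Crypto the system python actually sees (M2Crypto exposes a version string in the releases we have used):
rpm -q m2crypto
python -c "import M2Crypto; print M2Crypto.version"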

Download javascriptrrd-0.4.2.zip and unzip it.

Download flot-0.5.tar.gz and untar it.
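
Unpack both next to the other pieces (assuming the archives were downloaded to /data1/itb-stress):
cd /data1/itb-stress
unzip javascriptrrd-0.4.2.zip
tar xvzf flot-0.5.tar.gz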

I made a service cert for glideinwms and installed it in certs.  (I used the following command.)
cert-gridadmin -email dan@hep.wisc.edu -affiliation osg -vo glow -host c157.chtc.wisc.edu -service glideinwms

A proxy for the factory may be made with this command:
grid-proxy-init -cert certs/c157.chtc.wisc.edu_glideinwms_cert.pem -key certs/c157.chtc.wisc.edu_glideinwms_key.pem -hours 100 -out certs/c157.chtc.wisc.edu_proxy.pem
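
To verify the resulting proxy and its remaining lifetime:
grid-proxy-info -file certs/c157.chtc.wisc.edu_proxy.pem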


cd glideinWMS/install
chown zmiller /data1/itb-stress
su zmiller
./glideinWMS_install


choose [4] pool Collector
Where do you have the Condor tarball? /afs/cs.wisc.edu/p/condor/public/binaries/v7.4/7.4.1/condor-7.4.1-linux-x86_64-rhel5-dynamic.tar.gz

Where do you want to install it?: [/home/zmiller/glidecondor] /data1/itb-stress/glidecondor
Where are the trusted CAs installed?: [/data1/itb-stress/osg_1_2_4/globus/TRUSTED_CA] /etc/grid-security/certificates


Will you be using a proxy or a cert? (proxy/cert) cert
Where is your certificate located?: /data1/itb-stress/certs/c157.chtc.wisc.edu_glideinwms_cert.pem
Where is your certificate key located?: /data1/itb-stress/certs/c157.chtc.wisc.edu_glideinwms_key.pem


What name would you like to use for this pool?: [My pool] ITB-Stress
How many secondary schedds do you want?: [9]

In this version of glideinWMS, the condor_config.local needs to be edited
to replace HOSTALLOW with ALLOW.
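
One way to do the replacement (the file path assumes the glidecondor install location chosen above):
sed -i 's/HOSTALLOW/ALLOW/g' /data1/itb-stress/glidecondor/etc/condor_config.local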


./glideinWMS_install
[2] Glidein Factory

Where is your proxy located?: /data1/itb-stress/certs/c157.chtc.wisc.edu_proxy.pem
Where will you host your config and log files?: [/home/zmiller/glideinsubmit] /data1/itb-stress/glideinsubmit

Give a name to this Glidein Factory?: [mySites-c157] ITB-Stress

Do you want to use CCB (requires Condor 7.3.0 or better)?: (y/n) y
Do you want to use gLExec?: (y/n) n
Force VO frontend to provide its own proxy?: (y/n) [y]
Do you want to fetch entries from RESS?: (y/n) [n]
Do you want to fetch entries from BDII?: (y/n) [n]

Entry name (leave empty when finished): ITB
Gatekeeper for 'ITB': vdt-itb.cs.wisc.edu/jobmanager-condornfslite
RSL for 'ITB':
Work dir for 'ITB':
Site name for 'ITB': [ITB]

Should glideins use the more efficient Match authentication (works for Condor v7.1.3 and later)?: (y/n) y
Do you want to create the glidein (as opposed to just the config file)?: (y/n) [n]

/data1/itb-stress/glideinWMS/creation/create_glidein /data1/itb-stress/glideinsubmit/glidein_v1_0.cfg/glideinWMS.xml

Submit files can be found in /data1/itb-stress/glideinsubmit/glidein_v1_0
Support files are in /var/www/html/glidefactory/stage/glidein_v1_0
Monitoring files are in /var/www/html/glidefactory/monitor/glidein_v1_0

cd /data1/itb-stress
# make setup.sh set X509_CERT_DIR and X509_USER_PROXY
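# for example, based on the paths used above:
#   export X509_CERT_DIR=/etc/grid-security/certificates
#   export X509_USER_PROXY=/data1/itb-stress/certs/c157.chtc.wisc.edu_proxy.pem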
. ./setup.sh
glideinsubmit/glidein_v1_0/factory_startup start


I added the following entries to glidecondor/certs/condor_mapfile:
GSI "/DC=org/DC=doegrids/OU=Services/CN=glideinwms/c158.chtc.wisc.edu" vofrontend
FS zmiller gfactory

The FS mapping of zmiller to gfactory appears to be necessary to make the factory
ad have AuthenticatedIdentity gfactory, which is what I configured the VO frontend
to expect.  I am not sure why it is using FS to authenticate to the collector, though.
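
One way to check what identity the collector actually recorded (condor_status -any dumps every ad type it holds):
condor_status -any -long -pool c157.chtc.wisc.edu | grep AuthenticatedIdentity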

submit node setup

The submit node is c158.chtc.wisc.edu.

The documentation for setting up the glideinWMS VO frontend and pool collector are here: http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.v2/install/

This node will run the glidein pool collector, the schedd that submits jobs, and the
glideinWMS VO frontend.

Install all the same glideinWMS prereqs as on c157.

I made a service cert for glideinwms and installed it in certs.  (I used the following command.)
cert-gridadmin -email dan@hep.wisc.edu -affiliation osg -vo glow -host c158.chtc.wisc.edu -service glideinwms

To create the proxy used by the VO frontend:
grid-proxy-init -cert certs/c158.chtc.wisc.edu_glideinwms_cert.pem -key certs/c158.chtc.wisc.edu_glideinwms_key.pem -hours 100 -out certs/c158.chtc.wisc.edu_proxy.pem

cd glideinWMS/install/
./glideinWMS_install

[4] User Pool Collector
[5] User Schedd
Please select: 4,5

Where do you have the Condor tarball? /afs/cs.wisc.edu/p/condor/public/binaries/v7.4/7.4.1/condor-7.4.1-linux-x86_64-rhel5-dynamic.tar.gz

Where do you want to install it?: [/home/zmiller/glidecondor] /data1/itb-stress/glidecondor

Do you want to split the config files between condor_config and condor_config.local?: (y/n) [y] y

Do you want to get it from VDT?: (y/n) n
Where are the trusted CAs installed?: [/etc/grid-security/certificates]

Where is your certificate located?: /data1/itb-stress/certs/c158.chtc.wisc.edu_glideinwms_cert.pem
Where is your certificate key located?: /data1/itb-stress/certs/c158.chtc.wisc.edu_glideinwms_key.pem

How many slave collectors do you want?: [5] 50
What name would you like to use for this pool?: [My pool] ITB-Stress-Submit
What port should the collector be running?: [9618]

(I had to hand-edit condor_config.local to change HOSTALLOW --> ALLOW, as on c157.)

Installing user submit schedds

Do you want to use the more efficient Match authentication (works for Condor v7.1.3 and later)?: (y/n) y
How many secondary schedds do you want?: [9]

./glideinWMS_install
[7] VO Frontend

Where is javascriptRRD installed?: /data1/itb-stress/javascriptrrd-0.4.2

Where is your proxy located?: /data1/itb-stress/certs/c158.chtc.wisc.edu_proxy.pem

For security reasons, we need to know what will the WMS collector map us to.
It will likely be something like joe@collector1.myorg.com
What is the mapped name?: vofrontend@c157.chtc.wisc.edu
Where will you host your config and log files?: [/home/zmiller/frontstage] /data1/itb-stress/frontstage

Where will the web data be hosted?: [/var/www/html/vofrontend]

What node is the WMS collector running?: c157.chtc.wisc.edu
What is the classad identity of the glidein factory?: [gfactory@c157.chtc.wisc.edu]

Collector name(s): [c158.chtc.wisc.edu:9618] c158.chtc.wisc.edu:9618,c158.chtc.wisc.edu:9620-9669

What kind of jobs do you want to monitor?: [(JobUniverse==5)&&(GLIDEIN_Is_Monitor =!= TRUE)&&(JOB_Is_Monitor =!= TRUE)]

The following schedds have been found:
 [1] c158.chtc.wisc.edu
 [2] schedd_jobs1@c158.chtc.wisc.edu
 [3] schedd_jobs2@c158.chtc.wisc.edu
 [4] schedd_jobs3@c158.chtc.wisc.edu
 [5] schedd_jobs4@c158.chtc.wisc.edu
 [6] schedd_jobs5@c158.chtc.wisc.edu
 [7] schedd_jobs6@c158.chtc.wisc.edu
 [8] schedd_jobs7@c158.chtc.wisc.edu
 [9] schedd_jobs8@c158.chtc.wisc.edu
 [10] schedd_jobs9@c158.chtc.wisc.edu
Do you want to monitor all of them?: (y/n) y

What kind of jobs do you want to monitor?: [(JobUniverse==5)&&(GLIDEIN_Is_Monitor =!= TRUE)&&(JOB_Is_Monitor =!= TRUE)]
Give a name to the main group: [main]
Match string: [True]

VO frontend proxy = '/data1/itb-stress/certs/c158.chtc.wisc.edu_proxy.pem'
Do you want to use it to submit glideins: (y/n) [y]


/data1/itb-stress/glideinWMS/creation/create_frontend /data1/itb-stress/frontstage/instance_v1_0.cfg/frontend.xml
Work files can be found in /data1/itb-stress/frontstage/frontend_myVO-c158-v1_0
Support files are in /var/www/html/vofrontend/stage/frontend_myVO-c158-v1_0
Monitoring files are in /var/www/html/vofrontend/monitor/frontend_myVO-c158-v1_0

Added /DC=org/DC=doegrids/OU=Services/CN=glideinwms/c157.chtc.wisc.edu to GSI_DAEMON_NAME.
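
Concretely, that is a line like the following, appending to whatever list is already configured (which glidecondor config file it belongs in is from memory):
GSI_DAEMON_NAME = $(GSI_DAEMON_NAME), /DC=org/DC=doegrids/OU=Services/CN=glideinwms/c157.chtc.wisc.edu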

# To start it up:
cd frontstage/frontend_myVO-c158-v1_0/
./frontend_startup start

# Logs are in
frontstage/frontend_myVO-c158-v1_0/group_main/log
frontstage/frontend_myVO-c158-v1_0/log


Initially, the glideins were failing on startup because the -dir option to glidein_startup.sh
was being passed with no value.  We edited the factory xml file and set the directory to '.':

bash-3.2$ vi ../glidein_v1_0.cfg/glideinWMS.xml
bash-3.2$ ./factory_startup reconfig ../glidein_v1_0.cfg/glideinWMS.xml
Shutting down glideinWMS factory v1_0@ITB-Stress:          [OK]
Reconfigured glidein 'v1_0'
Active entries are:  ITB
Submit files are in /data1/itb-stress/glideinsubmit/glidein_v1_0
Reconfiguring the factory                                  [OK]
Starting glideinWMS factory v1_0@ITB-Stress:               [OK]
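
For reference, the edited entry element in glideinWMS.xml ends up looking roughly like this (other attributes omitted; as best we can tell, work_dir is what feeds the -dir option):
<entry name="ITB" enabled="True" gatekeeper="vdt-itb.cs.wisc.edu/jobmanager-condornfslite" work_dir=".">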

setting up CE

The CE on vdt-itb.cs.wisc.edu was already set up except for a condor jobmanager.

SET UP CONDOR

1) installed condor-rhel5.repo into /etc/yum.repos.d

2) yum install condor

3) edit /etc/condor/condor_config (possibly unnecessary; the real changes went into condor_config.local in step 4)

4) edit /etc/condor/condor_config.local
   change CONDOR_HOST = chopin.cs.wisc.edu
   change COLLECTOR_HOST = $(CONDOR_HOST):15396
   change DAEMON_LIST = MASTER, SCHEDD

5) export CONDOR_CONFIG=/etc/condor/condor_config

6) /usr/sbin/condor_master
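
A quick sanity check that the master and schedd came up:
ps -ef | grep condor_
/usr/bin/condor_q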

okay, that's working.


SET UP JOBMANAGER

7) cd /opt/itb

8) DO NOT DO THIS: pacman -trust-all-caches -get http://software.grid.iu.edu/osg-1.1.14:Globus-Condor-Setup

9) do:
 [root@vdt-itb ~]# export CONDOR_CONFIG=/etc/condor/condor_config
 [root@vdt-itb ~]# export VDT_CONDOR_LOCATION=/usr
 [root@vdt-itb ~]# export VDT_CONDOR_CONFIG=/etc/condor/condor_config

10) cd /opt/itb/

11)   pacman -install http://vdt.cs.wisc.edu/vdt_2099_cache:Globus-CondorNFSLite-Setup

12)   edit $VDT_LOCATION/globus/lib/perl/Globus/GRAM/JobManager/condornfslite.pm:
	update the condor_config location and the paths to condor_rm and condor_submit

13) edit $VDT_LOCATION/edg/etc/grid-mapfile-local, add DNs
14) run $VDT_LOCATION/edg/sbin/edg-mkgridmap
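
With the grid-mapfile in place, an end-to-end smoke test is to run a trivial job through the new jobmanager (this needs a valid user proxy on the submitting side):
globus-job-run vdt-itb.cs.wisc.edu/jobmanager-condornfslite /bin/hostname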