Page History
- 2015-Jun-10 15:19 emfajard
- 2015-Feb-11 12:14 emfajard
- 2014-Dec-01 15:35 emfajard
- 2014-Dec-01 15:02 emfajard
- 2014-Nov-05 10:35 emfajard
- 2014-Oct-29 14:41 emfajard
- 2014-Oct-29 13:03 tannenba
- 2014-Oct-29 10:40 emfajard
- 2014-Oct-24 10:15 emfajard
- 2014-Oct-22 16:48 emfajard
- 2014-Oct-22 15:37 emfajard
- 2014-Oct-16 11:54 jfrey
- 2014-Oct-16 11:53 jfrey
This page documents the HTCondor configuration (i.e., changes to the default RPM) and any Linux kernel parameter tuning required. It is divided into three sections: configuration and tuning on the central manager, on the submit nodes, and on the execute machines.
StartD
UPDATE_INTERVAL=$RANDOM_INTEGER(540, 740, 1)
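The setting above gives each startd a randomly chosen update interval between 540 and 740 seconds, presumably so that the startds in a large pool do not all send their updates to the collector at the same moment. The following is only a Python simulation of that idea, not HTCondor code: `random_integer` and `sample_intervals` are hypothetical names, and `random_integer` is a stand-in for the `$RANDOM_INTEGER(min, max, step)` macro.

```python
import random


def random_integer(low, high, step=1, rng=random):
    """Hypothetical stand-in for $RANDOM_INTEGER(low, high, step).

    Picks one value from low, low + step, ... up to and including high.
    """
    return rng.randrange(low, high + 1, step)


def sample_intervals(n_startds, seed=0):
    """Draw one UPDATE_INTERVAL value (in seconds) per simulated startd."""
    rng = random.Random(seed)
    return [random_integer(540, 740, 1, rng) for _ in range(n_startds)]


if __name__ == "__main__":
    intervals = sample_intervals(1000)
    # Every simulated startd gets its own interval inside [540, 740].
    print("min:", min(intervals), "max:", max(intervals))
    print("distinct values:", len(set(intervals)))
```

Because each startd draws its own value, update times drift apart over successive cycles instead of staying aligned.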
Central Manager
The central manager is cmssrv242.fnal.gov, which has 64 GB of memory and 16 cores. It runs a collector tree with 200 child collectors.
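The secondary-collector file listed further down this page defines one child collector on port 9620 and notes a "similar entry through port 9819", which is 200 ports in total and matches the 200 child collectors mentioned here. A sketch of how such a file could be generated follows. It assumes the ports are contiguous with one collector per port, and that every entry follows the same naming and log-file pattern as the 9620 entry; the function names are hypothetical.

```python
def child_collector_stanza(port):
    """Return the config lines for one child collector listening on `port`.

    Assumption: every child collector mirrors the COLLECTOR9620 entry.
    """
    name = f"COLLECTOR{port}"
    return "\n".join([
        f"{name} = $(COLLECTOR)",
        f'{name}_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector{port}Log"',
        f"{name}_ARGS = -f -p {port}",
        f"DAEMON_LIST=$(DAEMON_LIST), {name}",
    ])


def child_collector_config(first_port=9620, last_port=9819):
    """Build one stanza per port in the inclusive range."""
    ports = range(first_port, last_port + 1)
    return "\n".join(child_collector_stanza(p) for p in ports) + "\n"


if __name__ == "__main__":
    text = child_collector_config()
    print(text.count("_ARGS = "), "child collectors generated")
    print(child_collector_stanza(9620))
```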
Central Manager Configuration:
- System Configurations:
    net.core.rmem_max = 10000000 (via sysctl)
    renice -5 main Collector Process
- 00_gwms_general.config
    CONDOR_HOST=$(FULL_HOSTNAME)
    UID_DOMAIN=$(FULL_HOSTNAME)
    FILESYSTEM_DOMAIN=$(FULL_HOSTNAME)
    LOCK = $(LOG)
    DAEMON_LIST = MASTER
    SEC_DAEMON_SESSION_DURATION = 50000
    SEC_DEFAULT_AUTHENTICATION = REQUIRED
    SEC_DEFAULT_AUTHENTICATION_METHODS = FS,GSI
    SEC_READ_AUTHENTICATION = OPTIONAL
    SEC_CLIENT_AUTHENTICATION = OPTIONAL
    DENY_WRITE = anonymous@*
    DENY_ADMINISTRATOR = anonymous@*
    DENY_DAEMON = anonymous@*
    DENY_NEGOTIATOR = anonymous@*
    DENY_CLIENT = anonymous@*
    SEC_DEFAULT_ENCRYPTION = OPTIONAL
    SEC_DEFAULT_INTEGRITY = REQUIRED
    SEC_READ_INTEGRITY = OPTIONAL
    SEC_CLIENT_INTEGRITY = OPTIONAL
    SEC_READ_ENCRYPTION = OPTIONAL
    SEC_CLIENT_ENCRYPTION = OPTIONAL
    HOSTALLOW_WRITE = *
    ALLOW_WRITE = $(HOSTALLOW_WRITE)
- 01_gwms_collectors.config
    COLLECTOR_NAME = frontend_service
    COLLECTOR_HOST = $(CONDOR_HOST)
    COLLECTOR.USE_VOMS_ATTRIBUTES = False
    COLLECTOR_MAX_FILE_DESCRIPTORS=80000
    SCHEDD_MAX_FILE_DESCRIPTORS = 80000
    SHARED_PORT_MAX_FILE_DESCRIPTORS = 80000
    DAEMON_LIST = $(DAEMON_LIST), COLLECTOR, NEGOTIATOR
    NEGOTIATOR_POST_JOB_RANK = MY.LastHeardFrom
    NEGOTIATOR_INTERVAL = 60
    NEGOTIATOR_MAX_TIME_PER_SUBMITTER=60
    NEGOTIATOR_MAX_TIME_PER_PIESPIN=20
    PREEMPTION_REQUIREMENTS = False
    NEGOTIATOR_INFORM_STARTD = False
    NEGOTIATOR.USE_VOMS_ATTRIBUTES = False
    NEGOTIATOR_CONSIDER_PREEMPTION = False
    CONDOR_VIEW_HOST = $(COLLECTOR_HOST)
- 03_gwms_local.config
    GSI_DAEMON_TRUSTED_CA_DIR= /etc/grid-security/certificates
    GSI_DAEMON_CERT = /etc/grid-security/hostcert.pem
    GSI_DAEMON_KEY = /etc/grid-security/hostkey.pem
    CERTIFICATE_MAPFILE= /etc/condor/certs/condor_mapfile
- 10_high_availability.config
    CENTRAL_MANAGER1 = cmssrv242.fnal.gov
    COLLECTOR_HOST = $(CENTRAL_MANAGER1)
- 11_gwms_secondary_collectors.config
    COLLECTOR9620 = $(COLLECTOR)
    COLLECTOR9620_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector9620Log"
    COLLECTOR9620_ARGS = -f -p 9620
    DAEMON_LIST=$(DAEMON_LIST), COLLECTOR9620
    # Similar entry through port 9819
- 90_gwms_dns.config
    # Add certs to GSI_DAEMON_NAME with these subjects:
    # /DC=com/DC=DigiCert-Grid/O=Open Science Grid/OU=Services/CN=...
    # test-010.t2.ucsd.edu, test-008.t2.ucsd.edu, test-003.t2.ucsd.edu,
    # pilot01/test-008.t2.ucsd.edu, gtesterpilot01/test-001.t2.ucsd.edu,
    # test-009.t2.ucsd.edu, test-004.t2.ucsd.edu, test-005.t2.ucsd.edu,
    # test-006.t2.ucsd.edu, test-007.t2.ucsd.edu, cmssrv240.fnal.gov,
    # test-002.t2.ucsd.edu, cmssrv241.fnal.gov
- 96_scale_tweaks.config
    COLLECTOR_QUERY_WORKERS=16
    CONDOR_VIEW_HOST = 127.0.0.1
    MAX_FILE_DESCRIPTORS=10512
    NEGOTIATOR_MAX_TIME_PER_SUBMITTER=600
    NEGOTIATOR_MAX_TIME_PER_PIESPIN=500
    NEGOTIATOR_TRIM_SHUTDOWN_THRESHOLD=300
    NEGOTIATOR_MAX_TIME_PER_CYCLE=900
    CLASSAD_LIFETIME=1500
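Several settings appear in more than one of the files above (for example NEGOTIATOR_MAX_TIME_PER_SUBMITTER, CONDOR_VIEW_HOST and COLLECTOR_HOST). The page does not say where these files live; assuming they sit in an HTCondor local config directory, which is read in lexicographic filename order with later assignments overriding earlier ones, the effective values can be worked out as in the rough sketch below. The helper `apply_config_files` is hypothetical and deliberately simplified: it expands only self-references such as `DAEMON_LIST = $(DAEMON_LIST), ...` and leaves every other macro unexpanded.

```python
def apply_config_files(files):
    """Merge (filename, text) pairs in sorted filename order; last assignment wins.

    Simplification: only a self-reference like KEY = $(KEY), extra is
    expanded against the previous value; other $(...) macros stay as written.
    """
    effective = {}
    for _name, text in sorted(files):
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip()
            value = value.replace(f"$({key})", effective.get(key, ""))
            effective[key] = value
    return effective


# Excerpts of the files listed on this page (not the complete contents).
FILES = [
    ("00_gwms_general.config", "DAEMON_LIST = MASTER"),
    ("01_gwms_collectors.config", """
        COLLECTOR_HOST = $(CONDOR_HOST)
        DAEMON_LIST = $(DAEMON_LIST), COLLECTOR, NEGOTIATOR
        NEGOTIATOR_MAX_TIME_PER_SUBMITTER=60
        NEGOTIATOR_MAX_TIME_PER_PIESPIN=20
        CONDOR_VIEW_HOST = $(COLLECTOR_HOST)
    """),
    ("10_high_availability.config", "COLLECTOR_HOST = $(CENTRAL_MANAGER1)"),
    ("96_scale_tweaks.config", """
        CONDOR_VIEW_HOST = 127.0.0.1
        NEGOTIATOR_MAX_TIME_PER_SUBMITTER=600
        NEGOTIATOR_MAX_TIME_PER_PIESPIN=500
    """),
]

if __name__ == "__main__":
    for key, value in sorted(apply_config_files(FILES).items()):
        print(f"{key} = {value}")
```

Under that assumption, the values in 96_scale_tweaks.config are the ones in effect: the negotiator limits are 600 and 500 seconds rather than 60 and 20, and CONDOR_VIEW_HOST is 127.0.0.1.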
Submit Machines
These are the submit machines:
- cmssrv240.fnal.gov
- cmssrv241.fnal.gov
- test-003.t2.ucsd.edu
- test-004.t2.ucsd.edu
- test-005.t2.ucsd.edu
- test-006.t2.ucsd.edu
- test-007.t2.ucsd.edu
- test-009.t2.ucsd.edu
- test-010.t2.ucsd.edu
Execute Machines