description

condor_annex is (will be) a tool for expanding an HTCondor pool into the cloud. (It presently only supports AWS.)

Annex is both a verb -- the condor_annex tool doesn't scavenge cycles, but acquires them -- and a noun, referring to the collection of slots in the cloud.

motivation

A user of your HTCondor pool, Dr. Needs-Moore, needs more cycles in less time than your pool can provide. (This can happen for any number of reasons, but deadlines are a good and sufficient one.) However, Dr. Needs-Moore has money to spend on solving this problem. Solution: trade Dr. Needs-Moore's money for cycles. Enter the cloud.

constraints

We argue that the person whose money would be spent is the best one to decide when and how much to spend, so condor_annex must be usable by Dr. Needs-Moore. However, the good doctor isn't a systems administrator, so condor_annex needs to be simple and easy to use. At the same time, it should be hard to spend "too much" money. In particular, idle instances should shut themselves off, and there should be a limit to how long the instances will run in any case (a lease). (Naturally, it must be possible to extend the lease as necessary.)

Our prototype reduces the end-user's responsibilities to the following command:

condor_annex --project-id 'TheNeeds-MooreLab' --expiry '2015-12-18 23:59' --instances 16
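The lease implied by the --expiry argument can be illustrated with a short Python sketch. It is not taken from condor_annex itself: the helper name lease_remaining, the timestamp format (inferred from the example value above), and the decision to treat times as naive values in a single time zone are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed format, inferred from the example value '2015-12-18 23:59'.
EXPIRY_FORMAT = "%Y-%m-%d %H:%M"


def lease_remaining(expiry, now):
    """Return the time left before the lease expires (zero if already past).

    Hypothetical helper: 'expiry' is a string in EXPIRY_FORMAT and 'now' is
    a naive datetime assumed to be in the same time zone.
    """
    deadline = datetime.strptime(expiry, EXPIRY_FORMAT)
    return max(deadline - now, timedelta(0))


if __name__ == "__main__":
    # Half a day before the deadline used in the example command.
    print(lease_remaining("2015-12-18 23:59", datetime(2015, 12, 18, 12, 0)))
```

An instance that enforces its own lease could make a check like this periodically and shut down once the result reaches zero; extending the lease would then amount to supplying a later expiry.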

proposed first-time user set-up

  1. Dr. Needs-Moore contacts the pool administrator wanting to run more jobs (by a deadline).
  2. The administrator determines that Dr. Needs-Moore needs more cycles than their pool can provide (in that time).
  3. The administrator creates an AWS user for the doctor (using the AWS web console), downloads the doctor's credentials, massages them a bit, and sends the resulting file to the doctor. The administrator also sends along a project ID.
  4. The doctor copies the file to the right place and, if our packaging hasn't already, has a grad student install the AWS command-line tool.

installation

....

Quick Start

This section explains how to set up condor_annex for single-user testing. It assumes a fair degree of familiarity with AWS and HTCondor.

AWS

Start by creating a new AWS account, or logging into your existing account. You may prefer to do the former, since condor_annex is still experimental. For the same reason, condor_annex presently assumes that it has almost unlimited privileges. (In the future, there will be a high-privilege one-time set-up, and the end user will run condor_annex with a low-privilege AWS user or role.) You may want to create an IAM user with "Admin" privileges instead.

You may have noticed that a lot of information is missing from the command line above. condor_annex does all of its AWS work through the 'aws' command-line tool, which has a configuration file containing the access key and secret key for a given AWS account (or user or role). If, as above, the command line is incomplete, condor_annex will look up the missing information in the AWS account. This allows you (the annex administrator) to update and maintain the defaults without having to distribute files to your end users.

The four defaults you can set in the account are:

  1. An SSH keypair. condor_annex will use the one named "HTCondorAnnex".
  2. A VPC. condor_annex will use the one whose "name" tag is "HTCondorAnnex".
  3. VPC subnets. condor_annex will use the ones whose "name" tag is "HTCondorAnnex".
  4. Launch configurations. condor_annex will use the ones named "HTCondorAnnex-1" through "HTCondorAnnex-8".
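To check that these defaults exist in your account, you could look them up with the 'aws' command-line tool. The Python sketch below only builds the command lines; it does not run them, and it is not necessarily how condor_annex performs its own lookups. The function name lookup_commands is hypothetical, and the tag key "name" follows the text above (AWS tag keys are case-sensitive).

```python
import shlex

ANNEX_NAME = "HTCondorAnnex"


def lookup_commands(name=ANNEX_NAME, launch_config_count=8):
    """Build (but do not run) 'aws' invocations that look up the defaults.

    Hypothetical helper, for illustration only.
    """
    # Tag keys are case-sensitive; "name" is the key given in the text.
    tag_filter = "Name=tag:name,Values={}".format(name)
    launch_configs = [
        "{}-{}".format(name, i) for i in range(1, launch_config_count + 1)
    ]
    return [
        ["aws", "ec2", "describe-key-pairs", "--key-names", name],
        ["aws", "ec2", "describe-vpcs", "--filters", tag_filter],
        ["aws", "ec2", "describe-subnets", "--filters", tag_filter],
        ["aws", "autoscaling", "describe-launch-configurations",
         "--launch-configuration-names"] + launch_configs,
    ]


if __name__ == "__main__":
    # Print each command so it can be pasted into a shell.
    for argv in lookup_commands():
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Once the 'aws' tool is installed and configured, each list can be passed to subprocess.run to perform the lookup. If no region is configured, the commands would also need a --region argument.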

To speed your quick start, we provide the following Amazon Linux AMIs with HTCondor 8.4.2 pre-installed:

The launch configurations need only the instance type, the AMI ID, and the spot price, if any. Additional specifications are presently ignored.
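As an alternative to the web console, a launch configuration with just these three fields could be created from the 'aws' command-line tool. The Python sketch below builds such a command without running it. The function name, the instance type, the AMI ID, and the spot price in the example are placeholders, not values from this page, and the 'aws' tool itself may require further arguments (such as a region) depending on your account.

```python
def create_launch_config_command(index, instance_type, image_id, spot_price=None):
    """Build (but do not run) an 'aws autoscaling' command line.

    Hypothetical helper: 'index' selects the name HTCondorAnnex-<index>.
    The spot price is optional, matching "the spot price, if any".
    """
    argv = [
        "aws", "autoscaling", "create-launch-configuration",
        "--launch-configuration-name", "HTCondorAnnex-{}".format(index),
        "--image-id", image_id,
        "--instance-type", instance_type,
    ]
    if spot_price is not None:
        argv += ["--spot-price", str(spot_price)]
    return argv


if __name__ == "__main__":
    # "ami-EXAMPLE", "m4.large", and "0.05" are placeholders; substitute
    # your own AMI ID, instance type, and maximum spot price.
    print(" ".join(create_launch_config_command(1, "m4.large", "ami-EXAMPLE", "0.05")))
```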

Once you've used the web console to create all of these objects, you'll need to install and configure the 'aws' command-line tool. You don't need to set a default region, although if you don't, you'll need to specify one on the condor_annex command-line. Something along the lines of "yum install python-pip; pip install awscli" will usually get the job done.
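After configuring the 'aws' tool, a quick sanity check is to confirm that its credentials file contains both keys. The Python sketch below assumes the tool's usual INI-style layout, with a [default] section in ~/.aws/credentials; the function name credentials_present is hypothetical. The 'aws' tool can also read credentials from other places (environment variables, for example), so a negative result here is not conclusive.

```python
import configparser
import os


def credentials_present(text, profile="default"):
    """Return True if 'text' (INI format) has both keys for 'profile'.

    Hypothetical helper: it checks only that the keys are present and
    non-empty, not that AWS accepts them.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return bool(section.get("aws_access_key_id")) and bool(
        section.get("aws_secret_access_key")
    )


if __name__ == "__main__":
    # Assumed default location of the credentials file.
    path = os.path.expanduser("~/.aws/credentials")
    if os.path.exists(path):
        with open(path) as handle:
            print(credentials_present(handle.read()))
    else:
        print(False)
```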

HTCondor

condor_annex assumes that it is run from a shell with a live HTCondor installation. In particular, the security configuration should use pool password authentication, more or less as follows:

SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD
SEC_DEFAULT_AUTHENTICATION = REQUIRED
SEC_DEFAULT_AUTHENTICATION_METHODS = FS, PASSWORD
SEC_PASSWORD_FILE = $(LOCAL_DIR)/password_file
ALLOW_WRITE = condor_pool@*/* $(FULL_HOSTNAME) $(IP_ADDRESS)

The manual has information on how to generate 'password_file'. See

https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=CreatingPersonalHtcondor

for more information about installing a personal condor. See also

https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToEnablePoolPassword

for more information about enabling pool password.