This guide assumes that you already have an AWS account, as well as a log-in account on a Linux machine with an administrator who's willing to open a port for you.

Before using condor_annex for the first time, you'll have to do three things:

  1. install a personal Condor
  2. prepare your AWS account
  3. configure condor_annex

Instructions for each follow.

Install a personal Condor

We recommend that you install a personal Condor to make use of condor_annex; it's simpler to configure that way. Get started by following the instructions for CreatingPersonalHtcondor; be sure to download one of the tarballs for version 8.7.0 or later.

For the rest of these instructions, where you see LOCAL_DIR, replace it with the LOCAL_DIR defined by the installation instructions above; the value used in the examples was /scratch/local/condor84.

Configure a pool password

In this section, you'll configure your personal Condor to use a pool password. This is a simple but effective method of securing Condor's communications to AWS.

Add the following lines to LOCAL_DIR/condor_config.local.

LOCAL_DIR/condor_config.local
SEC_PASSWORD_FILE = LOCAL_DIR/condor_pool_password
SEC_DAEMON_INTEGRITY = REQUIRED
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_NEGOTIATOR_INTEGRITY = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD
ALLOW_DAEMON = condor_pool@*

You must also run the following command, which prompts you to enter a password:

$ condor_store_cred -c add
Account: condor_pool@yourmachine.tld

Enter password:

Enter a password.

(For more details, see HowToEnablePoolPassword.)
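
As a rough illustration of the LOCAL_DIR substitution, the following Python sketch renders the pool-password settings listed earlier in this section with a real path filled in. The function name render_pool_password_config is hypothetical (it is not part of HTCondor); the setting names and values are copied from the listing above.

```python
def render_pool_password_config(local_dir: str) -> str:
    """Return the pool-password settings with LOCAL_DIR filled in."""
    local_dir = local_dir.rstrip("/")
    lines = [
        f"SEC_PASSWORD_FILE = {local_dir}/condor_pool_password",
        "SEC_DAEMON_INTEGRITY = REQUIRED",
        "SEC_DAEMON_AUTHENTICATION = REQUIRED",
        "SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD",
        "SEC_NEGOTIATOR_INTEGRITY = REQUIRED",
        "SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED",
        "SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD",
        "SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD",
        "ALLOW_DAEMON = condor_pool@*",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The example path is the one used by the installation instructions.
    print(render_pool_password_config("/scratch/local/condor84"), end="")
```

The output can be appended to LOCAL_DIR/condor_config.local by hand; the sketch deliberately does not touch the file itself.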

Tell HTCondor about the open port

By default, HTCondor will use port 9618. If the Linux machine doesn't already have HTCondor installed, and the admin is willing to open that port, then you don't have to do anything. Otherwise, you'll need to add a line like the following to LOCAL_DIR/condor_config.local, where you must replace '9618' with whatever port the administrator opened for you.

LOCAL_DIR/condor_config.local
SHARED_PORT_PORT = 9618
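
Once HTCondor is running, it can help to confirm that something is actually listening on the chosen port. The following is a minimal sketch using only Python's standard library; port_is_listening is a hypothetical helper, and it only tests reachability from the machine where it runs, not whether the administrator's firewall rule admits traffic from AWS.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9618 is the default from the text; substitute the port opened for you.
    print(port_is_listening("127.0.0.1", 9618))
```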


Prepare your AWS account

You will need to provide HTCondor with an access key/secret key pair. For security reasons, you specify the location of a file containing the secret key instead of specifying the secret key directly; the same goes for the access key. If you don't already have these keys, you can create a new pair from the AWS web console; the process varies depending on which kind of account you have. (FIXME: (link to) Instructions for the root account.)
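
One way to create the two key files is sketched below in Python. The helper write_key_file is hypothetical, and restricting the files to their owner is a general precaution on a shared machine rather than something these instructions require.

```python
import os


def write_key_file(path: str, key: str) -> None:
    """Write one key to `path`, readable and writable by the owner only."""
    # O_EXCL refuses to overwrite an existing file; mode 0o600 applies at creation.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Strip stray whitespace picked up when copying from the web console.
        handle.write(key.strip() + "\n")
```

Call it once for the access key and once for the secret key, using whatever file names you like; you'll give HTCondor those paths when configuring condor_annex.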

Create a private S3 bucket

You'll need access to a private S3 bucket. (FIXME: (link to) Instructions for creating private bucket.) In the following instructions, we'll call this bucket 'privateBucketName'; replace that string, when you see it, with the actual name of the private bucket you created in this step.

Prepare the lease machinery

To avoid having to upload it every time, the annex assumes that the Lambda function it needs already exists and is configured to run as a role with the required permissions. We've provided a CloudFormation template that will create and configure the Lambda function for you [FIXME: where?]. Instructions follow for readers who haven't created a stack from a template file before. After logging into the AWS web console, do the following for each region you intend to use (you can do just 'us-east-1' to start, since that has the example AMI):

  1. Switch to the region. (The second drop-down box in from the upper right.)
  2. Switch to CloudFormation. (In the Services menu, under Management.)
  3. Click the "Create Stack" button.
  4. Upload the template using the "Browse..." button.
  5. Click the "Next" button.
  6. Name the stack; "HTCondorLeaseImplementation" is a good name.
  7. Click the "Next" button. (You don't need to change anything on the options screen.)
  8. Check the box next to "I acknowledge" (down near the bottom) and click the "Create" button (where the "Next" button was).
  9. AWS should return you to the list of stacks; select the one you just created and then select the "Outputs" tab.
  10. Copy the long string labelled "LeaseFunctionARN"; you'll need it to configure condor_annex. It may take some time for that string to appear (you may need to reload the page as well). Wait for the stack to enter the 'CREATE_COMPLETE' state before using the LeaseFunctionARN (see below).
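
For readers who prefer the command line, the same value can be read from the JSON printed by `aws cloudformation describe-stacks --stack-name HTCondorLeaseImplementation`. This assumes the AWS CLI is installed and configured, which this guide doesn't otherwise require. The Python sketch below parses that JSON and returns an output only once the stack has reached CREATE_COMPLETE; stack_output is a hypothetical helper name.

```python
import json
from typing import Optional


def stack_output(describe_stacks_json: str, key: str) -> Optional[str]:
    """Return the named stack output, or None until the stack is CREATE_COMPLETE."""
    stack = json.loads(describe_stacks_json)["Stacks"][0]
    if stack.get("StackStatus") != "CREATE_COMPLETE":
        return None
    # Outputs is a list of {"OutputKey": ..., "OutputValue": ...} entries.
    for output in stack.get("Outputs", []):
        if output.get("OutputKey") == key:
            return output.get("OutputValue")
    return None
```

Calling stack_output(text, "LeaseFunctionARN") on the saved command output yields the string to copy, or None if the stack isn't ready yet.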

Prepare the dynamic configuration machinery

For the same reason, you'll have to create a role for the annex instances, so they (but nobody else) can access the private S3 bucket. [FIXME: This should probably just be a CF parameter?] Use the generate-role script, distributed FIXME, to create a CloudFormation template:

$ generate-role privateBucketName config.tar.gz > role.json

Create a stack by uploading role.json, but otherwise following the instructions from the previous section; the stack's output will be named "InstanceConfigurationProfile", and you'll need its value later.

Create Spot Fleet role

If the account you're using has never created a Spot Fleet, create one now:

  1. Switch to the region. (The second drop-down box in from the upper right.)
  2. Switch to EC2. (In the Services menu, under Compute.)
  3. Click on "Spot Requests" in the list on the left (under INSTANCES).
  4. Click the "Request Spot Instances" button.
  5. Click the "Next" button.
  6. [FIXME: automagic creating the IAM Fleet Role].

Create Security Group

You'll also need a security group that allows HTCondor (and SSH, just in case) through the firewall:

  1. Click on "Security Groups" (under NETWORK & SECURITY).
  2. Click the "Create Security Group" button.
  3. Enter a name; "HTCondorAnnexSG" is a good one.
  4. Enter a description; "Allows SSH and HTCondor from anywhere" would be accurate.
  5. Make sure that the "Inbound" tab (the default) is selected.
  6. Click the "Add rule" button. Change the left-most drop-down box from "Custom TCP Rule" to "SSH"; change the right-most drop-down box from "Custom" to "Anywhere". (This is less secure than it could be, but these instructions don't have room for a discussion about that.)
  7. Click the "Add rule" button again. This time, change the second text box to read "9618" (its column is labelled "Port Range"); also change the right-most drop-down box from "Custom" to "Anywhere".
  8. Click the "Create" button.
  9. You'll now see a list of security groups. The second column is the group name; find the name you entered when you created the group ("HTCondorAnnexSG") and record its security group ID (the column to the left).
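
The two inbound rules created above can also be written down as data, which makes them easier to review. This Python sketch builds them in the IpPermissions JSON shape used by the EC2 API (for example, by `aws ec2 authorize-security-group-ingress --ip-permissions`). The helper name inbound_rules is hypothetical, and 0.0.0.0/0 is assumed here as the IPv4 equivalent of the console's "Anywhere".

```python
import json


def inbound_rules(htcondor_port: int = 9618) -> list:
    """Build SSH and HTCondor ingress rules open to any IPv4 address."""

    def tcp_rule(port: int) -> dict:
        # A single-port TCP rule: FromPort and ToPort are the same.
        return {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }

    return [tcp_rule(22), tcp_rule(htcondor_port)]


if __name__ == "__main__":
    print(json.dumps(inbound_rules(), indent=2))
```

If the administrator opened a port other than 9618, pass it as htcondor_port so the security group matches your SHARED_PORT_PORT setting.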

Configure condor_annex

These instructions use an image published by the HTCondor developers to help people get started. The image's OS is Amazon Linux (based on CentOS 6). The example config.json [FIXME: Where?] uses that image and generates slots with 1 CPU and 2 GB of RAM using whatever instance type(s) happen to be cheapest at the time. If you want to tweak those values, read the AWS docs: http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_SpotFleetRequestConfigData.html. The example config.json bids the on-demand prices for its instance types. See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html for more information about Spot prices and bidding.

[We think we can automate the process of configuring HTCondor and its image down to just installing a special "EC2" RPM, but the last attempt had a bug.]

Add the following lines to the HTCondor configuration:

LOCAL_DIR/condor_config.local
ANNEX_DEFAULT_ACCESS_KEY_FILE = <path-to-access-key-file>
ANNEX_DEFAULT_SECRET_KEY_FILE = <path-to-secret-key-file>
ANNEX_DEFAULT_SPOT_FLEET_CONFIG_FILE = <path-to-default-config-file>

ANNEX_DEFAULT_LEASE_ARN = <LeaseFunctionARN>
ANNEX_DEFAULT_IAM_FLEET_ROLE_ARN = <IamFleetRoleARN>
ANNEX_DEFAULT_SECURITY_GROUP_ID = <SecurityGroupID>
ANNEX_DEFAULT_INSTANCE_ROLE_ARN = <InstanceRoleARN>
[This file is a lie. However, it shouldn't be hard to implement. It's just not wise, as written, in the long term.]
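
Since several values have to be copied in from earlier steps, a quick check for leftover placeholders may save some debugging. The Python sketch below is an assumption-laden helper, not part of HTCondor: the names parse_config and unfinished_settings are hypothetical, and the parser handles only simple `NAME = value` lines, not the full HTCondor configuration syntax.

```python
REQUIRED_SETTINGS = (
    "ANNEX_DEFAULT_ACCESS_KEY_FILE",
    "ANNEX_DEFAULT_SECRET_KEY_FILE",
    "ANNEX_DEFAULT_SPOT_FLEET_CONFIG_FILE",
    "ANNEX_DEFAULT_LEASE_ARN",
    "ANNEX_DEFAULT_IAM_FLEET_ROLE_ARN",
    "ANNEX_DEFAULT_SECURITY_GROUP_ID",
    "ANNEX_DEFAULT_INSTANCE_ROLE_ARN",
)


def parse_config(text: str) -> dict:
    """Parse simple `NAME = value` lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def unfinished_settings(text: str) -> list:
    """List required settings that are missing or still hold a <placeholder>."""
    settings = parse_config(text)
    problems = []
    for name in REQUIRED_SETTINGS:
        value = settings.get(name, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(name)
    return problems
```

Passing the contents of condor_config.local to unfinished_settings returns the names that still need attention; an empty list means every setting above has a value.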