This guide assumes that you already have an AWS account, as well as a log-in account on a Linux machine with a public address and a system administrator who's willing to open a port for you. All the terminal commands (shown on a grey background) and file edits (shown on a green background) take place on the Linux machine. You can perform the web-based steps from wherever is convenient, although it will save you some copying if you can run a browser on the Linux machine.
Before using condor_annex for the first time, you'll have to do three things:
- install a personal Condor
- prepare your AWS account
- configure condor_annex
Instructions for each follow.
Install a personal Condor
We recommend that you install a personal condor to make use of condor_annex; it's simpler to configure that way.
These instructions assume that it's OK to create a directory named condor-8.7.0 in your home directory; adjust accordingly if you want to install HTCondor somewhere else.
Start by downloading the 8.7.0 release from the "tarballs" section that matches your Linux version. (If you don't know your Linux version, ask your system administrator.) These instructions assume that the file you downloaded is located in your home directory on the Linux machine, so copy it there if necessary.
Then do the following:
$ mkdir ~/condor-8.7.0; cd ~/condor-8.7.0; mkdir local
$ tar -z -x -f ~/condor-8.7.0-*-stripped.tar.gz
$ ./condor-8.7.0-*-stripped/condor_install --local-dir `pwd`/local --make-personal-condor
$ . ./condor.sh
$ condor_master
Testing 
Give HTCondor a few seconds to spin up and then try a few commands to make sure the basics are working. Your output will vary depending on the time of day, the name of your Linux machine, and its core count, but it should look similar to the following.
$ condor_q
-- Schedd: submit-3.batlab.org : <127.0.0.1:12815?... @ 02/03/17 13:57:35
OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
$ condor_status -any
MyType             TargetType         Name

Negotiator         None               NEGOTIATOR
Collector          None               Personal Condor at 127.0.0.1@submit-3.bat
Machine            Job                slot1@submit-3.batlab.org
Machine            Job                slot2@submit-3.batlab.org
Machine            Job                slot3@submit-3.batlab.org
Machine            Job                slot4@submit-3.batlab.org
Machine            Job                slot5@submit-3.batlab.org
Machine            Job                slot6@submit-3.batlab.org
Machine            Job                slot7@submit-3.batlab.org
Machine            Job                slot8@submit-3.batlab.org
Scheduler          None               submit-3.batlab.org
DaemonMaster       None               submit-3.batlab.org
Accounting         none               <none>
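If you run these checks from a script rather than by hand, the daemons may not be up yet on the first try. The following is a minimal sketch of a retry helper; the function name retry and the RETRY_DELAY variable are made up for this example, and it assumes the command you pass exits non-zero until HTCondor is reachable.

```shell
# retry MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have failed,
# sleeping RETRY_DELAY seconds (default 2) between attempts.
# (Hypothetical helper; not part of HTCondor.)
retry() {
    max_tries=$1
    shift
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-2}"
    done
}

# Example (needs the personal Condor started above):
#   retry 10 condor_status -any
```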
You should also try to submit a job. Create the following file (make the ~/condor-annex directory first if it doesn't exist yet):
~/condor-annex/sleep.submit
executable = /bin/sleep
arguments = 600
queue
and submit it:
$ condor_submit ~/condor-annex/sleep.submit
Submitting job(s).
1 job(s) submitted to cluster 1.
$ condor_reschedule
After a little while:
$ condor_q
-- Schedd: submit-3.batlab.org : <127.0.0.1:12815?... @ 02/03/17 13:57:35
OWNER    BATCH_NAME       SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
tlmiller CMD: /bin/sleep 2/3  13:56      _      1      _      1 3.0

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Configure public interface
The default personal Condor uses the "loopback" interface, which basically just means it won't talk to anyone other than itself.  For condor_annex to work, your personal condor needs to use the Linux machine's public interface.  In most cases, that's as simple as adding the following lines to ~/condor-8.7.0/local/condor_config.local.
~/condor-8.7.0/local/condor_config.local
NETWORK_INTERFACE = *
CONDOR_HOST = $(FULL_HOSTNAME)
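Several steps in this guide ask you to add lines to condor_config.local. If you'd rather do that from the shell than in an editor, here is a minimal sketch that appends a line only when it isn't already present, so repeating a step doesn't duplicate settings. The helper name add_config_line is made up for this example.

```shell
# add_config_line FILE LINE
# Appends LINE to FILE unless an identical line is already there.
# (Hypothetical helper; not part of HTCondor.)
add_config_line() {
    file=$1
    line=$2
    # -x: match whole lines only; -F: treat LINE as a fixed string, not a regex.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (path assumes the install location used in this guide):
#   cfg=~/condor-8.7.0/local/condor_config.local
#   add_config_line "$cfg" 'NETWORK_INTERFACE = *'
#   add_config_line "$cfg" 'CONDOR_HOST = $(FULL_HOSTNAME)'
```

The single quotes in the example matter: they keep the shell from trying to interpret $(FULL_HOSTNAME), which is meant for HTCondor.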
Restart HTCondor to force the changes to take effect:
$ condor_restart
Sent "Restart" command to local master
Repeat the steps under "Testing" to make sure that this configuration works for you, and then proceed to the next section.
Configure a pool password
In this section, you'll configure your personal Condor to use a pool password. This is a simple but effective method of securing Condor's communications to AWS.
Add the following lines to ~/condor-8.7.0/local/condor_config.local.
~/condor-8.7.0/local/condor_config.local
SEC_PASSWORD_FILE = $(LOCAL_DIR)/condor_pool_password
SEC_DAEMON_INTEGRITY = REQUIRED
SEC_DAEMON_AUTHENTICATION = REQUIRED
SEC_DAEMON_AUTHENTICATION_METHODS = PASSWORD
SEC_NEGOTIATOR_INTEGRITY = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION = REQUIRED
SEC_NEGOTIATOR_AUTHENTICATION_METHODS = PASSWORD
SEC_CLIENT_AUTHENTICATION_METHODS = FS, PASSWORD
ALLOW_DAEMON = condor_pool@*
You also need to run the following command, which prompts you to enter a password:
$ condor_store_cred -c add -f `condor_config_val SEC_PASSWORD_FILE`
Enter password:
Enter a password.
(For more details, see HowToEnablePoolPassword.)
Tell HTCondor about the open port
By default, HTCondor will use port 9618.  If the Linux machine doesn't already have HTCondor installed, and the admin is willing to open that port, then you don't have to do anything.  Otherwise, you'll need to add a line like the following to ~/condor-8.7.0/local/condor_config.local, replacing '9618' with whatever port the administrator opened for you.
~/condor-8.7.0/local/condor_config.local
COLLECTOR_HOST = $(FULL_HOSTNAME):9618
Activate the new configuration
Force HTCondor to read the new configuration by restarting it:
$ condor_restart
Prepare your AWS account
The current version of condor_annex still needs a little help from you to do its job.  There are five pieces that need to be placed in the cloud; we provide three of the pieces, but you need to put them in place for us.  (AWS will create the other two for you when you ask.)  Instructions for each of these pieces follow; don't worry if you don't know what any of them mean or do -- the instructions will explain what you need to know.
- A (private) S3 bucket
- An EC2 instance profile
- An AWS Lambda function
- A security group
- An SSH key pair
We'll be using the "us-east-1" region throughout.
Create a (private) S3 bucket
An S3 bucket is a place in the AWS cloud where you can store files.  condor_annex uses an S3 bucket to store the dynamic configuration that the instances in your annex will need.  If the bucket is private, then only you can read the files in it -- allowing your instances, and only your instances, to read those files is what the next step is for.  These two steps make it possible for condor_annex to securely share the password you entered earlier.
To create an S3 bucket, go to the S3 console; log in if you need to. Then:
- Click the "Create Bucket" button.
- Enter a name at the prompt.  Amazon makes this harder than it needs to be by requiring that the bucket name be unique across all of S3.  A name like 'annex-<username>-<year>-<month>-<day>' (where you replace <things> with their actual values) has a good chance of being unique; condor_annex does not require a particular style of name. Where you see s3PrivateBucket in the following instructions, replace it with the name you entered here.
- Select the "US Standard" region.
- Click the "Create" button.
Thankfully, a newly created bucket is private by default.
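If you'd like the shell to compose a name in the suggested 'annex-<username>-<year>-<month>-<day>' style, the following sketch does so. The function name annex_bucket_name is made up for this example. Because S3 rejects uppercase letters and underscores in bucket names, the sketch conservatively folds the user name to lowercase and replaces every other character outside a-z and 0-9 with a hyphen.

```shell
# annex_bucket_name [USERNAME]
# Prints a candidate bucket name of the form
# annex-<username>-<year>-<month>-<day>, using the current user and date
# unless USERNAME is given. (Hypothetical helper; not part of condor_annex.)
annex_bucket_name() {
    user=${1:-$(id -un)}
    # Lowercase the name, then map anything outside [a-z0-9] to '-'.
    user=$(printf '%s\n' "$user" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g')
    printf 'annex-%s-%s\n' "$user" "$(date +%Y-%m-%d)"
}

annex_bucket_name
```

The name is only a candidate: someone else may already own it, in which case the console will ask you to pick another.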
Create an EC2 instance profile
On EC2, an "instance profile" is a way to associate a "role" with an instance. A "role" is a collection of privileges that the instance would not otherwise have. Specifically, each annex instance needs to have the privilege to download its dynamic configuration from the otherwise-private S3 bucket you just created. We've created a CloudFormation template which creates the instance profile and role for you.
To create the instance profile, go to the CloudFormation console; log in if you need to. Then:
- Click the "Create Stack" button.
- Choose "Specify an Amazon S3 template URL" and enter "https://s3.amazonaws.com/condor-annex/role-6.json".
- Click the "Next" button.
- Name the stack; "HTCondorAnnexInstanceProfile" is a good name.
- Type s3PrivateBucket in the field labeled "S3BucketName".
- Click the "Next" button.
- You don't need to change anything on this (the "Options") page. Click the "Next" button.
- Check the box next to "I acknowledge" (down near the bottom) and click the "Create" button (where the "Next" button was).
- AWS will display a list of stacks; wait for the one you just created (it may be the only one) to have "CREATE_COMPLETE" in the column labelled "Status". You may need to refresh (use the circling-arrow button) to update the list.
- Select the stack you just created (click on the status rather than the name), and then select the "Outputs" tab.
- Copy the long string labelled "InstanceProfileARN"; where you see InstanceProfileARN in the following instructions, replace it with the string you copied here.
Create an AWS Lambda function
An AWS Lambda function is a way of running (usually small snippets of) code on AWS without starting an instance.  condor_annex uses this ability to ensure that the duration you specify when starting an annex is not exceeded, even if the Linux machine is no longer running when the lease expires.  We've created a CloudFormation template which creates and configures the Lambda function for you.
To create the Lambda function, go to the CloudFormation console; log in if you need to. Then (these instructions should look familiar):
- Click the "Create Stack" button.
- Choose "Specify an Amazon S3 template URL" and enter "https://s3.amazonaws.com/condor-annex/template-6.json".
- Click the "Next" button.
- Name the stack; "HTCondorAnnexLambdaFunction" is a good name.
- Click the "Next" button.
- You don't need to change anything on this (the "Options") page. Click the "Next" button.
- Check the box next to "I acknowledge" (down near the bottom) and click the "Create" button (where the "Next" button was).
- AWS will display a list of stacks; wait for the one you just created to enter the "CREATE_COMPLETE" state. You may need to refresh (use the circling-arrow button) to update the list.
- Select the stack you just created (click on the status rather than the name), and then select the "Outputs" tab.
- Copy the long string labelled "odiLeaseFunctionARN"; where you see odiLeaseFunctionARN in the following instructions, replace it with the string you copied here.
- Copy the long string labelled "sfrLeaseFunctionARN"; where you see sfrLeaseFunctionARN in the following instructions, replace it with the string you copied here.
Create a security group
On AWS, a "security group" is a set of firewall rules; it defines which machines can use the services your instance may provide. Specifically, for the annex to work, you'll need to allow (at least) the Linux machine you're using to connect to the instances' HTCondor service. (These instructions include an open port for SSH, as well, just in case that becomes necessary.) For simplicity, these instructions allow connections to the SSH and HTCondor services from anywhere. If you know the public IP address of the Linux machine you're using, you can enter that address instead of selecting "Anywhere" in those two steps (select "Custom" and type the address in the box).
To create the security group, go to the security group console; log in if you need to. Then:
- Click the "Create Security Group" button.
- Enter a name; "HTCondorAnnexSG" is a good one.
- Enter a description; "Allows SSH and HTCondor from anywhere" would be accurate.
- Make sure that the "Inbound" tab (the default) is selected.
- Click the "Add rule" button. Change the left-most drop-down box from "Custom TCP Rule" to "SSH"; change the right-most drop-down box from "Custom" to "Anywhere".
- Click the "Add rule" button again. This time, change the second text box to read "9618" (its column is labelled "Port Range"); also change the right-most drop-down box from "Custom" to "Anywhere".
- Click the "Create" button.
You'll now see a list of security groups.  The second column is the group name; find the name you entered when you created the group ("HTCondorAnnexSG") and record its security group ID (the column to the left).  Where you see SecurityGroupID in the following instructions, replace it with the ID you recorded here.
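This guide uses the web console, but the same rules can be expressed with the AWS command-line tool. The sketch below only prints the commands so you can review them; it does not run anything. It assumes you have the AWS CLI installed and configured, which nothing else in this guide requires, and the function name print_sg_commands is made up for this example. Passing a single address in CIDR form (the address followed by /32) corresponds to choosing "Custom" instead of "Anywhere" in the console.

```shell
# print_sg_commands [SOURCE_CIDR]
# Prints (does not run) AWS CLI commands roughly equivalent to the console
# steps above. SOURCE_CIDR defaults to 0.0.0.0/0 ("Anywhere"); pass
# something like 203.0.113.5/32 to allow only one address.
# SecurityGroupID is a placeholder for the ID the first command reports.
print_sg_commands() {
    cidr=${1:-0.0.0.0/0}
    echo "aws ec2 create-security-group --region us-east-1 --group-name HTCondorAnnexSG --description 'Allows SSH and HTCondor'"
    # Port 22 is SSH; port 9618 is HTCondor.
    for port in 22 9618; do
        echo "aws ec2 authorize-security-group-ingress --region us-east-1 --group-id SecurityGroupID --protocol tcp --port $port --cidr $cidr"
    done
}

print_sg_commands
```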
Create an SSH key pair
You'll only need this if something goes wrong, but in that case, you'll need it really badly. An SSH key pair allows you to log in (via SSH) to an instance started with the specified key pair. (You probably used SSH to log in to the Linux machine you've been using.)
Go to the EC2 console; log in if you need to. Then:
- Click the "Create Key Pair" button.
- Name the key pair "annex-key-pair".
- Click the "Create" button.
- Your browser will automatically prompt you about downloading a file called "annex-key-pair.pem". Save this file somewhere private.
If you transfer that file to the Linux machine, you can run ssh -i annex-key-pair.pem ec2-user@instanceAddr to log in; non-Linux SSH clients should be able to use the .pem file (and log in as 'ec2-user') as well, but those instructions are out of scope for this document.  (You can obtain an instance's address from the EC2 console.)
Configure condor_annex
Almost there!  You need to get one more pair of things from AWS before you can fully configure condor_annex.
Obtaining an Access Key
In order to use AWS, condor_annex needs a pair of security tokens (like a user name and password).  Like a user name, the "access key" is (more or less) public information; the corresponding "secret key" is like a password and must be kept a secret.  To help keep both halves secret, condor_annex (and HTCondor) are never told these keys directly; instead, you tell HTCondor which file to look in to find each one.
Create those two files now; we'll tell you how to fill them in shortly. The following is a reasonable way to do so. If you'd prefer to keep them in a different directory, change the first line accordingly.
$ cd
$ touch accessKeyFile
$ touch secretKeyFile
$ chmod 600 accessKeyFile secretKeyFile
$ pwd
The second-to-last command ensures that only you can read or write to those files.  When you see path/to/accessKeyFile or path/to/secretKeyFile in the following instructions, replace path/to with the line printed by the last command.
To download a new pair of security tokens for condor_annex to use, go to the IAM console; log in if you need to.  The following instructions assume you are logged in as a user with the privilege to create new users.  (The 'root' user for any account has this privilege; other accounts may as well.)
- Click the "Add User" button.
- Enter a name in the "User name" box; "annex-user" is a fine choice.
- Click the check box labelled "Programmatic access".
- Click the button labelled "Next: Permissions".
- Select "Attach existing policies directly".
- Type "AdministratorAccess" in the box labelled "Filter".
- Click the check box on the single line that will appear below (labelled "AdministratorAccess").
- Click the "Next: review" button (you may need to scroll down).
- Click the "Create user" button.
- From the line labelled "annex-user", copy the value in the column labelled "Access key ID" to accessKeyFile.
- On the line labelled "annex-user", click the "Show" link in the column labelled "Secret access key"; copy the revealed value to secretKeyFile.
- Hit the "Close" button.
The 'annex-user' now has full privileges to your account.  We're working on creating a CloudFormation template that will create a user with only the privileges condor_annex actually needs.
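Once you've pasted the two values in, it's worth confirming that each file actually has content and is still readable only by you. Here is a minimal sketch of such a check; the function name check_key_file is made up for this example, and it relies on the GNU version of stat found on Linux.

```shell
# check_key_file FILE
# Succeeds only if FILE exists, is non-empty, and has mode 600.
# (Hypothetical helper; not part of condor_annex.)
check_key_file() {
    f=$1
    if [ ! -s "$f" ]; then
        echo "$f: missing or empty" >&2
        return 1
    fi
    mode=$(stat -c %a "$f")   # GNU stat: print the permission bits in octal
    if [ "$mode" != "600" ]; then
        echo "$f: mode is $mode, expected 600" >&2
        return 1
    fi
}

# Example (assumes the files are in your home directory):
#   check_key_file ~/accessKeyFile && check_key_file ~/secretKeyFile
```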
Putting it all together
Add the following lines to LOCALDIR/condor_config.local (LOCALDIR is ~/condor-8.7.0/local if you installed HTCondor as described above):
LOCALDIR/condor_config.local
# The following lines are common to all accounts, and are included
# here in case you want to try using a different region.  You'll also
# have to change the default AMI, or specify a different one when
# invoking condor_annex.
ANNEX_DEFAULT_EC2_URL = https://ec2.us-east-1.amazonaws.com
ANNEX_DEFAULT_CWE_URL = https://events.us-east-1.amazonaws.com
ANNEX_DEFAULT_LAMBDA_URL = https://lambda.us-east-1.amazonaws.com
ANNEX_DEFAULT_S3_URL = https://s3.amazonaws.com

# All subsequent lines are specific to your particular account.
ANNEX_DEFAULT_ACCESS_KEY_FILE = path/to/accessKeyFile
ANNEX_DEFAULT_SECRET_KEY_FILE = path/to/secretKeyFile

# The following lines configure the image and type of on-demand instance
# that condor_annex will use if you don't specify otherwise.  Note that
# the default image ID set here is for us-east-1, so if you've changed
# the region-specific URLs above, you'll need to change the ID here, too.
ANNEX_DEFAULT_ODI_INSTANCE_TYPE = m4.large
ANNEX_DEFAULT_ODI_IMAGE_ID = ami-206ab336
ANNEX_DEFAULT_ODI_INSTANCE_PROFILE_ARN = InstanceProfileARN
ANNEX_DEFAULT_ODI_S3_CONFIG_PATH = s3PrivateBucket/config.tar.gz
ANNEX_DEFAULT_ODI_LEASE_FUNCTION_ARN = odiLeaseFunctionARN
ANNEX_DEFAULT_ODI_KEY_NAME = annex-key-pair
ANNEX_DEFAULT_ODI_SECURITY_GROUP_IDS = SecurityGroupID

ANNEX_DEFAULT_SFR_LEASE_FUNCTION_ARN = sfrLeaseFunctionARN
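It's easy to miss one of the placeholders when pasting this block. As a rough check, the sketch below lists any lines that still contain a placeholder from this guide; the function name find_placeholders is made up for this example. The match ignores case, so it catches the placeholders however they happen to be capitalized.

```shell
# find_placeholders FILE
# Prints (with line numbers) every line of FILE that still contains one of
# this guide's placeholders. Returns 0 if any were found, 1 if none were.
# (Hypothetical helper; not part of condor_annex.)
find_placeholders() {
    grep -n -i -E 'path/to/|s3PrivateBucket|instanceProfileARN|(odi|sfr)LeaseFunctionARN|securityGroupID' "$1"
}

# Example (path assumes the install location used in this guide):
#   if find_placeholders ~/condor-8.7.0/local/condor_config.local; then
#       echo "Replace the placeholders listed above before continuing."
#   fi
```

The setting names themselves (ANNEX_DEFAULT_ODI_INSTANCE_PROFILE_ARN and so on) don't trigger a match, because they contain underscores where the placeholders don't.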
You're ready to run condor_annex!  Return to HowToUseCondorAnnexWithOnDemandInstances.
