Mark Coatsworth's Braindump
Contact: mark.coatsworth@gmail.com
HTCondor Week
The vast majority of information needed for HTCondor Week planning is in the following document: https://docs.google.com/document/d/16NUrbKkC9AVuysaLBkIYn0XZBP9aSw3ubseNaneKH8I/edit#
You'll need access to the following resources:
- Google Drive: https://drive.google.com/drive/folders/0AFWs9GYd6arpUk9PVA
- This contains all the documents used for previous iterations of HTCondor Week.
- Especially useful: each folder contains a document called HTCondor Week 20## Email Announcements with all the copy used that year. Much of this can be reused.
- Indico Site: https://agenda.hep.wisc.edu/category/34/
- Every year we make a new event called HTCondor Week 20##.
- Anyone publishing information here needs Manager access. Currently only Mark Coatsworth, Jason Patton and Todd Tannenbaum are managers.
- Registration page: all registrations and payments come in via https://charge.wisc.edu/condor/register.aspx
- Updates and changes to this page are managed by Eric Simons (eric.simons@wisc.edu)
- Backend interface to see registration data: https://charge.wisc.edu/wisccharge/index.aspx
- Mailing Lists: all email announcements should get sent to the following lists. The Google doc mentioned at the top contains the admin passwords for these lists.
- htcondor-users@cs.wisc.edu
- htcondor-friends@cs.wisc.edu (https://lists.cs.wisc.edu/mailman/admin/htcondor-friends/members/list)
- htcondor-world@cs.wisc.edu (https://lists.cs.wisc.edu/mailman/admin/htcondor-world/members/list)
- Additionally, we maintain a mailing list at htcondor-week@cs.wisc.edu for all incoming email related to HTCondor Week (https://lists.cs.wisc.edu/mailman/admin/htcondor-week/members/list)
- Surveys: https://uwmadison.co1.qualtrics.com/homepage/ui
PATh Metrics
- All information related to PATh metrics reporting is on a Wiki Page: https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=PrivatePathMetricsReporting
DAGMan
- DAGMan recently shipped two major updates:
- All MyString instances have been replaced with std::string
- If this results in any bugs, there is no workaround. You'll need to find the bug and fix it. It will help to look at an older version of the code to see what the MyStrings were doing (e.g. https://github.com/htcondor/htcondor/tree/V9_5_0-branch)
- DAGMan now submits jobs by writing them directly to the condor_schedd, instead of forking a condor_submit process.
- If this results in any bugs, you can revert to the old behavior by setting DAGMAN_USE_DIRECT_SUBMIT = False
- Otherwise, DAGMan is in a steady state. There aren't any big projects in active development.
- Some more detailed notes are available in the following document: https://docs.google.com/document/d/1f0vc2DTJvH8wQLa87gcFHy0Tpd86cyTweD2-GmxfD8o/edit
LIGO
- Outstanding Jira tickets:
- Better client tool support for flocked environments (HTCONDOR-197)
- This is an umbrella ticket for a bunch of requests related to providing better monitoring tools for jobs running in flocked environments. Some requests are straightforward but others look very complex.
- Since LIGO uses DAGMan for all their workflows, we figured a good starting point was adding DAGMan integration to the htcondor CLI tool (HTCONDOR-929)
- Assist LIGO in transition from shared file systems to HTCondor file transfer (HTCONDOR-475)
- This work is blocked, waiting on better support from condor_ssh_to_job.
- gwdata_plugin is a file transfer plugin used to retrieve files referenced by LIGO's gwdatafind service.
- Source code is here: https://github.com/htcondor/htcondor/blob/master/src/condor_scripts/gwdata_plugin.py
- LIGO's gwdatafind client tool: https://git.ligo.org/computing/gwdatafind/client
- The plugin needs to run on an execution point with python3 and the classad package installed.
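The prerequisite above can be checked before debugging a failed transfer. The following is a minimal sketch, not part of gwdata_plugin itself: the function name check_plugin_prereqs and its parameters are hypothetical, and it only tests that the interpreter is Python 3 and that the classad module can be found on the import path.

```python
import importlib.util
import sys


def check_plugin_prereqs(required_modules=("classad",), min_python=(3, 0)):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: it only looks up each module on the import path,
    it does not import or exercise the module.
    """
    problems = []
    # Compare only (major, minor) against the required minimum version.
    if tuple(sys.version_info[:2]) < tuple(min_python):
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (tuple(min_python) + tuple(sys.version_info[:2]))
        )
    # find_spec returns None when a top-level module cannot be located.
    for name in required_modules:
        if importlib.util.find_spec(name) is None:
            problems.append("missing Python module: %s" % name)
    return problems


if __name__ == "__main__":
    issues = check_plugin_prereqs()
    for issue in issues:
        print(issue)
    if not issues:
        print("prerequisites satisfied")
```

Running this with the same python3 interpreter the execution point would use for the plugin should show whether the classad package is visible to it.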
MathWorks
- I've been working on a Docker container which allows us to use the MATLAB runtime anywhere on CHTC.
- Source files for this Docker container are in: moria.cs.wisc.edu:/nobackup/coatsworth/docker/matlab-chtc
- A pre-built image is available here: dockerreg.chtc.wisc.edu/markcoatsworth/matlab-chtc
- I've set up a MATLAB Parallel Server environment on CHTC at the following location: submit5.chtc.wisc.edu:/home/coatsworth/MATLAB/R2021b/bin
- To use this environment, you need to remove sudo access from your account so that you can ssh into submit5 without going through Duo 2FA.
- A general brain dump document from all my calls with MathWorks is here: https://docs.google.com/document/d/18ELXUnpN2SaUazJp0OecguzmeCUqzKgx6Kz5-9T9seQ/edit
Pegasus
- I've been working on a Docker container which allows us to submit Pegasus workflows from CHTC.
- Source files for this container are in: moria.cs.wisc.edu:/nobackup/coatsworth/docker/pegasus-chtc
- There is currently one major limitation with this container. It can successfully submit Condor jobs via the condor_schedd running on the host machine, but the Pegasus binaries needed to run workflows are not available on the host. We currently have no solution for this. A couple of possible approaches:
- Make it a requirement that host machines have the Pegasus binaries available somewhere.
- Add an entrypoint.sh file to the Docker container that copies the files to the user's home directory.
- A Google Doc explaining how to install Pegasus directly on CHTC access points and submit workflows is available here: https://docs.google.com/document/d/1xnKA3BSOx3QunLyPjNj1onXZ-opYMj4C_vKg7yHJwEs/edit
Swag
- Stickers: /p/condor/public/swag