Student Project Ideas
This page contains an unsorted list of brainstormed ideas that could be good candidates for student hourly work.
This search may turn up other tasks
- I think it will be useful to start a document that records wisdom/insight about the context in which expressions are evaluated in Condor. A good assignment for a student would be to go through the expressions we have and record the context in which they are evaluated. -Miron 9/21/09
- #173 Have a vanilla job "flavor" for running Excel as a job on Win32. (talk to Todd or Ben if you don't understand this) (Current Status).
- #617 Use getopt() everywhere to parse creepy command line options consistently! Whoot!
- Go through all the Coverity bugs and fix them (roughly 1200 bugs at last count)
- Add support for standard universe to the new starter/shadow, then dump the old one. This will remove all kinds of inconsistencies between vanilla and standard universe, and speed up future development, as we won't have to add code in multiple places. (NOTE: perhaps this isn't such a great idea anymore if DMTCP adoption goes as hoped)
- There are a number of cases where condor_q -analyze doesn't know why jobs can't run (the negotiation cycle hasn't happened yet, job retirement is in progress, parallel universe jobs, etc.). It would be nice to add smarts to -analyze to catch these cases.
- Add stress tests to the Condor test suite. It would be great if we had tests that stressed the individual daemons, without trying to run jobs through the system. This way, we'd find memory leaks and other embarrassing problems before our users did.
- #596, #597 Currently, CondorView is difficult to install and maintain because it is written in Perl, Java, JavaScript, and C++. I believe we could implement something much better in JavaScript plus SOAP calls to the collector, and have it installed by default with Condor. It would just work with a normal Condor install, without any tinkering, installing Apache, or downloading jar files.
- Make metronome scalable. There's been a bunch of times where a test fails intermittently. We've got plenty of machines. I'd love to be able to submit 100 test runs overnight and look at the results in the morning. Currently, metronome can't do this.
- Put more effort into writing tests, including unit tests (i.e., more people than just Bill).
- Cool/interesting Job Sandbox transfer plugins: BitTorrent, gridftp, ...
- Work on the Stork supplement work.
- Help wrapping new classads w/ old classad API, and/or make a shared library.
- Add private-to-private proxying in the CCB broker.
- Explore use of Parrot w/ Condor.
- Explore a secure Condor setup by default, with use of MiniCA and/or online CA.
- Super CHTC turnkey package for installation on Windoze machines around campus, using combo of VirtualBox + Condor, in collaboration with Marquette Univ and/or Purdue.
- Quill to MySQL (or SQLite?)
- Test Condor when it runs out of disk space.
- Get a distcc setup going, put some abstractions into Metronome so you can automatically use it
- There are some memory-testing scripts running on GLOW using Hawkeye/startd cron. Include those in the default install of Condor
- Do a clipped port to Solaris 10/x86
- Prototype a 'condor_job_backup' command (not sure you can finish this in a few weeks)
- A .NET/managed code universe. Run the same job on the CLR on Win32 or on Mono (or think real hard about how to abstract the Java universe, .NET universe, Python universe, etc.).
- Replace the checkpoint server protocol with HTTP GETs and PUTs
- Create a couple of "Condor Technical Articles", in the 5-10 page range. Some ideas:
  - A document exploring all of the issues involved in running Condor in a root squash environment. Document all of the different user IDs Condor uses and what it does while running as each of those UIDs.
  - A document that explores Condor and DNS: figure out all of the places Condor requires a working DNS, look at where it doesn't need it, and what you can override.
  Don't be afraid to start at square one with these documents, and explain what root squash is or what reverse DNS is.
- #152 move git to pinguino
- #653: Report operating system, distribution, and version in world ads - Lots of straightforward work detecting different distributions.
- #789: Grind into Coverity defect reports - Lots of straightforward work reaching for low-hanging fruit.
- Whenever a user reports a problem or a bug, there's a standard bunch of questions we have: what are your log files, configuration, condor_q -analyze output, etc. The VDT has a nice script which automatically discovers a bunch of stuff about a system and generates a report the user can mail in with their bug report. We should do something similar for Condor, and ask the VDT people if we can integrate our script into theirs, or if we can share parts of their script.
- Remove code related to CVS and the defunct git repositories (CONDOR_EXT, CONDOR_TEST_LRG, CONDOR_TEST_CNFDTL) from Condor's NMI build scripts.
- Seamless network migration for virtual machines.
  If a virtual machine is using networking and it gets checkpointed and migrated, its network connections can be lost. Likewise, allowing inbound connections is a tricky problem with virtual machines. This is an early-stage design proposal I have. It consists of two parts: a homing daemon, and tap interfaces on execute hosts.
  - The homing daemon is basically a Condor job that runs as a NAT. These Condor jobs can execute only on select hosts (as set up by the pool admin), perhaps marked by a specific attribute. (Problem: what if the homing daemon is checkpointed and migrated? Oops!)
  - On the execute host for virtual machines, the NIC of the virtual machine should be connected to a tap interface on the host. The starter sets up a tunnel connecting this tap interface and the homing daemon.
  - When a virtual machine job is matched, the negotiator fires a homing daemon job (unless one's already running), and passes the configuration to the starter or adds it to the ClassAd.
  - Load balancing can be done by starting more homing daemons when more VM jobs are encountered.
  - A DHCP server could also be run as part of the homing daemon. MAC addresses could be made part of the VM universe job. Thus, if a VM job checkpoints and migrates, it will get the same IP when it comes up again.
  - The homing daemon could just run as a switch, and VM universe jobs could specify a VLAN. In this case, homing daemons can be used to achieve intranetworking for VM universe jobs.
(vmathew)
- Create virtual machine add-ons (guest additions) for the standard VM hypervisors (VMware, VirtualBox, etc.) to communicate with the Condor instance on the execute host. A condor-guest-addition.iso CD image would install a service inside the guest (or maybe it's okay to require users to compile and install it), so we need support for common guest operating systems.
Maybe, if we can use these guest additions to allow VMs to communicate with the outside world via the Condor API, then we may not need to support more than NAT networking. This depends fully on how rich the guest additions can make networking.
(vmathew)
- Currently, the architecture of the submit machine is taken as the required architecture for VM universe jobs. However, if I create a 32-bit VM on a 64-bit machine, that VM can just as well run on a 32-bit machine with VMware. Yet the architecture requirement prevents such matches. This needs to be fixed.
(vmathew)
- I see a lot of D_ALWAYS used in the VM universe code. Evaluate these and change them to something appropriate. Maybe create a debugging mode for the VM universe? (vmathew)
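The architecture-matching item above (VM universe jobs inheriting the submit machine's architecture) could be approached by deriving the requirement from the guest's architecture instead. The following is only a rough Python sketch of that idea, not how Condor does or should implement it. The function name `arch_requirement` and the lookup table are hypothetical, and it assumes execute machines advertise `Arch` as either "INTEL" (32-bit x86) or "X86_64".

```python
# Hypothetical sketch: build the architecture requirement for a VM universe
# job from the guest's architecture rather than the submit machine's.

# Assumption: execute machines advertise Arch as "INTEL" (32-bit x86) or
# "X86_64". As a simplification, a 64-bit guest is only matched to 64-bit
# hosts here.
HOST_ARCHS_FOR_GUEST = {
    "INTEL": ["INTEL", "X86_64"],  # a 32-bit guest can run on either host type
    "X86_64": ["X86_64"],          # a 64-bit guest needs a 64-bit host
}


def arch_requirement(guest_arch):
    """Return a ClassAd-style expression matching every host that can run the guest."""
    hosts = HOST_ARCHS_FOR_GUEST.get(guest_arch)
    if hosts is None:
        raise ValueError("unknown guest architecture: %r" % guest_arch)
    clauses = ['Arch == "%s"' % arch for arch in hosts]
    return "(" + " || ".join(clauses) + ")"


if __name__ == "__main__":
    # A 32-bit guest created on a 64-bit submit machine.
    print(arch_requirement("INTEL"))
    # A 64-bit guest.
    print(arch_requirement("X86_64"))
```

With a mapping like this, a 32-bit VM built on a 64-bit submit machine would no longer be restricted to 64-bit execute hosts. How the guest architecture would be discovered (declared by the user, or detected from the VM image) is left open here.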