-Condor 7.8.4 contains a bug that causes the negotiator daemon to write malformed entries in its =Accountantnew.log=. When the negotiator reads the =Accountantnew.log= on startup, it will abort upon seeing one of these malformed entries. The master will periodically attempt to restart the negotiator, and each time, the negotiator will abort and exit. As a result, no jobs will run in the local pool, as no matches will be made.
+*Do not upgrade from Condor 7.8.4 to 7.9.0.*
-*Do not upgrade to Condor 7.8.4!* If you have already upgraded to Condor 7.8.4, this page describes how to get your pool working again once it's hit by this bug.
+Condor 7.8.4 contains a bug that causes the negotiator daemon to write malformed entries in its =Accountantnew.log=. When a 7.8 negotiator reads the =Accountantnew.log= on startup, it ignores these entries. If any jobs complete while the negotiator wasn't running, their accumulated runtime won't be charged to their owner when calculating user priorities.
-The following sections describe several different solutions.
+But in Condor 7.9.0, the negotiator is less tolerant of malformed entries in the =Accountantnew.log=. It will abort upon seeing one. The master will periodically attempt to restart the negotiator, and each time, the negotiator will abort and exit. As a result, no jobs will run in the local pool, as no matches will be made.
-{subsection: Upgrade}
+Condor version 7.8.5 (and future releases) will not produce these malformed entries and will remove them from the =Accountantnew.log=. Condor version 7.9.1 (and future releases) will not abort when reading malformed entries and will remove them from the =Accountantnew.log=.
-Condor 7.8.5 will include a fix where the negotiator no longer produces these entries and automatically fixes them when reading the =Accountantnew.log=. Once 7.8.5 is released, you can upgrade your central manager machine. No further action will be required.
-
-{subsection: Full Downgrade}
-
-You can downgrade your entire Condor installation to version 7.8.3 or earlier. If you installed the RPM package, you can do this by running =yum downgrade condor=. You only need to downgrade your Central Manager (the machine running the condor_negotiator daemon). The other machines in your Condor pool can be left running Condor 7.8.4.
-
-After the downgrade, you will need to clean your =Accountantnew.log= of any malformed entries. One option is to simply delete the file. If you do this, Condor will lose all knowledge of past resource usage by users and user priorities set by =condor_userprio=, which is used by the negotiator when determining how many machines each user should be allocated to run their jobs.
-
-Instead of deleting =Accountantnew.log=, you can correct the malformed entries by running this command on unix:
-{code}
-% sed -i -e 's/< /</g' -e 's/ , /,/g' -e 's/ >/>/g' `condor_config_val SPOOL`/Accountantnew.log
-{endcode}
-
-{subsection: Partial Downgrade}
-
-You can downgrade just your negotiator daemon to Condor 7.8.3, which does not contain this bug, and leave the rest of your Condor installation at 7.8.4. This option is more complicated, but may be useful if you need some of the bug fixes in 7.8.4.
-
-Here are the steps you'll need to take:
-*: Download Condor 7.8.3 from the Condor downloads page (http://research.cs.wisc.edu/condor/downloads-v2/download.pl).
-
-*: Extract the 7.8.3 package
-
-*: Copy the =condor_negotiator= binary into the Condor SBIN directory with the name =condor_negotiator.783=
-
-*: Copy the =libcondor_utils_7_8_3.so= library into the Condor LIB directory
-
-*: Edit your configuration file to add this line: =NEGOTIATOR=$(SBIN)/condor_negotiator.783=
-
-*: Restart the Condor daemons
-
-*: Clean the =Accountantnew.log= as described in the previous section
-
-Here is a sample set of unix commands to use if you installed Condor from the tarball package into =/opt/condor=:
-
-{code}
-% tar xvf condor-7.8.3-x86_64_rhap_6.3-stripped.tar.gz
-% cp condor-7.8.3-x86_64_rhap_6.3-stripped/sbin/condor_negotiator /opt/condor/sbin/condor_negotiator.783
-% cp condor-7.8.3-x86_64_rhap_6.3-stripped/lib/libcondor_utils_7_8_3.so /opt/condor/lib
-% vi /opt/condor/local.foo/condor_config.local
-# Add 'NEGOTIATOR=$(SBIN)/condor_negotiator.783'
-% condor_reconfig
-% condor_off -negotiator
-% sed -i -e 's/< /</g' -e 's/ , /,/g' -e 's/ >/>/g' /opt/condor/local.foo/spool/Accountantnew.log
-% condor_on -negotiator
-% rm -rf condor-7.8.3-x86_64_rhap_6.3-stripped
-{endcode}
-
-Here is a sample set of unix commands to use if you installed Condor from the RPM package on a 64-bit machine:
-
-{code}
-% mkdir /tmp/783
-% cd /tmp/783
-% rpm2cpio /tmp/condor-7.8.3-62667.rhel6.3.x86_64.rpm | cpio -idmv
-% cp usr/sbin/condor_negotiator /usr/sbin/condor_negotiator.783
-% cp usr/lib64/condor/libcondor_utils_7_8_3.so /usr/lib64/condor
-% vi /etc/condor/condor_config.local
-# Add 'NEGOTIATOR=$(SBIN)/condor_negotiator.783'
-% condor_reconfig
-% condor_off -negotiator
-% sed -i -e 's/< /</g' -e 's/ , /,/g' -e 's/ >/>/g' /var/lib/condor/spool/Accountantnew.log
-% condor_on -negotiator
-% cd ..
-% rm -rf 783
-{endcode}
-
-Note: Once you upgrade Condor to 7.8.5 or beyond, you'll want to remove the =condor_negotiator.783= and =libcondor_utils_7_8_3.so= files, and remove the added NEGOTIATOR line from the config file.
+Thus, upgrading from Condor 7.8.4 to 7.9.0 will render your pool unable to run jobs. But any other upgrade path is unaffected.
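
The one affected upgrade path can be checked for before upgrading a central manager. The following is a minimal sketch, not part of Condor: the helper names =parse_condor_version= and =is_unsafe_upgrade= are hypothetical, and the parser assumes the version banner begins with =$CondorVersion: X.Y.Z=, as printed by =condor_version=.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical and not shipped with Condor.

# Read a version banner on stdin and print the X.Y.Z version number.
# Assumes the banner starts with "$CondorVersion: X.Y.Z"; prints nothing
# if no such line is found.
parse_condor_version() {
    sed -n 's/^\$CondorVersion: \([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed (exit status 0) only for the upgrade path described above:
# from 7.8.4 ($1, installed version) to 7.9.0 ($2, target version).
is_unsafe_upgrade() {
    [ "$1" = "7.8.4" ] && [ "$2" = "7.9.0" ]
}

# Example use on a central manager (left commented out so that the sketch
# can be sourced on a machine without Condor installed):
#   current=$(condor_version | parse_condor_version)
#   if is_unsafe_upgrade "$current" "7.9.0"; then
#       echo "Do not upgrade from 7.8.4 to 7.9.0; choose 7.8.5 or 7.9.1 instead." >&2
#   fi
```

The check compares exact version strings on purpose, since the text above names only the 7.8.4 to 7.9.0 path as affected.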