Ticket #2299: memory leaks in negotiator
I noticed an idiom used several times in the HGQ code to check if a parameter is defined:if (NULL != param_without_default("GROUP_QUOTA_MAX_ALLOCATION_ROUNDS"))
This leaks memory if the parameter is defined, because the return value of param_without_default() is not freed.
I suggest adding a new function param_is_defined(name). It could probably be implemented more efficiently than by calling param_without_default(), but until there is need to optimize, a simple implementation is probably best.
2011-Sep-06 15:42:51 by eje:
There is no functional change made by this fix. The negotiator should work exactly as before. I exercised the fix with the following functional test for HGQ (also attached as a tarball):
Configuration:NEGOTIATOR_DEBUG = D_FULLDEBUG SCHEDD_INTERVAL = 15 NEGOTIATOR_USE_SLOT_WEIGHTS = FALSE NUM_CPUS = 40 # turn on round-robin for this test, since we're testing # an overlapping-effective-pool scenario HFS_ROUND_ROBIN_RATE = 1.0 # turn on multiple allocation rounds, since we're testing # slot rejection scenario as well HFS_MAX_ALLOCATION_ROUNDS = 3 GROUP_NAMES = a, b, a.a, a.b, b.a, b.b GROUP_QUOTA_DYNAMIC_a = 0.5 GROUP_QUOTA_DYNAMIC_b = 0.5 GROUP_QUOTA_DYNAMIC_a.a = 0.5 GROUP_QUOTA_DYNAMIC_a.b = 0.5 GROUP_QUOTA_DYNAMIC_b.a = 0.5 GROUP_QUOTA_DYNAMIC_b.b = 0.5 # Have "a" and "b" subtrees manage their quota independently GROUP_AUTOREGROUP = TRUE GROUP_AUTOREGROUP_a = FALSE GROUP_AUTOREGROUP_b = FALSE
Submit file:universe = vanilla cmd = /bin/sleep arguments = 10m # group "a.a" should be able to use slots unused by "a.b" # set up requirements to stay out of lower slots (used by b.a and b.b) Requirements = (SlotID > 10) +AccountingGroup = "a.a.user" queue 25 # set up jobs for "a.b" to fail: Requirements = False +AccountingGroup = "a.b.user" queue 25 # set up "b.a" and "b.b" to compete Requirements = (SlotID <= 10) +AccountingGroup = "b.a.user" queue 25 +AccountingGroup = "b.b.user" queue 25
Expected negotiator results (copied from original test README)This test is designed to test behavior in two cases of negotiation failure: 1) A group cannot use all the slots it is allocated (in which case other should be able to use surplus) 2) Two (or more) groups are competing for same subset of slots, in which case neither should starve. Best to begin this by nuking your accounting log. Install "neg_failures.config" into your configuration, then start up condor. Submit "neg_failures.submit": % condor_submit neg_failures.submit Submitting job(s).................................................................................................... 100 job(s) submitted to cluster 2. You should see the following distribution: "b.a" and "b.b" are sharing their subset, and neither is starved. "a.a" is able to u se the surplus that "a.b" cannot use. (note, there are 10 of the total 40 slots unused here, by design): % qvhist AccountingGroup -c 'JobStatus == 2' 20 a.a.user 5 b.a.user 5 b.b.user 30 total
2011-Sep-23 15:25:10 by danb:
Code review: looks good. Thanks! --Dan
Type: defect Last Change: 2011-Sep-23 15:25 Status: resolved Created: 2011-Jul-13 10:11 Fixed Version: v070604 Broken Version: v070506 Priority: 4 Subsystem: Daemons Assigned To: eje Derived From: Creator: danb Rust: Customer Group: other Visibility: public Notify: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org Due Date:
|2011-Sep-06 15:31||Check-in : Fixed a memory leak from some calls to param_without_default() by introducing a nicer abstraction param_defined(). ===VersionHistory:Complete=== ===GT:Fixed=== #2299 (By Erik Erlandson )|
- neg_failures_test.tar.gz 1221 bytes added by eje on 2011-Sep-06 20:38:50 UTC.
Tarball of the test referred to in the "repro/test" section below.