Ticket #2842: revamp job attributes, requirements for needs of partitionable slots

To know how to partition slots for jobs, each job needs to state its requirements for CPU, memory, disk, and other scalar resources explicitly. The execute nodes also need to measure the actual memory and disk usage and publish those values into the job ad so that users can learn what their jobs actually require.

The current system of using ImageSize as a proxy for memory usage is too error-prone. We need a better measure, and the execute node needs to be able to choose what that measure is. So we will define a new job ad attribute, MemoryUsage, which the starter will set to the best measure available. We anticipate that on most platforms this will result in the starter setting MemoryUsage = ResidentSetSize.
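The "best measure available" rule can be illustrated with a minimal Python sketch. This is not the starter's actual code: the function name choose_memory_usage and its parameters are hypothetical, and it assumes only that a resident set size is preferred when the platform can report one, with ImageSize as the fallback.

```python
from typing import Optional


def choose_memory_usage(resident_set_size: Optional[int],
                        image_size: Optional[int]) -> Optional[int]:
    """Pick the value to publish as MemoryUsage (hypothetical helper).

    Both inputs are assumed to be in the same units. None means the
    platform could not report that measure.
    """
    # Prefer the resident set size: it reflects memory actually in use.
    if resident_set_size is not None:
        return resident_set_size
    # Otherwise fall back to the older, less reliable ImageSize proxy.
    return image_size
```

With this rule, a job whose ImageSize is much larger than its resident set is described by the smaller, more representative number whenever that number is available.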

In addition, condor_submit needs to be changed so that it no longer automatically adds expressions testing ImageSize into the Requirements expression. See subtickets for details of the various parts of this task.

Child Tickets
TN:
Developer  #     Status    Type     Pri  Created   Changed   Target   Due Date  Days left  Title
johnkn     2968  resolved  defect   1    2012 May  2012 May  v070800                       submit should modify requirements when request_cpus is non-zero
smoler     2850  resolved  enhance  2    2012 Feb  2012 Jun  v070706                       startd expressions to enable default rules on partitionable resources
johnkn     2835  resolved  enhance  2    2012 Feb  2012 May  v070706                       revamp default job requirements esp given partitionable slots
johnkn     2856  resolved  enhance  2    2012 Mar  2012 May  v070706                       add pow function to ClassAds
johnkn     2843  resolved  enhance  3    2012 Feb  2012 Apr  v070706                       Have Starter set MemoryUsage into the job ad
tannenba   3286  new       enhance  4    2012 Oct  2013 Apr  v070906                       Booting policy and swap detection
tannenba   2849  resolved  enhance  4    2012 Feb  2012 May  v070706                       schedd should explicitly ask for partitionable resources

Remarks:

2012-Apr-05 10:48:06 by tstclair:
We've talked on a number of occasions about enumerating the test cases and verifying that everything is ok. Given that this is the parent ticket, howz about we do it here.


2012-Apr-25 11:03:55 by tstclair:
So the test case scenarios are:

state space:
000 - old submit, old schedd, old exec (doesn't matter)
001 - old submit, old schedd, new exec
010 - old submit, new schedd, old exec
011 - old submit, new schedd, new exec
100 - new submit, old schedd, old exec (not realistic)
101 - new submit(remote), old schedd, new exec
110 - new submit, new schedd, old exec
111 - all new.
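
The state space above is every old/new combination of three components, so it can be generated rather than typed by hand. The following Python sketch is only an illustration; the names COMPONENTS and enumerate_scenarios are hypothetical, and the notes in parentheses in the list above (for example "not realistic") are not reproduced.

```python
from itertools import product

# Component order matches the three-digit codes: submit, schedd, exec.
COMPONENTS = ("submit", "schedd", "exec")


def enumerate_scenarios():
    """Return (code, label) pairs for every old/new combination."""
    scenarios = []
    for bits in product((0, 1), repeat=len(COMPONENTS)):
        code = "".join(str(b) for b in bits)
        label = ", ".join(
            f"{'new' if b else 'old'} {name}"
            for b, name in zip(bits, COMPONENTS)
        )
        scenarios.append((code, label))
    return scenarios


for code, label in enumerate_scenarios():
    print(f"{code} - {label}")
```

Generating the list this way makes it easy to confirm that all eight cases are covered and to extend the matrix if another component has to be versioned.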

Properties:

Type: enhance           Last Change: 2012-May-25 12:48
Status: resolved          Created: 2012-Feb-23 13:39
Fixed Version: v070706           Broken Version: v070705 
Priority:          Subsystem:  
Assigned To: johnkn           Derived From:  
Creator: johnkn  Rust:  
Customer Group: other  Visibility: public 
Notify: johnkn@cs.wisc.edu, tannenba@cs.wisc.ed, gthain@cs.wisc.edu, matt@cs.wisc.edu, tstclair@redhat.com  Due Date:  

Derived Tickets:

#2835   revamp default job requirements esp given partitionable slots
#2843   Have Starter set MemoryUsage into the job ad
#2849   schedd should explicitly ask for partitionable resources
#2850   startd expressions to enable default rules on partitionable resources
#2856   add pow function to ClassAds
#2968   submit should modify requirements when request_cpus is non-zero
#3286   Booting policy and swap detection