Ticket #2096: Condor should rebuild environment post-gLexec

gLexec is invoked by one UID (the glexecer), maps a new UID based on the proxy, and changes to a different UID (the glexecee).

gLexec itself is known to strip out the environment. For glideins with

    JOB_INHERITS_STARTER_ENVIRONMENT = True

Condor should rebuild the environment, in effect passing the glexecer environment down to the glexecee.

Remarks:

2011-Apr-26 16:29:25 by burt:
Actually, Igor thinks it's only the PATH environment variable that is not getting properly rebuilt and the rest of the environment might be ok.


2011-May-23 14:58:36 by danb:
Do we know if there is a good reason for glexec to mess with PATH? If yes, perhaps we should think twice? If no, can glexec be fixed?

I'm not trying to be lazy here (yet). I'm just curious what the thinking is.


2011-May-23 15:25:13 by danb (for Igor):
We have actually implemented a workaround that passes the PATH in a different env variable and restores it in the job wrapper. So, from a practical point of view, we are happy with propagating the PATH.

But if any security expert can tell us why this is bad, I would be interested to hear it.
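
The workaround described in this remark (carry PATH through glexec under another name, then restore it in the job wrapper) might look roughly like the sketch below. It is only an illustration: the variable name `SAVED_ORIGINAL_PATH` and both helper functions are hypothetical, since the ticket does not say which name or wrapper code the glidein actually uses.

```python
# Hypothetical side-channel variable name; not taken from the ticket.
SAVED_PATH_VAR = "SAVED_ORIGINAL_PATH"


def stash_path(env):
    """Before glexec runs: copy PATH into a variable glexec does not set."""
    out = dict(env)
    if "PATH" in out:
        out[SAVED_PATH_VAR] = out["PATH"]
    return out


def restore_path(env):
    """In the job wrapper: move the stashed value back into PATH."""
    out = dict(env)
    saved = out.pop(SAVED_PATH_VAR, None)
    if saved is not None:
        out["PATH"] = saved
    return out
```

If the stashed variable is missing, `restore_path` leaves PATH as it found it, so the wrapper stays harmless when glexec is not in use.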


2011-May-23 15:29:00 by danb:
Should we approach the glexec developers and ask them to make this change? Implementing a hack in Condor to work around this seems appropriate if necessary. I'm just still hoping it's not necessary.


2011-May-23 16:00:00 by danb (for Igor):
I would be surprised if it was a glexec problem. glexec already destroys all the environment.

Condor then reconstructs it. The question is: why does it reconstruct everything but PATH?


2011-Jul-19 17:07:50 by danb:
I discovered today that when condor reconstructs the job environment, it overrides all variables in the job environment with any variables set by glexec. PATH is one of the variables set by glexec. That explains the behavior you observed.

I am inclined to reverse the order of precedence. If there is a conflict between intended job environment and environment set by glexec, we should use the job environment.
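
To make the precedence question concrete, here is a minimal sketch of the two merge orders. It is not Condor's actual code: `merge_env` is a hypothetical helper, and the job's PATH and `MY_JOB_VAR` are made-up values. The glexec PATH is the one reported in this ticket for glexec 0.6.8-3.

```python
def merge_env(job_env, glexec_env, job_wins):
    """Merge two environments; on a name conflict one side's value is kept."""
    if job_wins:
        # Proposed order: start from glexec's variables, then let the job overwrite.
        return {**glexec_env, **job_env}
    # Observed order: start from the job's variables, then let glexec overwrite.
    return {**job_env, **glexec_env}


# Hypothetical job environment, for illustration only.
job_env = {"PATH": "/opt/job/bin:/usr/bin", "MY_JOB_VAR": "1"}
# Two of the variables glexec sets, as reported in this ticket.
glexec_env = {"PATH": "/usr/local/bin:/usr/bin:/bin", "LCAS_LOG_LEVEL": "1"}

observed = merge_env(job_env, glexec_env, job_wins=False)
proposed = merge_env(job_env, glexec_env, job_wins=True)
```

In `observed` the job loses its PATH to glexec's value; in `proposed` the job keeps it. Variables that only one side sets survive in both cases.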

I am a little unsure about whether to make this change in the stable series or not. It could be considered a behavioral change or a bug fix. I have to make this change at least for X509_USER_PROXY in the stable series in order to be able to have proxy updates take effect when using glexec 0.7+. I suppose we could be conservative and just treat PATH specially as well, rather than changing the precedence of all variables.

With glexec 0.6.8-3, I see the following environment from within a process launched by glexec:

PATH=/usr/local/bin:/usr/bin:/bin
GLEXEC_CLIENT_CERT=/tmp/x509up_u275
HOME=/osghome/osg_cmsuser01
LCMAPS_LOG_LEVEL=1
LCMAPS_DEBUG_LEVEL=0
LCMAPS_DB_FILE=/etc/glexec/lcmaps/lcmaps-suexec.db
LCMAPS_LOG_FILE=/var/log/glexec/lcas_lcmaps.log
LCAS_LOG_LEVEL=1
LCAS_DEBUG_LEVEL=0
LCAS_DB_FILE=/etc/glexec/lcas/lcas-suexec.db
LCAS_LOG_FILE=/var/log/glexec/lcas_lcmaps.log
JOB_REPOSITORY_ID=2011-07-19.16:35:00-22437
LCMAPS_VERIFY_TYPE=uid_pgid
LCMAPS_POLICY_NAME=glexec_get_account
LCMAPS_LOG_STRING=2011-07-19.16:35:00-22437-glexec_get_accounts
LCAS_LOG_STRING=2011-07-19.16:35:00-22437-glexec_get_accounts

I believe newer versions of glexec also set LOGNAME, USER, and X509_USER_PROXY.

The only problem I can think of, if any of these environment variables were overwritten by variables set by Condor or by the job ClassAd, is that nested invocations of glexec would not overwrite the LCAS/LCMAPS/JOB_REPOSITORY_ID variables. After the first invocation, these variables would all be frozen.


2011-Aug-18 15:01:26 by danb:
I talked with both Jim Kupsch and Zach. They agreed to the plan. In the stable series (7.6.4), we make a special case to preserve the job's PATH. In the development series (7.7.2), whenever there is any conflict, job environment trumps environment inherited from glexec.
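
The two-series plan can be sketched as a single merge routine with a switch. This is an illustration under assumptions, not the checked-in change: `build_job_env` and its `protected` parameter are hypothetical names, and the job values in the example are invented.

```python
def build_job_env(job_env, glexec_env, protected=None):
    """Combine the intended job environment with the one inherited from glexec.

    protected=None     : the job wins every conflict (the 7.7.2 plan).
    protected={"PATH"} : glexec wins conflicts except for the listed names
                         (the 7.6.4 plan, which special-cases PATH).
    """
    merged = dict(job_env)
    for name, value in glexec_env.items():
        conflict = name in job_env
        if conflict and (protected is None or name in protected):
            continue  # keep the job's value
        merged[name] = value
    return merged


# Hypothetical job values; the glexec values are the ones reported in this ticket.
job_env = {"PATH": "/opt/job/bin", "HOME": "/home/job"}
glexec_env = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "HOME": "/osghome/osg_cmsuser01",
    "LCAS_LOG_LEVEL": "1",
}

stable = build_job_env(job_env, glexec_env, protected={"PATH"})
devel = build_job_env(job_env, glexec_env)
```

With these inputs, `stable` keeps the job's PATH but takes HOME from glexec, while `devel` keeps both of the job's values. Non-conflicting glexec variables pass through either way.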


2011-Aug-22 12:01:26 by zmiller:
Didn't get to the review yet; will have it done by this Wednesday. Changing due date.


2011-Aug-29 15:58:53 by zmiller:
CODE REVIEW LOOKS GOOD, RESOLVING

Properties:

Type:            defect
Status:          resolved
Fixed Version:   v070604
Broken Version:  v070600
Priority:
Subsystem:       Unknown
Assigned To:     zmiller
Derived From:
Creator:         burt
Rust:
Customer Group:  cms
Visibility:      public
Notify:          sfiligoi@fnal.gov, burt@fnal.gov, dan@hep.wisc.edu
Due Date:        20110824
Created:         2011-Apr-26 16:04
Last Change:     2011-Aug-29 15:59

Related Check-ins:

2011-Aug-18 14:48   Check-in [26871]: Documented fix for PATH getting cleared by glexec. ===GT=== #2096 (By Dan Bradley)
2011-Aug-18 14:41   Check-in [26869]: Job environment now trumps glexec environment. ===GT:Fixed=== #2096 ===VersionHistory=== All Complete (By Dan Bradley)
2011-Aug-18 14:41   Check-in [26870]: Documented fix for job environment trumping glexec environment. #2096 (By Dan Bradley)
2011-Aug-18 12:47   Check-in [26855]: Prevent glexec from overriding the job's PATH environment variable. ===GT:Fixed=== #2096 ===VersionHistory=== All Complete (By Dan Bradley)