Why are my vm universe VMware jobs failing and being put on hold?

A vm universe VMware job may fail and be placed on hold when the path to its VMX file contains a symbolic link. An error message like the following may appear in the job’s user log:

Error from starter on master_vmuniverse_strtd@nostos.cs.wisc.edu: register(/scratch/gquinn/condor/git/CONDOR_SRC/src/condor_tests/31426/31426vmuniverse/execute/dir_31534/vmN3hylp_condor.vmx) = 1/Error: Command failed: A file was not found/(ERROR)

Can't create snapshot for vm(/scratch/gquinn/condor/git/CONDOR_SRC/src/condor_tests/31426/31426vmuniverse/execute/dir_31534/vmN3hylp_condor.vmx)

To work around this problem:

If using file transfer (the submit description file contains vmware_should_transfer_files = true), then modify the EXECUTE configuration variable on every execute machine so that its value contains no symbolic link path components.

If using a shared file system, ensure that the value of the submit description file command vmware_dir does not contain symbolic link path components.
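
To check whether a directory path, such as an EXECUTE value or a vmware_dir value, contains symbolic link components, a small script along the following lines may help. This is a minimal sketch and not part of HTCondor: the function name symlink_components is hypothetical, and it assumes a POSIX file system and that it runs on the machine where the path is used.

```python
import os


def symlink_components(path):
    """Return every leading portion of `path` that is itself a symbolic link.

    Hypothetical helper, not an HTCondor tool. An empty list means that no
    component of the path is a symbolic link.
    """
    # abspath makes the path absolute without resolving symbolic links.
    path = os.path.abspath(path)
    links = []
    current = os.path.sep
    for part in path.split(os.path.sep):
        if not part:
            continue
        current = os.path.join(current, part)
        # islink is True only when this component is a symbolic link.
        if os.path.islink(current):
            links.append(current)
    return links
```

Calling symlink_components on the directory in question returns the offending components, if any. Replacing the configured value with the output of os.path.realpath on the same path is one way to obtain an equivalent path with the symbolic links resolved.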