Standard Universe
Notes from Peter Keller's 2011-11-30 brain-dump.
libsyscall
The standard universe requires 'condorized' binaries: "statically-linked" binaries (see below for the standard universe's definition of that term) created by condor_compile, which, despite its name, is a linker rather than a compiler. It creates binaries with the API stack shown on the right.
    normal            condorized
---------------   ------------------
| application |   | application    |
|             |   |   |------------|
|             |   |   | libsyscall |
---------------   |------------|   |
|    libc     |   |    libc    |   |
---------------   ------------------
|   kernel    |   |     kernel     |
---------------   ------------------
When starting a standard universe job, the standard-universe-specific shadow establishes a connection with the startd. This connection is inherited by the standard-universe-specific starter, which in turn bequeaths it to the application (specifically, libsyscall). The starter's connection is used for startup, shutdown, and suspend/resume. All other communication to the shadow comes from libsyscall.
There may be machinery in libsyscall to make it signal-atomic.
"Statically-Linked"
Because the OS-provided information on (in-memory) segment boundaries is frequently (but inconsistently) wrong, the checkpointer must guess them. Because the checkpointer runs in a signal handler, recovering from a segfault caused by a wrong guess is impractically tricky; therefore, its guesses must be correct. To make this possible, a standard universe application must keep all its volatile data on the stack or in the .data segment, and it must be laid out in memory as follows.
---------------
| environment |
|-------------| <- guessed; may be an unmapped page here
|  the stack  |
|             |
|--\/\/\/\/\/-| <- guessed
|             |
|-/\/\/\/\/\--| <- sbrk
|    .data    |
|-------------|
| .bss&.bzero |
|-------------| <- _data_start (linker symbol)
|    .text    |
---------------
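For comparison with the guessed boundaries in the layout above, the kernel's own view of a process is available in /proc/<pid>/maps on Linux. The following is a minimal Python sketch, not part of Condor, that pulls the bracketed regions out of such a listing. The function name named_regions and the sample addresses are made up for illustration; as noted above, the checkpointer does not trust this information.

```python
import re

# One /proc/<pid>/maps line: "start-end perms offset dev inode [pathname]"
MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def named_regions(maps_text):
    """Return {name: (start, end)} for bracketed pseudo-regions such as
    [heap], [stack] and [vdso], as reported by the kernel."""
    regions = {}
    for line in maps_text.splitlines():
        m = MAPS_LINE.match(line.strip())
        if not m:
            continue
        start, end, _perms, name = m.groups()
        if name.startswith("[") and name.endswith("]"):
            regions[name] = (int(start, 16), int(end, 16))
    return regions


# Hypothetical listing for a statically linked program (addresses invented).
SAMPLE_MAPS = """\
00400000-004b6000 r-xp 00000000 08:01 1234 /home/user/a.out
006b6000-006bc000 rw-p 000b6000 08:01 1234 /home/user/a.out
01c3e000-01c61000 rw-p 00000000 00:00 0 [heap]
7ffd1c5e2000-7ffd1c603000 rw-p 00000000 00:00 0 [stack]
7ffd1c7d4000-7ffd1c7d6000 r-xp 00000000 00:00 0 [vdso]
"""
```

On a Linux machine, named_regions(open("/proc/self/maps").read()) gives the same kind of dictionary for the running interpreter.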
These general requirements have some specific consequences:
No calls to mmap().
Because the kernel lies about segment boundaries, we can't permit calls to mmap(). (With one exception: requests for anonymous segments can be satisfied by sbrk().) We can't just intercept calls to mmap() and record the results, because the dynamic loader (ld.so) has its own mmap() implementation. This implies the following.
No dynamic linking.
Obvious, but this also rules out calls to dlopen(). This implies patches to Ulrich Drepper's glibc, because it likes to dlopen() libnss (to get different resolvers). The patches allow glibc to be built entirely statically, but necessitate maintaining our own copy.
The gcc and g++ runtimes must also be static.
This is implied by the above, but worth calling out because it can require you to recompile the compiler. Many distributions now offer (optional) packages that include the static runtimes, which avoids the rebuild.
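One rough way to check the "no dynamic linking" requirement on a finished binary is to look at its ELF program headers: a binary that asks for a dynamic loader carries a PT_INTERP entry. The Python sketch below is an illustration under stated assumptions, not something condor_compile is known to do. It handles only 64-bit little-endian ELF, the helper names are hypothetical, and it cannot detect a dlopen() call made at run time.

```python
import struct

PT_DYNAMIC = 2  # dynamic linking information
PT_INTERP = 3   # path of the program interpreter (the dynamic loader)


def program_header_types(elf_bytes):
    """Return the set of p_type values found in a little-endian ELF64 image."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("sketch handles only 64-bit little-endian ELF")
    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 54)
    types = set()
    for i in range(e_phnum):
        # p_type is the first 4-byte field of every program header.
        (p_type,) = struct.unpack_from("<I", elf_bytes, e_phoff + i * e_phentsize)
        types.add(p_type)
    return types


def requests_dynamic_loader(elf_bytes):
    """True if the image names a program interpreter (PT_INTERP)."""
    return PT_INTERP in program_header_types(elf_bytes)


def fake_elf64(p_types):
    """Build a header-only ELF64 image for experiments (not a runnable binary)."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"                       # magic, ELFCLASS64, LSB
    struct.pack_into("<Q", header, 32, 64)               # e_phoff: right after header
    struct.pack_into("<HH", header, 54, 56, len(p_types))  # e_phentsize, e_phnum
    body = b"".join(struct.pack("<I", t) + bytes(52) for t in p_types)
    return bytes(header) + body
```

For a real file, requests_dynamic_loader(open("a.out", "rb").read()) applies the same check. PT_DYNAMIC on its own is not conclusive, since static position-independent executables carry it too, which is why the sketch keys on PT_INTERP.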
"VM compat" layout.
As mentioned above, the application must be laid out in memory in a specific way; in particular, the .bss (and/or .bzero) segments must (directly) abut the .data segment. This used to always be true; hence the 'compat' in the personality name.
No VA randomization or ExecShield.
Both virtual address randomization and ExecShield (which used to be but no longer are the same thing, apparently) disrupt the required in-memory layout and cannot be used.
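On Linux, both the legacy layout and the absence of address randomization can be requested per process through personality flags. The Python sketch below decodes the two relevant flags from a personality value. It assumes that the "VM compat" personality mentioned above corresponds to ADDR_COMPAT_LAYOUT, which these notes do not state, and it says nothing about how Condor itself sets the flags or about ExecShield. The function names are hypothetical.

```python
import ctypes

# Flag values from <linux/personality.h>.
ADDR_NO_RANDOMIZE = 0x0040000   # disable address-space randomization
ADDR_COMPAT_LAYOUT = 0x0200000  # use the legacy virtual-memory layout


def layout_flags(persona):
    """Name the layout-related flags that are set in a personality value."""
    names = []
    if persona & ADDR_NO_RANDOMIZE:
        names.append("ADDR_NO_RANDOMIZE")
    if persona & ADDR_COMPAT_LAYOUT:
        names.append("ADDR_COMPAT_LAYOUT")
    return names


def current_personality():
    """Query, without changing, this process's personality (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    # Passing 0xffffffff asks the kernel to report the current value.
    return libc.personality(0xFFFFFFFF)
```

Calling layout_flags(current_personality()) in a process started with setarch's -R option should include ADDR_NO_RANDOMIZE; the system-wide randomization setting lives in /proc/sys/kernel/randomize_va_space.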
Consistent .vdso section address.
The .vdso segment, used by Linux to speed up system calls, is not checkpointed (it is kernel-provided memory) but must be mapped at exactly the same address on restart.
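The restart condition above can be expressed as a simple comparison of where [vdso] was mapped when the checkpoint was taken and where it is mapped now. The following Python sketch works on /proc/<pid>/maps text captured at both times. The function names and sample addresses are invented for illustration, and this is not how the Condor checkpointer is known to perform the check.

```python
def vdso_start(maps_text):
    """Return the start address of the [vdso] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            start_hex, _end_hex = fields[0].split("-")
            return int(start_hex, 16)
    return None


def vdso_matches(checkpoint_maps, restart_maps):
    """True only if both listings map [vdso] at the same start address."""
    before = vdso_start(checkpoint_maps)
    after = vdso_start(restart_maps)
    return before is not None and before == after


# Hypothetical listings (addresses invented).
AT_CHECKPOINT = "7ffff7fcd000-7ffff7fcf000 r-xp 00000000 00:00 0 [vdso]\n"
SAME_ON_RESTART = "7ffff7fcd000-7ffff7fcf000 r-xp 00000000 00:00 0 [vdso]\n"
MOVED_ON_RESTART = "7ffd1c7d4000-7ffd1c7d6000 r-xp 00000000 00:00 0 [vdso]\n"
```

With randomization enabled, the address typically differs between runs, which is one reason the previous requirement matters.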
RPCs