{section: Condor's Imake Build System Coding Guidlines and Conventions}
Revised: Tue Jun 22 14:24:56 CDT 2004
{section: Background}
I, Pete Keller, am writing this in May 2004. The Imake system has been
a painful thorn in the side of Condor since anyone could remember. This
document describes the work which I have done and a new coding style for
Imakefiles et. al. which will change our Imake system from a terrible
collection of unrewarding migraines into manageable collection of rewarding
migraines. After performing all of this work, I have come to the realization
that eventually GNU M4 should replace imake as the macro processor in our
build system. Imake's initial designers had made a terrible choice in using
a preprocessor for a nonwhitespace sensitive language, i.e., C, for a
whitespace sensitive language, i.e, Makefiles. Currently, things *should*
work for a good many number of ports, but the headaches of "keeping everything
just right" might get to much, and the evolution will continue.
The biggest problem with using imake seemed to be a fundamental lack of
understanding for how it was supposed to work in the first place. Poorly
implemented rules which "worked" on old cpps were simply copied around
and their implementation became canon. Until it stopped working.
This situation happened because GCC's cpp started conforming to the rules
that they had avoided for so long. However, many OSes vendor cpps tread
into undefined territory, so I worked hard to find a common ground,
the stuff that all cpps are supposed to implement.
The main culprits for the failure of our Imake system to be resilient and
robust comes down to two systemic problems. Serious misuse of whitespace
and the '##' concatenation operator. These two problems alone caused
the most headaches while trying to port Condor to a new compiler which
happens often and will, if it had been left alone, cause problems for
the rest of eternity.
{section: Guidelines and Rules for the Use of Imake with Condor}
{subsection: The CPP}
The build system has been changed to use the same cpp that comes
with the distribution of the target OS. This would be the same
cpp that the X Consortium source code would need to compile. On
most machines it is /lib/cpp, or even /usr/lib/cpp, etc. It
doesn't matter the revision of the cpp since the rest of the
guildlines explain how to write generalized preprocessor macros.
There is a new script, ansi_cpp, in the imake/ directory which figures
out which cpp to use for each architecture. On some architectures,
we are forced to use the vendor compiler, and so that must be installed.
On other architectures, there are a lot of cpp programs to choose from
and we must choose the right one. At all times, I've tried to stay
canon to what the X Consortium code does to compile X.
{subsection: TAB characters}
TAB characters passed through cpp and then imake are preserved in all
cases except an errant gcc 3.1 revision. This errant version was fixed
by hacking our imake code to invoke the cpp differently.
{subsection: #endif Comments}
Comments on the same line as the #endif, but after it, must be C style
comments, not bare text.
{subsection: #define [thing] [stuff]}
[thing] MUST be a true C Identifier:
_CORRECT_:
=#define IS_I386_LINUX_RH9 YES=
_INCORRECT_:
=#define IS_I386_LINUX-RH9 YES=
{subsection: Generated Makefile Comments}
If you'd like comments in the generated Makefile, please use XCOMM in
the Imakefile like this:
_XCOMM This is a comment in the generated Makefile_
XCOMM is an imake program built in.
{subsection: Macro Names}
Macro names must not unintentionally appear in any XCOMM comments as
they can be possibly expanded by various revisions of cpps.
{subsection: Macro Definitions}
There can be no whitespace in the parameter list of a macro
definition. Also, no whitespace around the interior of the
parenthesis, which is a common idiom in Condor C/C++ code.
Here is an example:
_CORRECT_: =#define Concat(x,y)x##y=
_INCORRECT_: =#define Concat( x, y )x##y=
In addition, there should be no spaces after the closing parenthesis
and the start of the nonwhitespace text--especially in a one
line macro rule:
_CORRECT_: =#define copy(x,y)cp x y=
_INCORRECT_: =#define copy(x,y) cp x y=
If you have a multiline rule where you use @@\ to end the line(but
continue the rule to the next line),
any whitespace from the @@\ to the previous nonwhitespace character
on that line is undefined for its preservation. Do not rely
on the existance of that particular sort of whitespace. In practice,
however, no rules rely on this and it would be out of the ordinary
if a rule does.
{subsection: Rule Invocation in an Imakefile}
There must be NO WHITESPACE around the open/closing parenthesis and
commas in rule invocation since this whitespace is undefined in its
preservation between cpps. However, whitespace could exist in the
invocation rule if it is separating items within a parameter:
_CORRECT_:
=program_target(thingy,$(OBJS) foo.o bar.o,$(LIBS))=
Notice the whitespace in the second parameter, whitespace of this
specific kind is valid.
_INCORRECT_:
=program_target( thingy, $(OBJS) foo.o bar.o, $(LIBS) )=
In addition, you may NOT use the line continuation character:
='\'= anywhere in the line when invoking a rule:
_INCORRECT_:
{code}program_target(thingy,$(OBJS) foo.o bar.o,\
$(LIBS))
{endcode}
Different pre-processors preserve or don't preserve the whitespace
up to the =$(LIBS)= lexeme and that can cause trouble
depending how your rules are layed out. Since it COULD be possible to
use the ='\'= character in some places and not others validly,
I'm striking out the use of it altogether so we don't have to
worry about the details of when you can or can't use it.
{subsection: Use of the concatenation operator: ##}
This operator is the main reason for the problems we've experienced
using the Imake system. People had been using ##
incorrectly in the sense that they were compressing
out whitespace preserved by cpp during incorrect Rule
Invocations but, in doing so, produced invalid C identifier
tokens with respect to the usage of ## according to the
specifications in the ANSI C manual.
Since using ## to produce a non-C token was "undefined"
according to the specification, many, but not all, cpp
programmers honored the pasting together of two lexemes
for the reason of whitespace removal. But ultimately,
it was bad behavior and since the specification has
been tightened to reject arbitrary lexeme pastings by
many modern cpp versions.
If you do need to use ##, then use one of the =Concat*()=
rules which exist for this purpose that are found
in Global.h. These rules embody special behavior for
certain cpp programs where ## can be replaced with /**/
because ## wasn't supported correctly or even available
at all in the cpp.
Since we now FORCE, by the creation of this document, that ANY and
ALL rule invocations must not have any extraneous whitespace, we can
consign the use of ## back to where it is supposed to be, in the
=Concat*()= rules.
This is the _ONLY VALID USE _
of the ## operator:
c_identifier1##c_identifer2 -> c_identifier1c_identifier2
These are examples of _INCORRECT_ uses of the
## operator:
{code}
=path##/##to##/##some##/##dir##=
=directory##.static=
=##make_target##: stuff things other=
=perl condor_scripts/make_both_tarball -cmd "$(TAR_CMD)" strip##, name##, files##, contribfiles=
=DEPEND_C_SRC := $(##obj_list##:.o=.c)=
{endcode}
Now, here is the important thing:
*YOU WILL NEVER NEED TO USE THE ## OPERATOR EVER! USE ONE OF THE
=Concat*()= RULES INSTEAD AND ONLY WHEN PRODUCING A TRUE
C IDENTIFIER.*
{subsection: =#= and Preprocessor Directives }
When using preprocessor directives like =#define=, etc,
please be sure to place the =#= character in the
FIRST column of the text window. However, there can be
whitespace between the hash mark and the preprocessor
directive like this:
{code}
# define FOO BAR
{endcode}
{section: Conventions in the Imake Build Files}
{subsection: Macro Namespace}
The name of a macro can only be used during the invocation of a macro.
For example, suppose we have a macro called
=BUILD(x,y)=. Now in an Imakefile rule, we have a
cleanup rule for a target which looks like this, =rm
-rf BUILD=--which is completely seperate from the
invocation of the macro rule =BUILD(x,y)=. In
this case the cleanup rule is an illegal use of the
macro =BUILD(x,y)=.
In short, do not mix the namespaces of the macro names and the
entities the generated Makefile will manage.
{subsection: Code Reuse}
Suppose in a rule you'd like to have _static appended to
a directory name in multiple places or something
similar. Instead of using Concat(name,_static) everywhere,
write a simple rule like this (toward the beginning of
the Imake.rules file):
=#define sufstatic(x)Concat(x,_static)=
And now whenever you want foo_static, you write sufstatic(foo) instead.
This will greatly help in producing maintainable and defensive code.
{subsection: Writing New Imake Rules}
New rules should be written in this sort of style:
{code}
#ifndef release_target
#define release_target(file,dir,mode) @@\
XCOMM Begin translation of func(release_target) @@\
$(RELEASE_DIR)/dir/file: file @@\
/bin/rm -f $(RELEASE_DIR)/dir/file @@\
cp file $(RELEASE_DIR)/dir @@\
chmod mode $(RELEASE_DIR)/dir/file @@\
release:: $(RELEASE_DIR)/dir/file @@\
XCOMM End translation of func(release_target)
#endif /* release_target */
{endcode}
Notice the XCOMM comments dictating that we are beginning
and ending the translation of this particular function. The
macro =func()= is defined in Imake.rules which
expands(even in the comment) the given function name into a nice piece
of text in the Makefile which describes the line of translation of the
associated function in the original Imakefile. While these are not
required for the rule to function, I *STRONGLY
ENCOURAGE* you to follow it. Obviously, for very simple rules
like the =sufstatic()= rule above, you don't need it, but
for anything more complicated you should strive to put it in.
{subsection: Converting older style Imake rules to New Imake Rules}
If you see any function (simple or complex)
that does not follow the conventions outlined in this document then
update the function to follow the guidlines. This is more of
a concern while merging as these changes propogate through the various
branches. But if you are editing a source file which
looks like it is adhering to these conventions, then assume it is
adhering to these conventions and change it accordingly.
{subsection: The Imake processor is the C Preprocessor:}
The usual caveats to this applies, e.g.:
If you need to do something like this:
=#define cat(x,y)x##y=
and you want to use it like this: =cat(cat(1,2),3)= it will
do the wrong thing and produce =cat(1,2)3=. If you want to do
it right, then you'd need an additional rule like this:
=#define xcat(x,y)cat(x,y)=
which would then produce the correct concatenation sequence.
Remember that section A.12 in the K&R C book applies.
In the case of the Condor build system,
_this specific example is already implemented_
in the =Concat*()= rules written in
=Global.h=.