The C preprocessor has always been a source of portability problems, mostly because numerous programmers took advantage of the way a particular preprocessor handled something. A number of frequently used preprocessor constructs were never actually specified as part of the language; their use relies on knowledge of how the internals of the preprocessor work.
String Substitution
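String substitution in preprocessor macros is one of these areas. Consider a macro whose body contains a string literal in which the name of a parameter, say value, appears. When the macro is invoked with the argument x, some preprocessors substitute x for value inside the string as well, while others leave the string untouched and replace only the occurrences outside it.
This difference depends on how macro parameters are expanded inside character strings. The ANSI standard specifies that the latter behavior is correct, and introduces a new syntax for achieving the former behavior: prefixing the parameter with #, as in #value.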
The #value gets expanded to a quoted version of the parameter (e.g., "x"), and then the string concatenation rules take over to produce the desired result.
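As a minimal sketch of the ANSI form, the fragment below assumes a hypothetical macro named FORMAT_VAR whose parameter is called value; the macro and the helper format_x are illustrative names, not taken from the original listing.

```c
#include <stdio.h>

/* Hypothetical macro: #value becomes a string literal holding the spelling
 * of the argument, and adjacent string literals are then concatenated. */
#define FORMAT_VAR(buf, value) sprintf((buf), #value " = %d", (value))

/* Illustrative helper: writes "x = <number>" into buf (caller supplies a
 * buffer large enough for the text) and returns the number of characters. */
int format_x(char *buf, int x)
{
    return FORMAT_VAR(buf, x);  /* sprintf((buf), "x" " = %d", (x)) */
}
```

Calling format_x(buf, 7) leaves the text x = 7 in buf; the name in the output comes from the spelling of the argument, not from the parameter name value.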
Character Constants
The rule that says preprocessor tokens are not replaced inside character strings also applies to character constants. A frequent construct in pre-ANSI C is:
#define CTRL(c) (037 & 'c')
This macro was intended to produce the control-character version of a regular character, so that CTRL(L) would yield CTRL-L. Unfortunately, in ANSI C this will not work: the c inside 'c' is not replaced, so every invocation expands to (037 & 'c') regardless of its argument. The simplest way to avoid this problem is to define the macro slightly differently:
#define CTRL(c) (037 & (c))
This macro is then called as CTRL('L').
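The portable form can be checked with a short sketch. It assumes an ASCII character set, and the helper control_code is a hypothetical name used only to exercise the macro.

```c
/* The argument is parenthesized so that a whole expression is masked. */
#define CTRL(c) (037 & (c))

/* Illustrative helper: maps a letter to its control-character code. */
int control_code(int letter)
{
    return CTRL(letter);
}
```

With ASCII, CTRL('L') evaluates to 014 (form feed) and CTRL('G') to 7 (bell).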
Token Pasting
One of the features of some preprocessors is that they allow “token pasting.” This has never been a documented behavior, but is used frequently. With a token-pasting preprocessor, there are at least two ways to combine two tokens:
#define self(a) a
#define glue(a,b) a/**/b
self(x)1
glue(x,1)
Both of these are intended to produce a single token, x1. In ANSI C, however, each produces two separate tokens, x and 1.
The ANSI C standard defines a new syntax for token pasting:
#define glue(a, b) a ## b
Since ## is now a legitimate operator, programmers have much more freedom in the use of white space in both the definition and invocation of token pasting macros.
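A minimal sketch of the ANSI syntax in use follows; the function pasted_value and the variable x1 are hypothetical names chosen for illustration.

```c
#define glue(a, b) a ## b

/* Illustrative helper: both invocations below expand to the identifier x1,
 * regardless of the white space around the arguments. */
int pasted_value(void)
{
    int x1 = 21;
    return glue(x, 1) + glue( x , 1 );
}
```

Because both invocations name the same variable, pasted_value() returns twice the value stored in x1.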
The #elif Directive
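The ANSI C preprocessor now provides a directive #elif that may be used in conjunction with #if, #ifdef, #ifndef, and #endif. It takes the place of an #else followed by a nested #if, so a chain of alternatives needs only one closing #endif.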
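A small sketch of an #elif chain is shown here; LEVEL and LEVEL_NAME are hypothetical configuration macros defined only for the example.

```c
/* LEVEL is a hypothetical setting; normally it would come from the
 * compiler command line or a configuration header. */
#define LEVEL 2

#if LEVEL == 1
#define LEVEL_NAME "low"
#elif LEVEL == 2
#define LEVEL_NAME "medium"
#else
#define LEVEL_NAME "high"
#endif

/* Illustrative helper: returns the name selected at compile time. */
const char *level_name(void)
{
    return LEVEL_NAME;
}
```

Without #elif, the second test would have to be written as a nested #if inside the #else branch, with its own #endif.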
The #error Directive
The ANSI C preprocessor provides #error, a directive that makes the compiler issue a diagnostic containing the message given as its argument; in practice, compilation then stops. This allows code of the form:
#if defined(BSD)
... BSD stuff ...
#elif defined(SYSV)
... System V stuff ...
#else
#error "One of BSD or SYSV must be defined."
#endif
Predefined Symbols
All preprocessors offer the predefined symbols __FILE__ (the current source file as a quoted string) and __LINE__ (the current line number as an integer). The ANSI C standard has added __DATE__ and __TIME__, which give the current date and time (as of when the program was compiled) as quoted strings.
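The four symbols can be exercised with a short sketch; the function names below are hypothetical and exist only to return each value.

```c
/* Each function returns the value the preprocessor substituted at the
 * point where the symbol appears in this source file. */
const char *source_file(void)
{
    return __FILE__;
}

int source_line(void)
{
    return __LINE__;
}

/* __DATE__ has the form "Mmm dd yyyy" and __TIME__ the form "hh:mm:ss";
 * adjacent string literals are concatenated into one 20-character string. */
const char *build_stamp(void)
{
    return __DATE__ " " __TIME__;
}
```

A typical use is in diagnostic messages, where __FILE__ and __LINE__ identify the location that reported a problem.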
The constant __STDC__ is defined as 1 in compilers that are compliant with ANSI C. This can be used to test whether or not ANSI C features may be used:
#ifdef __STDC__
... ANSI stuff ...
#else
... Non-ANSI stuff ...
#endif
Note
In the ANSI standard, the only defined value for __STDC__ is 1. If it is defined to any other value, the meaning is undefined. Unfortunately, the standard is somewhat ambiguous on this point.
This is a problem on SVR4, where AT&T uses __STDC__ with a value of 0 to enable certain ANSI C features outside of a strictly ANSI C-compliant environment. This means that the test above for an ANSI environment no longer works; it must be rewritten as:
#if __STDC__ == 1
... ANSI stuff ...
#else
... Non-ANSI stuff ...
#endif
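One common use of this test is to hide function prototypes from pre-ANSI compilers. The sketch below assumes a hypothetical PROTO macro and a hypothetical function add; the defined() check is redundant in a conforming preprocessor, where an undefined name evaluates to 0, but it avoids warnings about undefined macros from some compilers.

```c
#if defined(__STDC__) && __STDC__ == 1
#define PROTO(args) args        /* ANSI: keep the parameter list */
#else
#define PROTO(args) ()          /* pre-ANSI: declare with empty parentheses */
#endif

/* The double parentheses pass the whole parameter list as one argument. */
int add PROTO((int a, int b));

#if defined(__STDC__) && __STDC__ == 1
int add(int a, int b)
#else
int add(a, b)
    int a;
    int b;
#endif
{
    return a + b;
}
```

Either branch yields the same function, so callers such as add(2, 3) are unaffected by which form the compiler saw.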
Text After #else and #endif
Most preprocessors have always allowed constructs like:
#ifdef FOO
·
·
·
#else FOO
·
·
·
#endif FOO
However, this has never been strictly legal, since #else and #endif are not supposed to have arguments. In ANSI C this syntax is now expressly forbidden (although most compilers will just print a warning and accept it); the trailing name should be moved into a comment, as in #else /* FOO */ and #endif /* FOO */.
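A minimal sketch of the conventional fix follows. FOO is defined here only so that the fragment compiles on its own, and BRANCH and branch_taken are hypothetical names.

```c
#define FOO 1   /* defined only so the example has a branch to take */

#ifdef FOO
#define BRANCH 1
#else /* FOO */
#define BRANCH 0
#endif /* FOO */

/* Illustrative helper: reports which branch was compiled. */
int branch_taken(void)
{
    return BRANCH;
}
```

The comments keep the documentary value of the old style, showing which condition each directive closes, while leaving nothing but white space and comments after #else and #endif.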