Is # permitted in an object-like macro, and if so, what happens?
The C standard only defines the behaviour of # in a macro for function-like macros.
Sample code:
#include <stdio.h>
#define A X#Y
#define B(X) #X
#define C(X) B(X)
int main()
{
printf(C(A) "\n");
}
gcc outputs X#Y, suggesting that it permits # to be present and performs no special processing. However, since the definition of the # operator does not define the behaviour in this case, is it actually undefined behaviour?
As you noticed, # only has a defined effect in function-like macros. § 6.10.3.2/1 (all references to the standard are to the C11 draft (N1570)). To see what happens in object-like macros, we must look elsewhere.
A preprocessing directive of the form
# define identifier replacement-list new-linedefines an object-like macro that causes each subsequent instance of the macro name to be replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive. [...]
§ 6.10.3/9
Therefore, the only question is whether # is allowed in a replacement-list. If so, it takes part in replacement as usual.
We find the syntax in § 6.10/1:
replacement-list:
pp-tokens (opt.)
pp-tokens:
preprocessing-token
pp-tokens preprocessing-token
Now, is # a valid preprocessing-token? § 6.4/1 says:
preprocessing-token:
header-name
identifier
pp-number
character-constant
string-literal
punctuator
each non-white-space character that cannot be one of the above
It's certainly not a header-name (§ 6.4.7/1), it's not allowed in identifier tokens (§ 6.4.2.1/1), nor is it a pp-number (which is basically any number in an allowed format, § 6.4.8/1), nor a character-constant (such as u'c', § 6.4.4.4/1) or a string-literal (exactly what you'd expect, e.g. L"String", § 6.4.5/1).
However, it is listed as a punctuator in § 6.4.6/1. Therefore, it is allowed in the replacement-list of an object-like macro and will be copied verbatim. It is now subject to rescanning as described in § 6.10.3.4. Let us look at your example:
C(A) will be replaced with C(X#Y). # has no special effect here, because it is not in the replacement-list of C, but its argument. C(X#Y) is obviously turned into B(X#Y). Then B's argument is turned into a string literal via the # operator in B's replacement-list, yielding "X#Y"
Therefore, you don't have undefined behavior.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With