| The VAX C and DEC C preprocessors are different.
The VAX C preprocessor was more character oriented. It
predated the ANSI/ISO C standard, which required a "token"
oriented preprocessor, the tokens being the lexical tokens of
the C language. This is what DEC C implements.
A character oriented preprocessor like the VAX C preprocessor
was sometimes suitable for preprocessing text that was not a C
program. A token oriented preprocessor that conforms to the C
standard is never suitable for preprocessing anything other
than C programs. [Some readers of this conference may
disagree.]
The DEC C preprocessor sometimes introduces whitespace between
tokens (r9 and .99r99 here) to keep from splicing them
together. For example, VAX C will preprocess A/**/B into AB,
DEC C into A B.
So my take on this is:
> Is this a bug?
The compiler team will consider whether they really need to
separate identifier tokens from preprocessor number tokens and
will get back to you on that. :-) While the extra space
doesn't seem to change the behavior of any C programs, it also
doesn't seem necessary in the contexts that I've thought of.
> If not, will he be able to use standard=vaxc in the future?
Yes, with /STANDARD=VAXC the DEC C preprocessor reverts to the
pre-standard VAX C behavior.  I know of no plans
to stop supporting /STANDARD=VAXC in the future.
Dan
|
| Yeah, it was a bug. It is now fixed in our development sources. Here's
more info if you're interested.
In an ideal world, the following two commands would be equivalent (in UNIX parlance):
cc foo.c
cc -E foo.c > bar.c ; cc bar.c
In other words, generating an explicit preprocessor output file, then
feeding that output file back into the compiler will not change what gets
compiled.
This can be hard to achieve in C/C++ because of certain peculiarities
of the lexer and preprocessor. For example, consider the following:
i = 0;
#define f(x) +
f(x)f(x)i;
If compiled without producing an output file, this should be equivalent to
i = 0;
+ + i;
which will evaluate to 0. That is, the two plus tokens are distinct
unary plus operators.
If, however, the preprocessor produces an explicit output file, it could
look like
i = 0;
++i;
If this is then fed back into a compiler, the lexer will produce a single
++ preincrement operator. It will evaluate to 1, not 0 like before.
Internally, we call this "de-facto token-pasting". Tokens that should
remain distinct get "pasted" together when written to an output file then
fed back to the compiler.
To prevent this, the compiler (in certain modes) tries to insert space
characters between tokens that it believes could be "pasted" in this way.
For your example, the compiler thought that r9 and .99r99 could be lexed
differently if fed back into the compiler. They can't be. It didn't need
to insert the space and doesn't now.
BTW, this extra space should not have appeared in common mode or vaxc mode,
just ANSI modes. Have you tried common mode? It is often more appropriate
for text processing.
Hope this helps.
\John
|
| RE .1:
> I know of no plans to stop supporting /STANDARD=VAXC in the future.
Right, there are no plans to stop supporting /STANDARD=VAXC in the future.
As JP mentioned in .2, /STANDARD=COMMON is also a good choice for
doing this kind of text processing. It's not that /STANDARD=VAXC
would go away in DEC C, just that /STANDARD=COMMON represents a
set of behaviors that is more likely to be implemented on more
platforms than /STANDARD=VAXC. For the most part, the two modes
are very similar when it comes to preprocessing. The only difference
that comes to mind (other than the predefined macros describing the
language dialect mode) is that VAXC does recursive macro substitution.
|