Title: | DEC C Problem Reporting Forum |
Notice: | Report DEC C++ problems in TURRIS::C_PLUS_PLUS |
Moderator: | CXXC::REPETE TCHEON |
Created: | Fri Nov 13 1992 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 1299 |
Total number of notes: | 6249 |
1291.0. "When isdigit() not a digit? When it is É." by CARDHU::HILL (Mike Hill, Zuerich Switzerland) Wed Apr 30 1997 15:17
I have a problem with DECC on OpenVMS V6.2. The isdigit() macro does not
do what I expect in certain cases. The ctype.h definition of isdigit()
along with all other is*() routines does not mask off the high order bits
in certain cases (depends on various #defines).
One definition of is*() in ctype.h does this, the other does not.
The DEC C V5.5-003 compiler on OpenVMS Alpha V6.2-1H3 has this problem.
The program follows:
#include stdio
#include ctype
main()
{
printf("%d\n",isdigit('É'));
}
The decimal value of 'É' is 201. Since plain char is signed here, the C compiler uses
the value -55 (201 - 256).
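The arithmetic can be checked with a small sketch. It assumes 8-bit, two's-complement chars and passes the numeric value -55 directly instead of the character literal, so it does not depend on the source character set. The helper names index_by_cast and index_by_mask are made up for illustration and are not part of ctype.h.

```c
/* Hypothetical helpers, not part of any DEC C header. */

/* Converting to unsigned char reduces the value modulo 256
   (for 8-bit chars), so -55 becomes 201. */
int index_by_cast(signed char c)
{
    return (unsigned char)c;
}

/* Masking the promoted int gives the same result on
   two's-complement machines. */
int index_by_mask(signed char c)
{
    return c & 0xff;
}
```

Both index_by_cast(-55) and index_by_mask(-55) give 201 under those assumptions, while values already in 0..127 are returned unchanged.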
The .I file shows:
main()
{
printf("%d\n",((decc$$gl___ctypea)?(*decc$$ga___ctypet)
    [(int)('É')]&0x4:isdigit('É')));
}      ^    ^
       |    |
       |    +-- To work, we need (('É')&0xff) here
       |
       +------- or (unsigned char) here
Looking at ctype.h (taken from SYS$SHARE:DECC$RTLDEF.TLB) it looks like the
following definition is not taking signed chars into account:
# if ( ( defined(__DECC) || defined(__DECCXX) ) && !defined(__VAXC) )
# define __ISFUNCTION(c,p) \
(__ctypea?__ctypet[(int)(c)]&__##p:__IS##p(c))
# else
# define __ISFUNCTION(c,p) \
(__ctypea?__ctypet[(int)(c)]&__/**/p:__IS/**/p*
# endif
# define isdigit(c) __ISFUNCTION(c, DIGIT)
Instead of (c) in both lines above, I would like to see ((c)&0xff) or instead
of (int) I would hope for (unsigned char).
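A minimal sketch of the kind of definition being asked for, using a made-up 256-entry table rather than the real RTL data. The names my_ctype_table, MY_DIGIT, my_ctype_init and my_isdigit are hypothetical; only the 0x4 flag value is taken from the .I output above.

```c
#define MY_DIGIT 0x4   /* flag bit; 0x4 is the mask seen in the .I file */

/* Hypothetical classification table; static storage starts out all zero. */
static unsigned char my_ctype_table[256];

/* Mark the entries for '0'..'9'; every other entry stays zero. */
void my_ctype_init(void)
{
    int ch;
    for (ch = '0'; ch <= '9'; ch++)
        my_ctype_table[ch] = MY_DIGIT;
}

/* Masking keeps the index in 0..255 even when c is a negative char. */
#define my_isdigit(c) (my_ctype_table[(c) & 0xff] & MY_DIGIT)
```

One side effect of masking is that EOF (typically -1) lands on the entry for 255, so the two can no longer be told apart inside the macro.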
Looking again at ctype.h, the other definition of isdigit() looks like:
# define isdigit(c) (__ctype [(c) & 0xFF] & _D)
                                  ^
                                  |
                     Note masking to fix -ve chars
On machines where bytes are signed, with this ctype.h, the program won't work
as I would like. On machines with unsigned bytes it will (or if the ctype.h
#defines pick the other definition).
I have read DECC_BUGS note 1125, and think that answer .2 implies that this is
expected, and the &0xff will be done for all cases. Maybe it was missed in
this specific case, or maybe it is intentional.
Another 'workaround' would be to have 128 dummy table entries before the
__ctypet[] table where all are set to zero. That would at least not require
the ctype.h to change.
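The padded-table idea could look roughly like the sketch below. It is only an illustration with invented names (padded_table, table_center, padded_init, padded_isdigit), not the RTL's actual layout, and it assumes 8-bit chars so that a negative char is never below -128.

```c
#define PAD_DIGIT 0x4

/* 128 leading entries for indices -128..-1, then 256 for 0..255.
   Static storage starts out all zero. */
static unsigned char padded_table[128 + 256];

/* Points at the entry for index 0, so negative indices stay in bounds. */
static unsigned char *const table_center = padded_table + 128;

/* Mark the entries for '0'..'9'; the padding stays zero. */
void padded_init(void)
{
    int ch;
    for (ch = '0'; ch <= '9'; ch++)
        table_center[ch] = PAD_DIGIT;
}

/* Same shape as the header's lookup: a plain (int) cast, no masking. */
#define padded_isdigit(c) (table_center[(int)(c)] & PAD_DIGIT)
```

With zeros in the padding, a negative char is simply classified as nothing, and the lookup macro itself does not need to change.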
Any input would be appreciated.
[Mike.Hill]
T.R | Title | User | Personal Name | Date | Lines |
---|
1291.1 | | SPECXN::DERAMO | Dan D'Eramo | Wed Apr 30 1997 15:51 | 33 |
| >The isdigit() macro does not do what I expect in certain cases.
If you want your code to be portable then you need to lower
your expectations. :-) As topic 1125 said, it is up to the
programmer to make sure the argument to isdigit is either EOF
or one of { 0, 1, 2, ..., UCHAR_MAX }.
>1125.1 If the argument has any other value, the behavior is
> undefined.
That means you must not use isdigit('É') in your program,
although you can use isdigit((unsigned char)'É').
One probably runs into this problem more often after
char *s;
int i;
by trying to use isdigit(*s) or isdigit(s[i]) despite the
possibility that *s or s[i] may not be in the domain of
isdigit().
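A short sketch of the portable form of such a loop, with the conversion to unsigned char done at the call site. The function name count_digits is made up for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Count decimal digits in a string, converting each char to
   unsigned char before handing it to isdigit(). */
size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```

Because the argument is always in the range 0..UCHAR_MAX, this stays within the domain of isdigit() whether plain char is signed or unsigned.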
The DEC C team can and from time to time does change <ctype.h>
or the functions declared in it to cover for this mistake in
the code. This papers over a real problem in the code that
could be uncovered again on the next system the code is ported
to. That other vendor might not be as responsive to customer
requests as Digital is.
You might try to compile with /UNSIGNED_CHAR until the program
can be changed or a new header file is released.
Dan
|
1291.2 | | TLE::D_SMITH | Duane Smith -- DEC C RTL | Wed Apr 30 1997 16:56 | 2 |
| One could also argue that signed characters are only useful to
represent 7-bit ASCII characters, which É is not.
|
1291.3 | "Undefined" can also return "expected results" | CARDHU::HILL | Mike Hill, Zuerich Switzerland | Fri May 02 1997 03:36 | 20 |
| I fully agree with .1, but want to point out that programs which worked
fine on previous versions of VMS will now fail in unexpected ways
because different versions of isdigit() are being used.
This has caused customer dissatisfaction at an important customer, and
there is no need for this. Since it is undefined what will happen if
a character outside the allowed range is given - why not just return
the correct value? This is what all previous versions of VAXC and DECC
have done.
This could be done either by changing the isdigit() macro or by copying the 128 entries covering
values 128-255 in front of the table.
It is difficult to sell this as a coding error (although that is what
it is) because compilers on other hardware the customer has don't have
this problem.
If it will help, I'll open an IPMT offering this as a suggestion.
[Mike.Hill]
|
1291.4 | | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Fri May 02 1997 09:31 | 7 |
| > It is difficult to sell this as a coding error (although that is what
> it is) because compilers on other hardware the customer has don't have
> this problem.
Most likely that's because the other hardware defaults to unsigned char.
Your customer might prefer to compile with /UNSIGNED_CHAR on VMS, to improve
compatibility with this other hardware.
|