Conference turris::decc_bugs

Title:	DEC C Problem Reporting Forum
Notice:	Report DEC C++ problems in TURRIS::C_PLUS_PLUS
Moderator:	CXXC::REPETETCHEON

Created:	Fri Nov 13 1992
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	1299
Total number of notes:	6249

1291.0. "When isdigit() not a digit? When it is É." by CARDHU::HILL (Mike Hill, Zuerich Switzerland) Wed Apr 30 1997 14:17

I have a problem with DECC on OpenVMS V6.2.  The isdigit() macro does not
do what I expect in certain cases.  The ctype.h definition of isdigit()
along with all other is*() routines does not mask off the high order bits
in certain cases (depends on various #defines).

One definition of is*() in ctype.h does this, the other does not.

The DEC C V5.5-003 compiler on OpenVMS Alpha V6.2-1H3 has this problem.

The program follows:

	#include stdio
	#include ctype
	main()
	{
	printf("%d\n",isdigit('É'));
	}

The decimal value of 'É' is 201.  Since this is a char, the C compiler uses
the value -55 (signed bytes).
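
The arithmetic behind the -55 can be checked with a small sketch. The helper names are made up for illustration and are not part of DEC C or its RTL, and the code assumes 8-bit chars. It shows the signed reading of byte 201 and how a cast to unsigned char recovers the original code.

```c
#include <limits.h>

/* Hypothetical helper: the value a byte takes when it is read as a
   signed 8-bit char. The wrap is done by hand so the result does not
   depend on the compiler's out-of-range conversion rules. */
int as_signed_byte(unsigned char byte)
{
    return byte > SCHAR_MAX ? (int)byte - (UCHAR_MAX + 1) : (int)byte;
}

/* Hypothetical helper: map a possibly negative char value back to its
   0..255 character code. Conversion to unsigned char is well defined. */
int as_char_code(int c)
{
    return (unsigned char)c;
}
```

With these, as_signed_byte(201) gives -55 and as_char_code(-55) gives 201 again, which is the index the table lookup actually needs.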

The .I file shows:

	main()
	{
	printf("%d\n",((decc$$gl___ctypea)?(*decc$$ga___ctypet)
	              [(int)('É')]&0x4:isdigit('É')));
	}                ^    ^
	                 |    |
	   +-------------+    |
	   |                  |
	   |     To work, we need (('É')&0xff) here
	   |
	   +---------- or (unsigned char) here

Looking at ctype.h (taken from SYS$SHARE:DECC$RTLDEF.TLB) it looks like the
following definition is not taking signed chars into account:

	#   if ( ( defined(__DECC) || defined(__DECCXX) ) && !defined(__VAXC) )
	#      define __ISFUNCTION(c,p)  \
	              (__ctypea?__ctypet[(int)(c)]&__##p:__IS##p(c))
	#   else
	#      define __ISFUNCTION(c,p)  \
	              (__ctypea?__ctypet[(int)(c)]&__/**/p:__IS/**/p(c))
	#   endif
	#   define isdigit(c)  __ISFUNCTION(c, DIGIT)

Instead of (c) in both lines above, I would like to see ((c)&0xff) or instead
of (int) I would hope for (unsigned char).

Looking again at ctype.h, the other definition of isdigit() looks like:

	#   define isdigit(c)  (__ctype [(c) & 0xFF] & _D)
	                                        ^
	                                        |
	                             Note masking to fix -ve chars
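
The two proposed fixes can be compared with a stand-alone sketch. Everything here is hypothetical: sk_table is a stand-in for the real classification table, the SK_ names are invented, and only the 0x4 digit bit is taken from the .I output above. It is meant as an illustration of masking versus casting, not as the actual DEC C header.

```c
#define SK_DIGIT 0x4   /* digit bit as seen in the .I output; the name is made up */

static unsigned char sk_table[256];   /* stand-in table, zero-initialised */
static int sk_ready = 0;

/* Mark '0'..'9' as digits; every other entry stays zero. */
static void sk_init(void)
{
    int ch;
    for (ch = '0'; ch <= '9'; ch++)
        sk_table[ch] |= SK_DIGIT;
    sk_ready = 1;
}

/* Variant 1: mask the argument, as in the second ctype.h definition. */
int sk_isdigit_masked(int c)
{
    if (!sk_ready)
        sk_init();
    return sk_table[c & 0xFF] & SK_DIGIT;
}

/* Variant 2: cast to unsigned char instead of int. */
int sk_isdigit_cast(int c)
{
    if (!sk_ready)
        sk_init();
    return sk_table[(unsigned char)c] & SK_DIGIT;
}
```

Both variants keep the index inside 0..255, so an argument of -55 reads entry 201 rather than memory in front of the table. One side effect worth noting: EOF (typically -1) is folded onto entry 255 by either variant.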

On machines where bytes are signed, with this ctype.h, the program won't work
as I would like.  On machines with unsigned bytes it will (or if the ctype.h
#defines pick the other definition).

I have read DECC_BUGS note 1125, and think that answer .2 implies that this is
expected, and the &0xff will be done for all cases.  Maybe it was missed in
this specific case, or maybe it is intentional.

Another 'workaround' would be to have 128 dummy table entries before the
__ctypet[] table where all are set to zero.  That would at least not require
the ctype.h to change.
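
A rough sketch of that padding idea, assuming 8-bit chars and using invented names (pad_storage, pad_table, PAD_DIGIT) rather than the real RTL symbols: the table pointer is set 128 entries into a larger array, so indices -128..-1 land on zeroed padding instead of unrelated memory. The range guard is an addition of this sketch; a macro like the one in ctype.h would not have it.

```c
#define PAD_DIGIT 0x4   /* digit bit; the name is made up */
#define PAD_LOW   128   /* number of dummy entries in front of the table */

static unsigned char pad_storage[PAD_LOW + 256];   /* padding + real table */
static const unsigned char *const pad_table = pad_storage + PAD_LOW;
static int pad_ready = 0;

/* Mark '0'..'9' as digits; the padding and all other entries stay zero. */
static void pad_init(void)
{
    int ch;
    for (ch = '0'; ch <= '9'; ch++)
        pad_storage[PAD_LOW + ch] |= PAD_DIGIT;
    pad_ready = 1;
}

/* Lookup that tolerates negative char values without masking. */
int pad_isdigit(int c)
{
    if (!pad_ready)
        pad_init();
    if (c < -PAD_LOW || c > 255)   /* guard added for the sketch only */
        return 0;
    return pad_table[c] & PAD_DIGIT;
}
```

With zeroed padding every negative argument classifies as "nothing", which is enough for isdigit(). Copying entries 128-255 into the padding instead would make negative chars classify like their unsigned counterparts.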

Any input would be appreciated.

[Mike.Hill]

T.R	Title	User	Personal Name	Date	Lines
1291.1		SPECXN::DERAMO	Dan D'Eramo	`Wed Apr 30 1997 14:51`	33
	>The isdigit() macro does not do what I expect in certain cases.

	If you want your code to be portable then you need to lower your
	expectations. :-)  As topic 1125 said, it is up to the programmer to
	make sure the argument to isdigit is either EOF or one of
	{ 0, 1, 2, ..., UCHAR_MAX }.

	>1125.1  If the argument has any other value, the behavior is
	>        undefined.

	That means you must not use isdigit('É') in your program, although
	you can use isdigit((unsigned char)'É').  One probably runs into this
	problem more often after

		char *s; int i;

	by trying to use isdigit(*s) or isdigit(s[i]) despite the possibility
	that *s or s[i] may not be in the domain of isdigit().

	The DEC C team can and from time to time does change <ctype.h> or the
	functions declared in it to cover for this mistake in the code.  This
	papers over a real problem in the code that could be uncovered again
	on the next system the code is ported to.  That other vendor might not
	be as responsive to customer requests as Digital is.

	You might try to compile with /UNSIGNED_CHAR until the program can be
	changed or a new header file is released.

	Dan
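
For illustration, the portable pattern described in .1 might look like the sketch below. count_digits is a made-up helper name; the only library call is the standard isdigit() from <ctype.h>, and the cast to unsigned char is what keeps every argument inside its defined domain.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in a string. Each char is converted to
   unsigned char first, so isdigit() never sees a negative value
   even when plain char is signed. */
size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```

Calling it on a string that contains byte 201 followed by "42" should count two digits whether or not plain char is signed.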
1291.2		TLE::D_SMITH	Duane Smith -- DEC C RTL	`Wed Apr 30 1997 15:56`	2
	One could also argue that signed characters are only useful to represent
	7-bit ASCII characters, which É is not.
1291.3	"Undefined" can also return "expected results"	CARDHU::HILL	Mike Hill, Zuerich Switzerland	`Fri May 02 1997 02:36`	20
	I fully agree with .1, but want to point out that programs which worked
	fine on previous versions of VMS will now fail in unexpected ways because
	a different version of isdigit() is being used.  This has caused customer
	dissatisfaction at an important customer, and there is no need for this.

	Since it is undefined what will happen if a character outside the allowed
	range is given, why not just return the correct value?  This is what all
	previous versions of VAXC and DECC have done.  It could be done either by
	changing the isdigit() macro or by copying the 128 entries covering
	values 128-255 in front of the table.

	It is difficult to sell this as a coding error (although that is what
	it is) because compilers on other hardware the customer has don't have
	this problem.

	If it will help, I'll open an IPMT offering this as a suggestion.

	[Mike.Hill]
1291.4		WIBBIN::NOYCE	Pulling weeds, pickin' stones	`Fri May 02 1997 08:31`	7
	> It is difficult to sell this as a coding error (although that is what
	> it is) because compilers on other hardware the customer has don't have
	> this problem.

	Most likely that's because the other hardware defaults to unsigned char.
	Your customer might prefer to compile with /UNSIGNED_CHAR on VMS, to
	improve compatibility with this other hardware.
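
Whether a given build treats plain char as signed can be checked from <limits.h>. The sketch below uses a made-up function name and only the standard CHAR_MIN macro; it could help confirm that a qualifier such as /UNSIGNED_CHAR actually took effect for a module.

```c
#include <limits.h>

/* Returns 1 when plain char is signed in this build, 0 when it is
   unsigned. CHAR_MIN is 0 for unsigned char and negative otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

On a build where this returns 0, a byte value of 201 stays 201 when stored in a char, so the unmasked table lookup never sees a negative index.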