Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
Notice: | Welcome to the Digital UNIX Conference |
Moderator: | SMURF::DENHAM |
|
Created: | Thu Mar 16 1995 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 10068 |
Total number of notes: | 35879 |
9701.0. "MIPS-based C versus DEC C ?" by MEDINA::BEELEN (Kidolo n'dani ya p� yango) Fri May 02 1997 11:31
A customer is asking us to test his program (see below) on our Alpha DUNIX
machines. Can someone explain why I got much better results (more than a
factor of two) with the old MIPS-based compiler (-oldc on DUNIX 4.0) than
with DEC C, which is now the default?
I also compiled with the -check switch; it seems that this program was
not written very strictly. Could this be the reason?
Please advise.
William
P.S. The value 1000000 must be passed (as argument) to the program.
----------------------------------ctmul.c source file -------------------------
/* Program ctmul tests multiplication of a vector by a constant /
Program ctmul for testing variants of fast multiplication by a constant
*/
#include <stdio.h>
/*#include "vreme.h" */
#include <time.h>
/*#define BRS /*(P1_SIZE*P1_SIZE) /*0x1000 0x1000000*/
#define P1_SIZE 0x10000
#define P2_SIZE 0x10000
union IC {
unsigned long i;
unsigned char c[3];
unsigned short s[2];
} k13;
unsigned long mulc();
unsigned long tmdt[P1_SIZE], tmgt[P1_SIZE];
unsigned long tmd[P1_SIZE], tmg[P1_SIZE];
main(argc,argv)
int argc;
char *argv[];
{
unsigned long k11[0x100];
unsigned long tm0[0x100], tm1[0x100], tm2[0x100], tm3[0x100];
/* caltime t; */
unsigned long i, k10, kk, ip = 0;
unsigned long ii, brs;
unsigned char a = 0xaa, b = 0x55;
unsigned int al = 0xaaaaaaaa, bl = 0x55555555;
long ltimes, ltimef;
unsigned long nn_rep;
sscanf(*++argv,"%08lX",&nn_rep);
printf("Added by Djordje: nn_rep=%d [dec]\n",nn_rep);
time(&ltimes);
printf("START OF PROGRAM %08lX [hex]\n -> times: %lx\n", nn_rep, ltimes);
for (ii = 0; ii < P1_SIZE; ii++)
tmd[ii] = mulc(ii);
for (i = 0; i < P1_SIZE; i++) {
ii = i << 8;
ii <<= 8;
tmg[i] = mulc(ii) - 1;
}
printf("VECTORS LODADED./PUNJENJE VEKTORA GOTOVO.\n");
for (ii = 0; ii < 0x100; ii++) {
tm3[ii] = mulc(ii);
/* printf("%02x %08x ", ii, tm3[ ii]); */
i = ii << 8;
tm2[ii] = mulc(i) - 1;
/* printf("%02x %08x %08x ", ii, i, tm2[ ii]); */
i <<= 8;
tm1[ii] = mulc(i) - 1;
/* printf("%02x %08x %08x ", ii, i, tm1[ ii]); */
i <<= 8;
tm0[ii] = mulc(i) - 1;
/* printf("%02x %08x %08x \n", ii, i, tm0[ ii]); */
}
/* for( ii=0; ii< 0x100; ii++) printf( "%08x ", tmd[ ii]); printf( "
\n");
for( ii= 0; ii< 0x100; ii++) printf( "%08x ", tmg[ ii]); printf( " \n"); */
/* start_time(); */
printf("VECTORS LOADED - B./PUNJENJE VEKTORA GOTOVO - B.\n");
for (ii = 0; ii < nn_rep; ii++) {
k13.i = ii;
/* k13.s[0]=ii / P1_SIZE; k13.s[1]=(unsigned int) (ii % P1_SIZE);
if (k13.s[1]==0) printf("K13.S[0]= %04X\n",k13.s[0]);*/
k11[0] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[1] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[2] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[3] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[4] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[5] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[6] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[7] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[8] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[9] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xa] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xb] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xc] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xd] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xe] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[1] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[2] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[3] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[4] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[5] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[6] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[7] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[8] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[9] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xa] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xb] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xc] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xd] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0xe] = tmd[k13.s[1]] + tmg[k13.s[0]];
k11[0] = tmd[k13.s[1]] + tmg[k13.s[0]];
}
/* end_time();
t = time_interval(); */
printf("0a %x %x \n", ii, k11[0]);
/* start_time(); */
printf("VECTORS LOADED-C./PUNJENJE VEKTORA GOTOVO- C.\n");
for (ii = 0; ii < nn_rep; ii++) {
k13.i = ii;
k11[0] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[1] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[2] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[3] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[4] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[5] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[6] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[7] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[8] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[9] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xa] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xb] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xc] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xd] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xe] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[1] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[2] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[3] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[4] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[5] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[6] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[7] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[8] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[9] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xa] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xb] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xc] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xd] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0xe] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
k11[0] = tm0[k13.c[0]] + tm1[k13.c[1]] +
tm2[k13.c[2]] + tm3[k13.c[3]];
/* printf("%08x %02x %08x ", k13.i, k13.c[ 0], tm0[ k13.c[ 0]]);
printf("%02x %08x ", k13.c[ 1], tm1[ k13.c[ 1]]); printf("%02x %08x ",
k13.c[ 2], tm2[ k13.c[ 2]]); printf("%02x %08x ", k13.c[ 3], tm3[
k13.c[ 3]]); printf("%08x %08x \n", k11[ 0], mulc( ii)); */
}
/* end_time();
t = time_interval(); */
printf("0b %x %x \n", ii, k11[0]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k13.i = ii;
k11[0] = tm0[k13.c[0]] ;
k11[1] = tm0[k13.c[0]];
k11[2] = tm0[k13.c[0]] ;
k11[3] = tm0[k13.c[0]] ;
k11[4] = tm0[k13.c[0]] ;
k11[5] = tm0[k13.c[0]] ;
k11[6] = tm0[k13.c[0]] ;
k11[7] = tm0[k13.c[0]] ;
k11[8] = tm0[k13.c[0]] ;
k11[9] = tm0[k13.c[0]] ;
k11[0xa] = tm0[k13.c[0]] ;
k11[0xb] = tm0[k13.c[0]] ;
k11[0xc] = tm0[k13.c[0]];
k11[0xd] = tm0[k13.c[0]] ;
k11[0xe] = tm0[k13.c[0]] ;
k11[0] = tm0[k13.c[0]] ;
k11[0] = tm0[k13.c[0]] ;
k11[1] = tm0[k13.c[0]] ;
k11[2] = tm0[k13.c[0]] ;
k11[3] = tm0[k13.c[0]] ;
k11[4] = tm0[k13.c[0]] ;
k11[5] = tm0[k13.c[0]] ;
k11[6] = tm0[k13.c[0]] ;
k11[7] = tm0[k13.c[0]] ;
k11[8] = tm0[k13.c[0]] ;
k11[9] = tm0[k13.c[0]] ;
k11[0xa] = tm0[k13.c[0]] ;
k11[0xb] = tm0[k13.c[0]] ;
k11[0xc] = tm0[k13.c[0]] ;
k11[0xd] = tm0[k13.c[0]] ;
k11[0xe] = tm0[k13.c[0]] ;
k11[0] = tm0[k13.c[0]] ;
/* printf("%08x %02x %08x ", k13.i, k13.c[ 0], tm0[ k13.c[ 0]]);
printf("%02x %08x ", k13.c[ 1], tm1[ k13.c[ 1]]); printf("%02x %08x ",
k13.c[ 2], tm2[ k13.c[ 2]]); printf("%02x %08x ", k13.c[ 3], tm3[
k13.c[ 3]]); printf("%08x %08x \n", k11[ 0], mulc( ii)); */
}
/* end_time();
t = time_interval(); */
printf("0c %x %x \n", ii, k11[0]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[1& 0xff] = ii * 0x8088405 + 1 ;
k11[1] = ii * 0x8088405 + 1 ;
}
/* end_time();
t = time_interval(); */
printf("1 %x %x \n", ii, k11[1]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
i = ii << 15;
k11[0] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[1] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[3] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[4] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[5] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[6] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[7] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[8] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[9] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xa] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xb] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xc] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xd] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xe] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xf] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[1] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[3] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[4] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[5] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[6] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[7] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[8] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[9] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xa] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xb] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xc] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xd] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[0xe] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
k11[2] = i + 1 + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
}
/* end_time();
t = time_interval(); */
printf("2 %x %x \n", ii, k11[2]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
i = ii << 15;
k11[3] = (i ^ 1) + (i << 4) + (i << 12) + ii + (ii << 2) + (ii << 10);
}
/* end_time();
t = time_interval(); */
printf("3 %x %x \n", ii, k11[3]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k10 = ii + 1;
k10 += (ii << 2);
k10 += (ii << 10);
i = (ii << 15);
k10 += i;
k10 += (i << 4);
k11[4] = ((i << 12) + k10);
}
/* end_time();
t = time_interval(); */
printf("4 %x %x \n", ii, k11[4]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k10 = ii + 1;
i = ii << 2;
k10 += i;
i = i << 8;
k10 += i;
i = i << 5;
k10 += i;
i = i << 4;
k10 += i;
i = i << 8;
k11[5] = (i + k10);
}
/* end_time();
t = time_interval(); */
printf("5 %x %x \n", ii, k11[5]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[6] = mulc(ii);
}
/* end_time();
t = time_interval(); */
printf("6 %x %x \n", ii, k11[6]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[7^ii& 0xff] = ii >> 24;
k11[7] = ii >> 14;
}
/* end_time();
t = time_interval(); */
printf("7 %lx %lx \n", ii, k11[7]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[0^ ii& 0xff] = ii >> 11;
k11[0] = k11[7] >> 11;
k11[1^ ii& 0xff] = ii >> 11;
k11[1] = k11[7] >> 11;
k11[2^ ii& 0xff] = ii >> 11;
k11[2] = k11[7] >> 11;
k11[3^ ii& 0xff] = ii >> 11;
k11[3] = k11[7] >> 11;
k11[4^ ii& 0xff] = ii >> 11;
k11[4] = k11[7] >> 11;
k11[5^ ii& 0xff] = ii >> 11;
k11[5] = k11[7] >> 11;
k11[6^ ii& 0xff] = ii >> 11;
k11[6] = k11[7] >> 11;
k11[7] = ii >> 11;
k11[7] = k11[7] >> 11;
}
/* end_time();
t = time_interval(); */
printf("8 %x %x \n", ii, k11[7]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[7 & ii & 0xff] = ii ^ 1;
k11[7] = ii ^ 1;
}
/* end_time();
t = time_interval(); */
printf("9 %x %x \n", ii, k11[7]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[0] = a ^ b ^ ii;
k11[0] = b ^ ii ^ a;
k11[1] = a ^ b^ ii;
k11[1] = b ^ a^ ii;
k11[2] = a ^ b^ ii;
k11[2] = b ^ a^ ii;
k11[3] = a ^ b^ ii;
k11[3] = b ^ a^ ii;
k11[4] = a ^ b^ ii;
k11[4] = b ^ a^ ii;
k11[5] = a ^ b^ ii;
k11[5] = b ^ a^ ii;
k11[6] = a ^ b^ ii;
k11[6] = b ^ a^ ii;
k11[7] = a ^ b^ ii;
k11[7] = b ^ a^ ii;
}
/* end_time();
t = time_interval(); */
printf("a %x %x \n", ii, k11[7]);
/* start_time(); */
for (ii = 0; ii < nn_rep; ii++) {
k11[0] = al ^ bl;
k11[0] = bl ^ al;
k11[1] = al ^ bl;
k11[1] = bl ^ al;
k11[2] = al ^ bl;
k11[2] = bl ^ al;
k11[3] = al ^ bl;
k11[3] = bl ^ al;
k11[4] = al ^ bl;
k11[4] = bl ^ al;
k11[5] = al ^ bl;
k11[5] = bl ^ al;
k11[6] = al ^ bl;
k11[6] = bl ^ al;
k11[7] = al ^ bl;
k11[7] = bl ^ al;
}
/* end_time();
t = time_interval(); */
printf("b %x %x \n", ii, k11[7]);
time(&ltimef);
printf("timef: %lx \n", ltimef);
printf("time: %ld \n", ltimef- ltimes);
/*printf("time: %lx \n", ); */
}
unsigned long mulc(unsigned long inp)
{
unsigned long i, k10;
i = inp << 2;
k10 = i + 1;
k10 += inp;
i = i << 8;
k10 += i;
i = i << 5;
k10 += i;
i = i << 4;
k10 += i;
i = i << 8;
return (i + k10);
}
T.R | Title | User | Personal Name | Date | Lines |
---|
9701.1 | | DECCXL::MARIO | | Fri May 02 1997 12:44 | 18 |
| Yes, there is a big difference. With the DECC compilers that we've shipped
in V4.0 - V4.0c, the DECC-compiled program runs 26% slower than the ACC (MIPS) one.
I tested it against the DECC compiler that we'll be shipping for V4.0D (PTmin),
and there DECC and ACC produce equivalent runtimes.
I also tried it against the compiler that we'll be dropping into V4.2 (steel),
and in that case the DECC-compiled program is 36% faster than the ACC one.
In V4.0, when we replaced the ACC compiler we tried to close any performance
gaps where ACC produced better code. This one must have slipped through.
It's interesting that where ACC is faster, the .text size produced
by the ACC compiler is much larger than by DECC (4976 vs 3728 bytes).
We'll look closer at this offline to see exactly where the difference is
coming from.
Joe
|
9701.2 | One analysis | DECCXL::MARIO | | Fri May 02 1997 15:14 | 50 |
| From: [email protected]
To: [email protected]
cc: mdavis
Subject: Re: Mark, do you have any ideas on this?
Joe,
This is yet another example of a poorly designed benchmark. There
are 11 different loops, and for almost every one, if only the LAST
iteration is executed, you will get the same program results. Furthermore,
for many of the loops, the loop body is duplicated, making 2 copies: if you
compile only the SECOND half of the loop body, you'll get the same result.
So, this "benchmark" is really a test of how much useless benchmark
code a compiler can throw away.
NO REAL PROGRAM would look like this.
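To make the dead-store point concrete, here is a minimal sketch built around loop "6". It is not part of the customer's program: the names mulc64, loop6_all_iterations and loop6_last_iteration_only are hypothetical, and uint64_t stands in for unsigned long so the arithmetic is the same on any host. Since each iteration only overwrites the previous result, a compiler is free to keep just the final store. The sketch also shows that the shift-and-add sequence in mulc() works out to inp * 0x8088405 + 1, the constant that loop "1" multiplies by.

```c
#include <stdint.h>

/* Same shift-and-add sequence as the posted mulc(), with a fixed-width
   type (an assumption of this sketch) so results do not depend on
   sizeof(long). Equivalent to inp * 0x8088405 + 1 modulo 2^64. */
uint64_t mulc64(uint64_t inp)
{
    uint64_t i = inp << 2;          /* inp * 4            */
    uint64_t k10 = i + 1 + inp;     /* inp * 5 + 1        */
    i <<= 8;  k10 += i;             /* + (inp << 10)      */
    i <<= 5;  k10 += i;             /* + (inp << 15)      */
    i <<= 4;  k10 += i;             /* + (inp << 19)      */
    i <<= 8;                        /* inp << 27          */
    return i + k10;
}

/* What loop "6" literally asks for: n calls, each overwriting the last. */
uint64_t loop6_all_iterations(uint64_t n)
{
    uint64_t k = 0;
    for (uint64_t ii = 0; ii < n; ii++)
        k = mulc64(ii);
    return k;
}

/* What an optimizer may reduce it to: only the final store is observable. */
uint64_t loop6_last_iteration_only(uint64_t n)
{
    return n ? mulc64(n - 1) : 0;
}
```

Both loop6 functions return the same value for any n, which is why timing the first form mostly measures how aggressively the compiler discards work.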
If k11 were declared "volatile", then all compilers would have to
perform most of the operations - but it wouldn't be very interesting - it's
like compiling at -O0.
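As a rough illustration of the "volatile" variant, the sketch below reduces the last benchmark loop to a single array element. The function name xor_loop_volatile and its parameters are hypothetical, introduced only for this example; the point is that every store to a volatile object counts as an observable access, so the compiler cannot drop the repeated assignments.

```c
/* Hypothetical reduction of the final "al ^ bl" loop of the benchmark. */
unsigned long xor_loop_volatile(unsigned long n,
                                unsigned long al, unsigned long bl)
{
    volatile unsigned long k11[0x100];  /* volatile: stores may not be elided */
    unsigned long ii;

    k11[7] = 0;                         /* defined result even when n == 0 */
    for (ii = 0; ii < n; ii++) {
        k11[7] = al ^ bl;               /* each of these stores must happen */
        k11[7] = bl ^ al;
    }
    return k11[7];
}
```

With the benchmark's values, xor_loop_volatile(n, 0xaaaaaaaa, 0x55555555) yields 0xffffffff for any n > 0, but the loop now costs 2 * n stores regardless of optimization level.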
Another way of making it more realistic (or of forcing all the
iterations) is to replace "k11[0] = ..." by "k11[0] += ..." so all the
iterations contribute to the final array result; then you need to make use
of all the array elements as (potential) outputs, for example by adding
them all together and printing that result.
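A minimal sketch of that accumulate-and-sum idea follows. The name accumulate_checksum, the array length K11_LEN and the expression (ii ^ j) are placeholders chosen for this example, not taken from the benchmark. A sufficiently clever optimizer might still simplify a loop this regular, so treat it as an illustration of the structure rather than a guaranteed fix.

```c
#define K11_LEN 8   /* hypothetical array length for this sketch */

unsigned long accumulate_checksum(unsigned long n)
{
    unsigned long k11[K11_LEN] = {0};
    unsigned long ii, j, sum = 0;

    for (ii = 0; ii < n; ii++)
        for (j = 0; j < K11_LEN; j++)
            k11[j] += (ii ^ j);     /* "+=": every iteration contributes */

    for (j = 0; j < K11_LEN; j++)
        sum += k11[j];              /* every element feeds the output */

    return sum;                     /* caller prints this value */
}
```

Printing the returned sum (for instance with printf("%lu\n", accumulate_checksum(nn_rep))) makes the whole array a live output of the program.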
Plus it's not legal ANSI, because some of the loops do:
union IC {
unsigned long i;
unsigned char c[3];
unsigned short s[2];
} k13;
...
k13.i = ii;
k11[0] = tmd[k13.s[1]] + tmg[k13.s[0]];
It's undefined if you store into memory using type1, and load out
of that memory using type2 (unless you use char).
The benchmark has the legal computation commented out:
/* k13.s[0]=ii / P1_SIZE; k13.s[1]=(unsigned int) (ii % P1_SIZE);
if (k13.s[1]==0) printf("K13.S[0]= %04X\n",k13.s[0]);*/
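The same indices can be computed arithmetically, with no union at all. The sketch below is one way to do it; the helper names low_half, high_half and byte_of are hypothetical. Because it works on values rather than on the in-memory layout, the result does not depend on byte order or on the size of unsigned long.

```c
#define P1_SIZE 0x10000UL

/* Low 16 bits of ii (what the commented-out code stores in k13.s[1]). */
unsigned long low_half(unsigned long ii)
{
    return ii % P1_SIZE;
}

/* Next 16 bits of ii (k13.s[0] in the commented-out code); the extra
   modulo keeps the index below P1_SIZE when long is wider than 32 bits. */
unsigned long high_half(unsigned long ii)
{
    return (ii / P1_SIZE) % P1_SIZE;
}

/* Byte n of ii, counting from the least significant byte (n = 0). */
unsigned long byte_of(unsigned long ii, unsigned n)
{
    return (ii >> (8 * n)) & 0xffUL;
}
```

A loop body would then read, for example, k11[0] = tmd[low_half(ii)] + tmg[high_half(ii)];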
You may post this....
Mark
|