| From: DEC:.REO.REOVTX::HUDSON "[email protected] - UK Software
Partner Engineering 830-4121" 12-MAR-1997 10:54:47.95
To: nm%vbormc::"[email protected]"
CC: HUDSON
Subj: RE:ESCALATION: POINT No22329, ROPRAND
Hello Fruehauf,
Thank you for your ASAP call on ROPRAND within DECforms.
I haven't yet been able to find an answer to this problem, so I am still
looking, but there are a couple of things that I would like to suggest in case
you recognise them as being significant.
Dirty Zeros
===========
First, you don't say so, but if this error is happening with data that came
from a VAX, then it may be due to "dirty zeros". On VMS, a VAX processor
treats a floating point bit pattern that is a "dirty zero" as zero, but an
Alpha processor raises an exception if you try to use the value. This is
usually an "HPARITH" exception, but the LIB-F-ROPRAND could be a follow-on
effect of that.
To explain what a "dirty zero" is: when you have a floating point value stored
on VMS, it will typically be F_FLOAT (single precision) or G_FLOAT or
D_FLOAT (double precision). For these data types, different bit fields in the
data represent different parts of the floating point number. For
example:
An F_FLOAT looks like this:

 31            16|15|14    7|6     0
+----------------+-+--------+-------+
|   fraction2    |S|exponent|fract1 |
+----------------+-+--------+-------+
To represent "0.0", all 32 bits of the value should be zero. But on a VAX, only
the sign (S above) and exponent have to be zero for the number to be treated as
"0.0". So you could have a non-zero value in "fraction2" or "fract1" and still
have the number be treated as "0.0".
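
The rule above can be written as a bit test. The following is a minimal
sketch in portable C rather than anything VMS-specific. It assumes the
F_FLOAT has already been read as a 32-bit longword with the sign in bit 15
and the exponent in bits 14..7, and the function and macro names are
hypothetical ones chosen for illustration.

```c
#include <stdint.h>

/* Field masks for an F_FLOAT held in a 32-bit longword (assumed layout). */
#define F_SIGN_MASK 0x00008000u  /* bit 15 */
#define F_EXP_MASK  0x00007F80u  /* bits 14..7 */
#define F_FRAC_MASK 0xFFFF007Fu  /* bits 31..16 and bits 6..0 */

/* Returns 1 if sign and exponent are zero but some fraction bit is set. */
int f_float_is_dirty_zero(uint32_t bits)
{
    return (bits & (F_SIGN_MASK | F_EXP_MASK)) == 0
        && (bits & F_FRAC_MASK) != 0;
}

/* Returns 1 only for the clean zero, where all 32 bits are zero. */
int f_float_is_clean_zero(uint32_t bits)
{
    return bits == 0;
}
```

For instance, the longword 0x00010000 has a zero sign and exponent but one
bit set in "fraction2", so the test reports it as a dirty zero, whereas
0x00004080 (1.0 in F_FLOAT) is an ordinary value.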
When a mathematical operation takes place on a VAX that results in "0.0", it
will always be a clean zero (all bits = 0), but it may sometimes be possible
for data (in files for example) to have dirty zeros if it was generated by, for
example, another computer, or a program which deliberately put dirty zeros into
data.
Because a VAX doesn't complain about dirty zeros, I have seen cases in the past
where data files that were used on a VAX with no problems caused errors when
moved to Alpha. In other words, the data was always bad, but it never caused
a problem on the VAX.
The way round this is to write a program on the VAX which reads in all the data
and writes it back out again. Because a VAX never generates dirty zeros, this
process will "clean" any zeros in data files.
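
As an alternative illustration of the same clean-up, here is a sketch that
scrubs dirty zeros explicitly at the bit level instead of relying on VAX
arithmetic. It is an assumption-laden example: it treats the data as a plain
array of F_FLOAT longwords (real data files will have records with other
fields mixed in), and the name scrub_f_float_zeros is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign (bit 15) and exponent (bits 14..7) of an F_FLOAT longword. */
#define F_SIGN_EXP_MASK 0x0000FF80u

/*
 * Replaces every dirty zero in values[0..count-1] with a clean zero
 * (all bits clear) and returns how many entries were changed.
 */
size_t scrub_f_float_zeros(uint32_t *values, size_t count)
{
    size_t cleaned = 0;
    size_t n;

    for (n = 0; n < count; n++) {
        /* Dirty zero: sign and exponent zero, but not all bits zero. */
        if ((values[n] & F_SIGN_EXP_MASK) == 0 && values[n] != 0) {
            values[n] = 0;
            cleaned++;
        }
    }
    return cleaned;
}
```

A non-zero return value would confirm that the data really did contain
dirty zeros, which is useful to know before blaming them for the error.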
The dirty zero effect could be relevant to you if you are using data that has
come from an old (VAX) system. If this is the case, then it is worth further
investigation.
Alignment
=========
The Alpha processor is sensitive to the alignment of data. For example, it
likes a 32-bit data value to be located at an address which is 32-bit aligned.
If you have data which isn't aligned, then this usually just has the effect of
slowing down the application, but in some cases it can cause "ROPRAND" errors.
For example the following C program would give a ROPRAND:
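#include <stdio.h>
#include <stdlib.h>
#include <builtins.h>   /* DEC C built-ins such as __ATOMIC_INCREMENT_LONG */

int main(void)
{
    int *i;

    i = malloc(sizeof(int) * 2);   /* i is 32-bit aligned */
    if (i == NULL)
        return EXIT_FAILURE;
    *i = 0x12345678;
    printf("*i is %x\n", *i);
    i = (int *)((char *)i + 1);    /* now i is not aligned */
    __ATOMIC_INCREMENT_LONG(i);    /* expects an aligned pointer */
    printf("*i is %x\n", *i);
    return EXIT_SUCCESS;
}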
The problem in this program is that the "__ATOMIC_INCREMENT_LONG" routine
expects that "i" is a 32-bit aligned pointer.
Routines such as "__ATOMIC_INCREMENT_LONG" may be used by code that is dealing
with data that is declared "volatile", such as data which is shared between
multiple code threads.
So perhaps you could get a ROPRAND error in cases where you have data which
isn't aligned but is volatile.
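
If you want to rule alignment in or out, a check along the following lines
could be placed in front of a suspect access. This is a generic sketch in
portable C, not part of DEC C or of the program above. The name is_aligned
is hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if the address is a multiple of `alignment`.
 * `alignment` is assumed to be a power of two (4 for a 32-bit value).
 */
int is_aligned(const void *address, size_t alignment)
{
    return ((uintptr_t)address & (alignment - 1)) == 0;
}
```

With this, is_aligned(i, 4) would be true for the pointer returned by
malloc and false once the pointer has been advanced by one byte.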
I don't know whether either of the above apply to your situation, but hopefully
you might be able to at least say "no, that definitely isn't relevant".
In the meantime, I am still looking into this.
Here are some other questions for you:
- Is there any chance of your supplying me with a backup saveset that would
reproduce the problem?
- Can you reproduce the problem outside the ACMS environment?
- Is this code that has worked before and just now stopped working? If
so, what is different?
Regards
Nick Hudson
Digital Software Partner Engineering.
|
| From: VBORMC::"[email protected]" "MAIL-11 Daemon" 13-MAR-1997 10:22:48.74
To: "[email protected] - UK Software Partner Engineering 830-4121
12-Mar-1997 1054 +0000" <[email protected]>
CC:
Subj: Re: ESCALATION: POINT No22329, ROPRAND
Hello Mr. Hudson
Thanks for investigating my question.
First to answer your questions:
About the situation you described: the data does not come from a VAX.
Other questions:
- It would take too much effort to isolate the parts of our system needed
to reproduce the problem at your site. It may be possible to reproduce
the problem entirely independently of the existing code.
- See above.
- We are migrating from VAX to Alpha and are therefore using DECforms
instead of TDMS, so all the DECforms code and some of the ACMS task code
is new. All application code (COBOL) is unchanged from the VAX.
I investigated the problem a bit more too. In the meantime I managed
to build the application so that it runs without the ACMS debugger. The
crash happened there too.
But in the Task debugger I did the following (and was surprised):
- set break <last line of COBOL module before entering the form>
- examine STC-ABTEILUNG-9 of USER-WORKSPACE -> 0
- examine/binary STC-ABTEILUNG-9 of USER-WORKSPACE -> 00100000
- deposit STC-ABTEILUNG-9 of USER-WORKSPACE = 0
- examine/binary STC-ABTEILUNG-9 of USER-WORKSPACE -> 00110000
- go -> is working fine
So what's the problem with the high nibble?
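
One way to read the two bit patterns, offered as an assumption because the
declaration of STC-ABTEILUNG-9 is not shown here: 00100000 is 0x20, the
ASCII space character, and 00110000 is 0x30, the ASCII digit "0". If the
field is a one-character numeric item stored as text, the high nibble is
what separates a valid digit from a blank. The sketch below, with the
hypothetical name is_ascii_digit_byte, expresses that test in C.

```c
/* Returns 1 if the byte is an ASCII digit: high nibble 0x3, low nibble 0..9. */
int is_ascii_digit_byte(unsigned char byte)
{
    return (byte >> 4) == 0x3 && (byte & 0x0F) <= 9;
}
```

Under that reading, the value before the deposit (0x20) fails the test and
the value after it (0x30) passes.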
Regards
Juerg Fruehauf
--
--------------------------------------------------------------------
/\
Juerg Fruehauf /--\KROS AG Tel: +41 32 329 90 43
SW-Developper unterer Quai 37 Fax: +41 32 329 90 35
CH-2502 Biel email: [email protected]
--------------------------------------------------------------------
% ====== Internet headers and postmarks (see DECWRL::GATEWAY.DOC) ======
% Received: from mail.vbo.dec.com (mail.vbo.dec.com [16.36.208.34]) by
vbormc.vbo.dec.com (8.7.3/8.7) with ESMTP id LAA20000 for
<[email protected]>; Thu, 13 Mar 1997 11:19:46 +0100
% Received: from server21.digital.fr (server21.digital.fr [193.56.15.21]) by
mail.vbo.dec.com (8.7.3/8.7) with ESMTP id LAA08575 for
<[email protected]>; Thu, 13 Mar 1997 11:24:28 +0100 (MET)
% Received: from mail.eunet.ch (mail.eunet.ch [146.228.10.7]) by
server21.digital.fr (8.7.5/8.7) with ESMTP id LAA00035 for
<[email protected]>; Thu, 13 Mar 1997 11:27:13 +0100 (MET)
% Received: from dyna-bi-11.dial.eunet.ch by mail.eunet.ch (8.8.3/1.34) id
KAA20815; Thu, 13 Mar 1997 10:19:33 GM
% Message-ID: <[email protected]>
% Date: Fri, 14 Mar 1997 11:18:48 -0500
% From: AKROS AG <[email protected]>
% Reply-To: [email protected]
% Organization: AKROS AG
% X-Mailer: Mozilla 3.0Gold (Win16; I)
% MIME-Version: 1.0
% To: "[email protected] - UK Software Partner Engineering 830-4121
12-Mar-1997 1054 +0000" <[email protected]>
% Subject: Re: ESCALATION: POINT No22329, ROPRAND
% References: <[email protected]>
% Content-Type: text/plain; charset=us-ascii
% Content-Transfer-Encoding: 7bit
|