T.R | Title | User | Personal Name | Date | Lines |
---|
1300.1 | | RTL::HANEK | | Thu May 15 1997 10:08 | 41 |
| This appears to be the result of IBM using its extended precision muladd
instuction to compute the values dx(i) and dy(i) in the drot routine
If the input values you specified:
c=0.894464769321342223D+00
s=0.447138430961730338D+00
dx(1)=0.799591744432892604D+00
dy(1)=0.399711883886655517D+00
are correctly converted to double precision, then in hex, ieee format
c = 3fec9f7494751963 (1)
s = 3fdc9dea8270fc14
dx(1) = 3fe996416d0fb528
dy(1) = 3fd994e127476833
If the products are computed in double precision, then
c*dy(1) = 3fd6e1bdf3780256
s*dx(1) = 3fd6e1bdf3780256
so that dy(1) = c*dy(1) - s*dx(1) = 0, which is exactly what alpha is getting.
My guess is that the code sequence for the IBM is something like:
t = s*dx(1) ! where t is double precision
dy(1) = c*dy(1) - t ! where this operation is done with the mulsub.
Some implementations of the IBM muladd (or mulsub) instruction compute the
product to twice the base precision (i.e. 106 bits in this case), perform the
addition (or subtraction) in twice the base precision, and then round back down
to base precision (53 bits in this) case. If we take that approach with the
hex valuesin (1), then we get:
dy(1) = bc7ca03775cd728e (hex) = -.248289637591051155E-16 (base 10)
which is what the IBM machine is getting.
|
1300.2 | | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Thu May 15 1997 11:10 | 2 |
| In other words, IBM rounds one product before subtracting, but doesn't
round the other product.
|
1300.3 | Thanks | NNTPD::"[email protected]" | ofer | Thu May 15 1997 12:07 | 9 |
| Thanks for the clear explanation.
Is there any way to imitate IBM behavior (results) with our compiler?
"f77 -r16 prog.f" alone , of-course wouldn't work.
Ofer.
[Posted by WWW Notes gateway]
|
1300.4 | Keep the interface, but use REAL*16 internally | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Thu May 15 1997 12:55 | 70 |
| You could make the drot routine do its calculations in real*16. I defined
a "qprod" function, returning the real*16 product of two real*8 values,
to make this more readable:
implicit double precision (a-h,o-z)
dimension dx(1),dy(1)
n=1
incx=1
incy=1
c c=0.894464091309414289D+00
c s=0.447139787267945732D+00
c dx(1)=0.799596453591710277D+00
c dy(1)=0.399715753413653230D+00
c=0.894464769321342223D+00
s=0.447138430961730338D+00
dx(1)=0.799591744432892604D+00
dy(1)=0.399711883886655517D+00
write(6,'(4(2x,D25.18))') c,s,dx(1),dy(1)
call drot(n,dx,incx,dy,incy,c,s)
write(6,'(2(2x,D25.18))') dx(1),dy(1)
stop
end
subroutine drot(n,dx,incx,dy,incy,c,s)
c
implicit double precision (a-h,o-z), integer (i-n)
dimension dx(*),dy(*)
c
c applies a plane rotation.
c dx(i) = c*dx(i) + s*dy(i)
c dy(i) = -s*dx(i) + c*dy(i)
c jack dongarra, linpack, 3/11/78.
c
real*16 qprod
qprod(qa,qb) = qext(qa)*qext(qb)
if(n.le.0)return
if(incx.eq.1.and.incy.eq.1)go to 20
c
c code for unequal increments or equal increments not equal to 1
c
ix = 1
iy = 1
if(incx.lt.0)ix = (-n+1)*incx + 1
if(incy.lt.0)iy = (-n+1)*incy + 1
do 10 i = 1,n
dtemp = qprod(c,dx(ix)) + qprod(s,dy(iy))
dy(iy) = qprod(c,dy(iy)) - qprod(s,dx(ix))
dx(ix) = dtemp
ix = ix + incx
iy = iy + incy
10 continue
return
c
c code for both increments equal to 1
c
20 do 30 i = 1,n
dtemp = qprod(c,dx(i)) + qprod(s,dy(i))
dy(i) = qprod(c,dy(i)) - qprod(s,dx(i))
dx(i) = dtemp
30 continue
return
end
This version doesn't get exactly IBM's result, since it computes both
products exactly, and doesn't round until after the add or subtract.
If you want to get IBM's result, change one qprod in each line to a
straight multiply (see .1 to pick which one).
|
1300.5 | | NNTPD::"[email protected]" | ofer | Thu May 15 1997 13:39 | 53 |
| The only form I get it right which is rather weird is:
real*16=real*16 +real*8
real*16=real*16 -real*8
Don't understand it but it works:
subroutine drot(n,dx,incx,dy,incy,c,s)
implicit double precision (a-h,o-z), integer (i-n)
real*8 dtemp
dimension dx(*),dy(*)
real*16 dx1(n),dy1(n),c1,s1,dtemp1,ttx2,tty2
if(n.le.0)return
c1=c
s1=s
do i=1,n
dy1(i)=dy(i)
dx1(i)=dx(i)
enddo
if(incx.eq.1.and.incy.eq.1)go to 20
c
c code for unequal increments or equal increments not equal to 1
c
ix = 1
iy = 1
if(incx.lt.0)ix = (-n+1)*incx + 1
if(incy.lt.0)iy = (-n+1)*incy + 1
c
c code for both increments equal to 1
c
c Note ttx2 and tty2 are real*16, ttx1 and tty1 are real*8
20 do 30 i = 1,n
ttx1=s1*dy1(i)
ttx2=c1*dx1(i)
dtemp1 = ttx2 + ttx1
tty1=s1*dx1(i)
tty2=c1*dy1(i)
dy1(i) = tty2 -tty1
dx1(i) = dtemp1
30 continue
do i=1,n
dy(i)=dy1(i)
dx(i)=dx1(i)
enddo
return
end
[Posted by WWW Notes gateway]
|
1300.6 | Simpler version | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Thu May 15 1997 14:53 | 66 |
| You don't need to copy dx and dy to REAL*16 arrays to match IBM's result.
The program in .4 gives more accurate results, but if you want to match
IBM's result nore exactly, try this version. It rounds one product, but
not the other one.
implicit double precision (a-h,o-z)
dimension dx(1),dy(1)
n=1
incx=1
incy=1
c c=0.894464091309414289D+00
c s=0.447139787267945732D+00
c dx(1)=0.799596453591710277D+00
c dy(1)=0.399715753413653230D+00
c=0.894464769321342223D+00
s=0.447138430961730338D+00
dx(1)=0.799591744432892604D+00
dy(1)=0.399711883886655517D+00
write(6,'(4(2x,D25.18))') c,s,dx(1),dy(1)
call drot(n,dx,incx,dy,incy,c,s)
write(6,'(2(2x,D25.18))') dx(1),dy(1)
stop
end
subroutine drot(n,dx,incx,dy,incy,c,s)
c
implicit double precision (a-h,o-z), integer (i-n)
dimension dx(*),dy(*)
c
c applies a plane rotation.
c dx(i) = c*dx(i) + s*dy(i)
c dy(i) = -s*dx(i) + c*dy(i)
c jack dongarra, linpack, 3/11/78.
c
real*16 qprod
qprod(qa,qb) = qext(qa)*qext(qb)
if(n.le.0)return
if(incx.eq.1.and.incy.eq.1)go to 20
c
c code for unequal increments or equal increments not equal to 1
c
ix = 1
iy = 1
if(incx.lt.0)ix = (-n+1)*incx + 1
if(incy.lt.0)iy = (-n+1)*incy + 1
do 10 i = 1,n
dtemp = qprod(c,dx(ix)) + s*dy(iy)
dy(iy) = qprod(c,dy(iy)) - s*dx(ix)
dx(ix) = dtemp
ix = ix + incx
iy = iy + incy
10 continue
return
c
c code for both increments equal to 1
c
20 do 30 i = 1,n
dtemp = qprod(c,dx(i)) + s*dy(i)
dy(i) = qprod(c,dy(i)) - s*dx(i)
dx(i) = dtemp
30 continue
return
end
|
1300.7 | | HPCGRP::MANLEY | | Thu May 15 1997 19:32 | 6 |
|
> The only form I get it right which is rather weird is:
Please define "right". Do you think IBM is "right", even though their
IEEE results differ from everyone else's IEEE results?
|
1300.8 | right..you are right | NNTPD::"[email protected]" | ofer | Fri May 16 1997 05:20 | 11 |
| Dwight,
Of-course... the only "right" is that you are right on this issue. See ,
the customer has several IBM machines and one TL8400, since he developed his
progarm
on IBM , he ask for IBM compatible results. I agree it has nothing to do
with rightness.
Anyway, Bill, Bob and Dwight thanks for the help.
Ofer.
[Posted by WWW Notes gateway]
|