[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference turris::fortran

Title:Digital Fortran
Notice:Read notes 1.* for important information
Moderator:QUARK::LIONEL
Created:Thu Jun 01 1995
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1333
Total number of notes:6734

1300.0. "Round toward zero issue" by NNTPD::"[email protected]" (ofer) Thu May 15 1997 08:21

Hi,

Got a question from a customer regarding very small numbers.

Looks like results on small numbers are different on IBM vs DEC and when it 
goes smaller than 1.E-16 our compiler zeroes the results.
Example1:
IBM :
    .893939132225166966D+00    .908678245961530948D-16
DIGITAL :    
   0.893939132225166966E+00   0.111022302462515654E-15

Example2:
IBM :
    .893933189833253428E+00   -.248289637591051155E-16
DIGITAL :
      0.893933189833253428E+00   0.000000000000000000E+00

Env: TL 8400, DU4.0B, Fortran compiler v4.x, 2GB mem

Any idea for the difference between IBM and us?

Thanks,
	Ofer. (see program details below)

  implicit double precision (a-h,o-z)
      dimension dx(1),dy(1)
      n=1
      incx=1
      incy=1
      c=0.894464091309414289D+00
      s=0.447139787267945732D+00
      dx(1)=0.799596453591710277D+00
      dy(1)=0.399715753413653230D+00
      write(6,'(4(2x,D25.18))') c,s,dx(1),dy(1)
      call drot(n,dx,incx,dy,incy,c,s)
      write(6,'(2(2x,D25.18))') dx(1),dy(1)
      stop
      end
      subroutine drot(n,dx,incx,dy,incy,c,s)
c
      implicit double precision (a-h,o-z), integer (i-n)
      dimension dx(*),dy(*)
c
c     applies a plane rotation.
c           dx(i) =  c*dx(i) + s*dy(i)
c           dy(i) = -s*dx(i) + c*dy(i)
c     jack dongarra, linpack, 3/11/78.
c
      if(n.le.0)return
      if(incx.eq.1.and.incy.eq.1)go to 20
c
c     code for unequal increments or equal increments not equal to 1
c
      ix = 1
      iy = 1
      if(incx.lt.0)ix = (-n+1)*incx + 1
      if(incy.lt.0)iy = (-n+1)*incy + 1
      do 10 i = 1,n
        dtemp = c*dx(ix) + s*dy(iy)
        dy(iy) = c*dy(iy) - s*dx(ix)
        dx(ix) = dtemp
        ix = ix + incx
        iy = iy + incy
   10 continue    
      return
c
c     code for both increments equal to 1
c
   20 do 30 i = 1,n
        dtemp = c*dx(i) + s*dy(i)
        dy(i) = c*dy(i) - s*dx(i)
        dx(i) = dtemp
   30 continue
      return
      end


On IBM I received following results


    .893939132225166966D+00    .908678245961530948D-16


On DIGITAL     


   0.893939132225166966E+00   0.111022302462515654E-15


When I changed the input for subroutine drot 

      c=0.894464769321342223D+00
      s=0.447138430961730338D+00
      dx(1)=0.799591744432892604D+00
      dy(1)=0.399711883886655517D+00


I received following on IBM

    .893933189833253428E+00   -.248289637591051155E-16


 on DIGITAL

             0.893933189833253428E+00   0.000000000000000000E+00


[Posted by WWW Notes gateway]
T.RTitleUserPersonal
Name
DateLines
1300.1RTL::HANEKThu May 15 1997 10:0841
This appears to be the result of IBM using its extended precision muladd
instuction to compute the values dx(i) and dy(i) in the drot routine

If the input values you specified:

      c=0.894464769321342223D+00
      s=0.447138430961730338D+00
      dx(1)=0.799591744432892604D+00
      dy(1)=0.399711883886655517D+00

are correctly converted to double precision, then in hex, ieee format

	c     = 3fec9f7494751963			(1)
	s     = 3fdc9dea8270fc14
	dx(1) = 3fe996416d0fb528
	dy(1) = 3fd994e127476833

If the products are computed in double precision, then

	c*dy(1) = 3fd6e1bdf3780256
	s*dx(1) = 3fd6e1bdf3780256

so that dy(1) = c*dy(1) - s*dx(1) = 0, which is exactly what alpha is getting.

My guess is that the code sequence for the IBM is something like:

	t     = s*dx(1)		! where t is double precision
	dy(1) = c*dy(1) - t	! where this operation is done with the mulsub.

Some implementations of the IBM muladd (or mulsub) instruction compute the
product to twice the base precision (i.e. 106 bits in this case), perform the
addition (or subtraction) in twice the base precision, and then round back down
to base precision (53 bits in this) case.  If we take that approach with the
hex valuesin (1), then we get:

	dy(1) = bc7ca03775cd728e (hex) = -.248289637591051155E-16 (base 10)

which is what the IBM machine is getting.
	


1300.2WIBBIN::NOYCEPulling weeds, pickin' stonesThu May 15 1997 11:102
In other words, IBM rounds one product before subtracting, but doesn't
round the other product.
1300.3ThanksNNTPD::"[email protected]"oferThu May 15 1997 12:079
Thanks for the clear explanation.

Is there any way to imitate IBM behavior (results) with our compiler? 
"f77 -r16 prog.f" alone , of-course wouldn't work.

	Ofer.
	
	
[Posted by WWW Notes gateway]
1300.4Keep the interface, but use REAL*16 internallyWIBBIN::NOYCEPulling weeds, pickin' stonesThu May 15 1997 12:5570
You could make the drot routine do its calculations in real*16.  I defined
a "qprod" function, returning the real*16 product of two real*8 values,
to make this more readable:

      implicit double precision (a-h,o-z)
      dimension dx(1),dy(1)
      n=1
      incx=1
      incy=1

c      c=0.894464091309414289D+00
c      s=0.447139787267945732D+00
c      dx(1)=0.799596453591710277D+00
c      dy(1)=0.399715753413653230D+00

      c=0.894464769321342223D+00
      s=0.447138430961730338D+00
      dx(1)=0.799591744432892604D+00
      dy(1)=0.399711883886655517D+00

      write(6,'(4(2x,D25.18))') c,s,dx(1),dy(1)
      call drot(n,dx,incx,dy,incy,c,s)
      write(6,'(2(2x,D25.18))') dx(1),dy(1)
      stop
      end
      subroutine drot(n,dx,incx,dy,incy,c,s)
c
      implicit double precision (a-h,o-z), integer (i-n)
      dimension dx(*),dy(*)
c
c     applies a plane rotation.
c           dx(i) =  c*dx(i) + s*dy(i)
c           dy(i) = -s*dx(i) + c*dy(i)
c     jack dongarra, linpack, 3/11/78.
c
      real*16 qprod
      qprod(qa,qb) = qext(qa)*qext(qb)

      if(n.le.0)return
      if(incx.eq.1.and.incy.eq.1)go to 20
c
c     code for unequal increments or equal increments not equal to 1
c
      ix = 1
      iy = 1
      if(incx.lt.0)ix = (-n+1)*incx + 1
      if(incy.lt.0)iy = (-n+1)*incy + 1
      do 10 i = 1,n
	dtemp = qprod(c,dx(ix)) + qprod(s,dy(iy))
	dy(iy) = qprod(c,dy(iy)) - qprod(s,dx(ix))
        dx(ix) = dtemp
        ix = ix + incx
        iy = iy + incy
   10 continue    
      return
c
c     code for both increments equal to 1
c
   20 do 30 i = 1,n
	dtemp = qprod(c,dx(i)) + qprod(s,dy(i))
	dy(i) = qprod(c,dy(i)) - qprod(s,dx(i))
        dx(i) = dtemp
   30 continue
      return
      end

This version doesn't get exactly IBM's result, since it computes both
products exactly, and doesn't round until after the add or subtract.
If you want to get IBM's result, change one qprod in each line to a
straight multiply (see .1 to pick which one).
1300.5NNTPD::"[email protected]"oferThu May 15 1997 13:3953
The only form I get it right which is rather weird is:

real*16=real*16 +real*8
real*16=real*16 -real*8

Don't understand it but it works:

    subroutine drot(n,dx,incx,dy,incy,c,s)
      implicit double precision (a-h,o-z), integer (i-n)
      real*8 dtemp
      dimension dx(*),dy(*)
      real*16 dx1(n),dy1(n),c1,s1,dtemp1,ttx2,tty2

      if(n.le.0)return
      c1=c
      s1=s
      do i=1,n
         dy1(i)=dy(i)
         dx1(i)=dx(i)
      enddo

      if(incx.eq.1.and.incy.eq.1)go to 20
c
c     code for unequal increments or equal increments not equal to 1
c
      ix = 1
      iy = 1
      if(incx.lt.0)ix = (-n+1)*incx + 1
      if(incy.lt.0)iy = (-n+1)*incy + 1
      
c
c     code for both increments equal to 1
c
c Note ttx2 and tty2 are real*16, ttx1 and tty1 are real*8
     20 do 30 i = 1,n
        ttx1=s1*dy1(i)
        ttx2=c1*dx1(i) 
        dtemp1 = ttx2 + ttx1
        tty1=s1*dx1(i)
        tty2=c1*dy1(i) 
        dy1(i) = tty2 -tty1
        dx1(i) = dtemp1
   30 continue
      do i=1,n
         dy(i)=dy1(i)
         dx(i)=dx1(i)
      enddo
      return
      end



[Posted by WWW Notes gateway]
1300.6Simpler versionWIBBIN::NOYCEPulling weeds, pickin' stonesThu May 15 1997 14:5366
You don't need to copy dx and dy to REAL*16 arrays to match IBM's result.
The program in .4 gives more accurate results, but if you want to match
IBM's result nore exactly, try this version.  It rounds one product, but
not the other one.

      implicit double precision (a-h,o-z)
      dimension dx(1),dy(1)
      n=1
      incx=1
      incy=1

c      c=0.894464091309414289D+00
c      s=0.447139787267945732D+00
c      dx(1)=0.799596453591710277D+00
c      dy(1)=0.399715753413653230D+00

      c=0.894464769321342223D+00
      s=0.447138430961730338D+00
      dx(1)=0.799591744432892604D+00
      dy(1)=0.399711883886655517D+00

      write(6,'(4(2x,D25.18))') c,s,dx(1),dy(1)
      call drot(n,dx,incx,dy,incy,c,s)
      write(6,'(2(2x,D25.18))') dx(1),dy(1)
      stop
      end
      subroutine drot(n,dx,incx,dy,incy,c,s)
c
      implicit double precision (a-h,o-z), integer (i-n)
      dimension dx(*),dy(*)
c
c     applies a plane rotation.
c           dx(i) =  c*dx(i) + s*dy(i)
c           dy(i) = -s*dx(i) + c*dy(i)
c     jack dongarra, linpack, 3/11/78.
c
      real*16 qprod
      qprod(qa,qb) = qext(qa)*qext(qb)

      if(n.le.0)return
      if(incx.eq.1.and.incy.eq.1)go to 20
c
c     code for unequal increments or equal increments not equal to 1
c
      ix = 1
      iy = 1
      if(incx.lt.0)ix = (-n+1)*incx + 1
      if(incy.lt.0)iy = (-n+1)*incy + 1
      do 10 i = 1,n
	dtemp = qprod(c,dx(ix)) + s*dy(iy)
	dy(iy) = qprod(c,dy(iy)) - s*dx(ix)
        dx(ix) = dtemp
        ix = ix + incx
        iy = iy + incy
   10 continue    
      return
c
c     code for both increments equal to 1
c
   20 do 30 i = 1,n
	dtemp = qprod(c,dx(i)) + s*dy(i)
	dy(i) = qprod(c,dy(i)) - s*dx(i)
        dx(i) = dtemp
   30 continue
      return
      end
1300.7HPCGRP::MANLEYThu May 15 1997 19:326
> The only form I get it right which is rather weird is:

  Please define "right". Do you think IBM is "right", even though their
  IEEE results differ from everyone else's IEEE results?

1300.8right..you are rightNNTPD::"[email protected]"oferFri May 16 1997 05:2011
Dwight,

Of-course... the only "right" is that you are right on this issue. See , 
the customer has several IBM machines and one TL8400, since he developed his
progarm
on IBM , he ask for IBM compatible results. I agree it has nothing to do 
with rightness. 
Anyway, Bill, Bob and Dwight thanks for the help.

	Ofer.
[Posted by WWW Notes gateway]