[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]
Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613
1926.0. "Kahan Summation Summary" by WRKSYS::ROTH (Geometry is the real life!) Thu Dec 29 1994 17:30

Newsgroups: sci.math.num-analysis
Subject: Re: emulation of double-precision using trick by D.Knuth ?
Date: 13 Sep 1994 16:04:56 GMT
Organization: University of Wuppertal
 
Since I was asked to post responses to my question, I quote Douglas 
Priest's mail below. He provided me with an excellent bibliography as 
well as some C++ source-code. Thanks a lot to everybody for helping me 
with this problem, especially Douglas Priest, Fred Tydeman, Wayne Hayes 
and Norbert Juffa who sent very valuable references and suggestions!
 
With best regards
 
Uwe Glaessner
Physics Department
University of Wuppertal
 
--------------------------------------------------------------------------
 
[email protected] wrote:
 
 
Here is a C++ implementation of simulated double precision (what W. Kahan
calls "doubled-precision" or "DP") versions of addition, multiplication,
and square root using ordinary double precision arithmetic as a basis.
Note that the addition algorithm shown below is accurate to nearly twice
the underlying precision even when extreme cancellation occurs; this is
not true of the algorithm given by Dekker in his 1971 paper (see below).
 
 
--------- dp.C ---------
 
class	longdouble
{
public:
	double	hi, lo;
};
 
/* Try both of the following two versions of operator+; define BRANCH
   if the first one is faster. */
#ifdef BRANCH
 
/*
*	This add routine appears in some notes Kahan wrote in 1989.
*	Kahan claims this DP add is accurate to within a little more
*	than one ulp in the tail of the sum.
*
*	(C) W. Kahan 1989
*
*	NOTICE:
*	Copyrighted programs may not be translated, used, nor
*	reproduced without the author's permission.  Normally that
*	permission is granted freely for academic and scientific
*	purposes subject to the following three requirements:
*	1. This NOTICE and the copyright notices must remain attached
*	   to the programs and their translations.
*	2. Users of such a program should regard themselves as voluntary
*	   participants in the author's researches and so are obliged
*	   to report their experience with the program back to the author.
*	3. Neither the author nor the institution that employs him will
*	   be held responsible for the consequences of using a program
*	   for which neither has received payment.
*	Would-be commercial users of these programs must correspond
*	with the author to arrange terms of payment and warranty.
*/
 
longdouble	operator +( longdouble x, longdouble y )
{
	longdouble	z;
	double		H, h, T, t, S, s;
 
	S = x.hi + y.hi;
	s = ( ( fabs( x.hi ) < fabs( y.hi ) )? x.hi + ( y.hi - S ) :
		y.hi + ( x.hi - S ) );
	T = x.lo + y.lo;
	t = ( ( fabs( x.lo ) < fabs( y.lo ) )? x.lo + ( y.lo - T ) :
		y.lo + ( x.lo - T ) );
	H = S + ( s + T );
	h = ( s + T ) + ( S - H );
	z.hi = H + ( t + h );
	z.lo = ( t + h ) + ( H - z.hi );
	return z;
}
 
#else
 
/*
*	Here I've replaced the magnitude-swap, 3-add method of
*	computing the roundoff of a sum with Knuth's branchless,
*	6-add method.  The code is otherwise the same as that
*	above.  On some machines, this version will be faster.
*/
 
longdouble	operator +( longdouble x, longdouble y )
{
	longdouble	z;
	double		H, h, T, t, S, s, e, f;
 
	S = x.hi + y.hi;
	T = x.lo + y.lo;
	e = S - x.hi;
	f = T - x.lo;
	s = ( y.hi - e ) + ( x.hi - ( S - e ) );
	t = ( y.lo - f ) + ( x.lo - ( T - f ) );
	H = S + ( s + T );
	h = ( s + T ) + ( S - H );
	z.hi = H + ( t + h );
	z.lo = ( t + h ) + ( H - z.hi );
	return z;
}
 
#endif
 
/*
*	The following algorithms appear in Dekker, T., A Floating
*	Point Technique for Extending the Available Precision,
*	Numer. Math. 18 (1971), pp. 224-242.
*/
 
#define Split	134217729.0	/* 2^27 + 1, appropriate for IEEE double */
 
longdouble	operator *( longdouble x, longdouble y )
{
	longdouble	z;
	double		hx, tx, hy, ty, C, c;
 
	C = Split * x.hi;
	hx = C - ( C - x.hi );
	tx = x.hi - hx;
	c = Split * y.hi;
	hy = c - ( c - y.hi );
	ty = y.hi - hy;
	C = x.hi * y.hi;
	c = ( ( ( ( hx * hy - C ) + hx * ty ) + tx * hy ) + tx * ty )
	    + ( x.hi * y.lo + x.lo + y.hi );
	z.hi = C + c;
	z.lo = c - ( C - z.hi );
	return z;
}
 
longdouble	operator /( longdouble x, longdouble y )
{
	longdouble	z;
	double		hc, tc, hy, ty, C, c, U, u;
 
	C = x.hi / y.hi;
	c = Split * C;
	hc = c - ( c - C );
	tc = C - hc;
	u = Split * y.hi;
	hy = u - ( u - y.hi );
	ty = y.hi - hy;
	U = C * y.hi;
	u = ( ( ( hc * hy - U ) + hc * ty ) + tc * hy ) + tc * ty;
	c = ( ( ( x.hi - U ) - u ) + x.lo - C * y.lo ) / y.hi;
	z.hi = C + c;
	z.lo = c - ( C - z.hi );
	return z;
}
 
longdouble	sqrtl( longdouble x )
{
	longdouble	z;
	double		hc, tc, C, c, U, u;
 
	C = sqrt( x.hi );
	c = Split * C;
	hc = c - ( c - C );
	tc = C - hc;
	U = C * C;
	u = ( ( hc * hc - U ) + hc * tc + hc * tc ) + tc * tc;
	c = ( ( ( x.hi - U ) - u ) + x.lo ) / ( C + C );
	z.hi = C + c;
	z.lo = c - ( C - z.hi );
	return z;
}
 
--------------------------
 
Here is a brief annotated bibliography of papers dealing with DP and
similar techniques, arranged chronologically.
 
Kahan, W., Further Remarks on Reducing Truncation Errors,
  {\it Comm.\ ACM\/} {\bf 8} (1965), 40.
 
M{\o}ller, O., Quasi Double Precision in Floating-Point Addition,
  {\it BIT\/} {\bf 5} (1965), 37--50.
 
  The two papers that first presented the idea of recovering the
  roundoff of a sum.
 
Dekker, T., A Floating-Point Technique for Extending the Available
  Precision, {\it Numer.\ Math.} {\bf 18} (1971), 224--242.
 
  The classic reference for DP algorithms for sum, product, quotient,
  and square root.
 
Pichat, M., Correction d'une Somme en Arithmetique \`a Virgule
  Flottante, {\it Numer.\ Math.} {\bf 19} (1972), 400--406.
 
  An iterative algorithm for computing a protracted sum to working
  precision by repeatedly applying the sum-and-roundoff method.
 
Linnainmaa, S., Analysis of Some Known Methods of Improving the Accuracy
  of Floating-Point Sums, {\it BIT\/} {\bf 14} (1974), 167--202.
 
  Comparison of Kahan and M{\o}ller algorithms with variations given
  by Knuth.
 
Bohlender, G., Floating-Point Computation of Functions with Maximum
  Accuracy, {\it IEEE Trans.\ Comput.} {\bf C-26} (1977), 621--632.
 
  Extended the analysis of Pichat's algorithm to compute a multi-word
  representation of the exact sum of n working precision numbers.
  This is the algorithm Kahan has called "distillation".
 
Linnainmaa, S., Software for Doubled-Precision Floating-Point Computations,
  {\it ACM Trans.\ Math.\ Soft.} {\bf 7} (1981), 272--283.
 
  Generalized the hypotheses of Dekker and showed how to take advantage
  of extended precision where available.
 
Leuprecht, H., and W.~Oberaigner, Parallel Algorithms for the Rounding-Exact
  Summation of Floating-Point Numbers, {\it Computing} {\bf 28} (1982), 89--104.
 
  Variations of distillation appropriate for parallel and vector
  architectures.
 
Kahan, W., Paradoxes in Concepts of Accuracy, lecture notes from Joint
  Seminar on Issues and Directions in Scientific Computation, Berkeley, 1989.
 
  Gives the more accurate DP sum I've shown above, discusses some
  examples.
 
Priest, D., Algorithms for Arbitrary Precision Floating Point Arithmetic,
  in P.~Kornerup and D.~Matula, Eds., {\it Proc.\ 10th Symposium on Com-
  puter Arithmetic}, IEEE Computer Society Press, Los Alamitos, Calif., 1991.
 
  Extends from DP to arbitrary precision; gives portable algorithms and
  general proofs.
 
Sorensen, D., and P.~Tang, On the Orthogonality of Eigenvectors Computed
  by Divide-and-Conquer Techniques, {\it SIAM J.\ Num.\ Anal.} {\bf 28}
  (1991), 1752--1775.
 
  Uses some DP arithmetic to retain orthogonality of eigenvectors
  computed by a parallel divide-and-conquer scheme.
 
Priest, D., On Properties of Floating Point Arithmetics: Numerical Stability
  and the Cost of Accurate Computations, Ph.D. dissertation, University
  of California at Berkeley, 1992.
 
  More examples, organizes proofs in terms of common properties of fp
  addition/subtraction, gives other summation algorithms.
 
 
If you're interested, I can provide a Postscript copy of my ARITH10 paper.
My Ph.D. dissertation is available via anonymous ftp from
 
  ftp.icsi.berkeley.edu
 
If I can provide any additional information, please let me know.
 
Since I am interested in applications of these techniques, I would greatly
appreciate hearing of any work you do using them.  In particular, please
let me know if you publish a discussion of your application.
 
Regards,
 
Douglas M. Priest
SunSoft Floating Point and Performance Group
Sun Microsystems, Inc.
(but only speaking for myself)
T.R	Title	User	Personal Name	Date	Lines