[Search for users]
[Overall Top Noters]
[List of all Conferences]
[Download this site]
Title: | Mathematics at DEC |
|
Moderator: | RUSURE::EDP |
|
Created: | Mon Feb 03 1986 |
Last Modified: | Fri Jun 06 1997 |
Last Successful Update: | Fri Jun 06 1997 |
Number of topics: | 2083 |
Total number of notes: | 14613 |
1926.0. "Kahan Summation Summary" by WRKSYS::ROTH (Geometry is the real life!) Thu Dec 29 1994 17:30
Newsgroups: sci.math.num-analysis
Subject: Re: emulation of double-precision using trick by D.Knuth ?
Date: 13 Sep 1994 16:04:56 GMT
Organization: University of Wuppertal
Since I was asked to post responses to my question, I quote Douglas
Priest's mail below. He provided me with an excellent bibliography as
well as some C++ source-code. Thanks a lot to everybody for helping me
with this problem, especially Douglas Priest, Fred Tydeman, Wayne Hayes
and Norbert Juffa who sent very valuable references and suggestions!
With best regards
Uwe Glaessner
Physics Department
University of Wuppertal
--------------------------------------------------------------------------
[email protected] wrote:
Here is a C++ implementation of simulated double precision (what W. Kahan
calls "doubled-precision" or "DP") versions of addition, multiplication,
and square root using ordinary double precision arithmetic as a basis.
Note that the addition algorithm shown below is accurate to nearly twice
the underlying precision even when extreme cancellation occurs; this is
not true of the algorithm given by Dekker in his 1971 paper (see below).
--------- dp.C ---------
class longdouble
{
public:
double hi, lo;
};
/* Try both of the following two versions of operator+; define BRANCH
if the first one is faster. */
#ifdef BRANCH
/*
* This add routine appears in some notes Kahan wrote in 1989.
* Kahan claims this DP add is accurate to within a little more
* than one ulp in the tail of the sum.
*
* (C) W. Kahan 1989
*
* NOTICE:
* Copyrighted programs may not be translated, used, nor
* reproduced without the author's permission. Normally that
* permission is granted freely for academic and scientific
* purposes subject to the following three requirements:
* 1. This NOTICE and the copyright notices must remain attached
* to the programs and their translations.
* 2. Users of such a program should regard themselves as voluntary
* participants in the author's researches and so are obliged
* to report their experience with the program back to the author.
* 3. Neither the author nor the institution that employs him will
* be held responsible for the consequences of using a program
* for which neither has received payment.
* Would-be commercial users of these programs must correspond
* with the author to arrange terms of payment and warranty.
*/
longdouble operator +( longdouble x, longdouble y )
{
longdouble z;
double H, h, T, t, S, s;
S = x.hi + y.hi;
s = ( ( fabs( x.hi ) < fabs( y.hi ) )? x.hi + ( y.hi - S ) :
y.hi + ( x.hi - S ) );
T = x.lo + y.lo;
t = ( ( fabs( x.lo ) < fabs( y.lo ) )? x.lo + ( y.lo - T ) :
y.lo + ( x.lo - T ) );
H = S + ( s + T );
h = ( s + T ) + ( S - H );
z.hi = H + ( t + h );
z.lo = ( t + h ) + ( H - z.hi );
return z;
}
#else
/*
* Here I've replaced the magnitude-swap, 3-add method of
* computing the roundoff of a sum with Knuth's branchless,
* 6-add method. The code is otherwise the same as that
* above. On some machines, this version will be faster.
*/
longdouble operator +( longdouble x, longdouble y )
{
longdouble z;
double H, h, T, t, S, s, e, f;
S = x.hi + y.hi;
T = x.lo + y.lo;
e = S - x.hi;
f = T - x.lo;
s = ( y.hi - e ) + ( x.hi - ( S - e ) );
t = ( y.lo - f ) + ( x.lo - ( T - f ) );
H = S + ( s + T );
h = ( s + T ) + ( S - H );
z.hi = H + ( t + h );
z.lo = ( t + h ) + ( H - z.hi );
return z;
}
#endif
/*
* The following algorithms appear in Dekker, T., A Floating
* Point Technique for Extending the Available Precision,
* Numer. Math. 18 (1971), pp. 224-242.
*/
#define Split 134217729.0 /* 2^27 + 1, appropriate for IEEE double */
longdouble operator *( longdouble x, longdouble y )
{
longdouble z;
double hx, tx, hy, ty, C, c;
C = Split * x.hi;
hx = C - ( C - x.hi );
tx = x.hi - hx;
c = Split * y.hi;
hy = c - ( c - y.hi );
ty = y.hi - hy;
C = x.hi * y.hi;
c = ( ( ( ( hx * hy - C ) + hx * ty ) + tx * hy ) + tx * ty )
+ ( x.hi * y.lo + x.lo + y.hi );
z.hi = C + c;
z.lo = c - ( C - z.hi );
return z;
}
longdouble operator /( longdouble x, longdouble y )
{
longdouble z;
double hc, tc, hy, ty, C, c, U, u;
C = x.hi / y.hi;
c = Split * C;
hc = c - ( c - C );
tc = C - hc;
u = Split * y.hi;
hy = u - ( u - y.hi );
ty = y.hi - hy;
U = C * y.hi;
u = ( ( ( hc * hy - U ) + hc * ty ) + tc * hy ) + tc * ty;
c = ( ( ( x.hi - U ) - u ) + x.lo - C * y.lo ) / y.hi;
z.hi = C + c;
z.lo = c - ( C - z.hi );
return z;
}
longdouble sqrtl( longdouble x )
{
longdouble z;
double hc, tc, C, c, U, u;
C = sqrt( x.hi );
c = Split * C;
hc = c - ( c - C );
tc = C - hc;
U = C * C;
u = ( ( hc * hc - U ) + hc * tc + hc * tc ) + tc * tc;
c = ( ( ( x.hi - U ) - u ) + x.lo ) / ( C + C );
z.hi = C + c;
z.lo = c - ( C - z.hi );
return z;
}
--------------------------
Here is a brief annotated bibliography of papers dealing with DP and
similar techniques, arranged chronologically.
Kahan, W., Further Remarks on Reducing Truncation Errors,
{\it Comm.\ ACM\/} {\bf 8} (1965), 40.
M{\o}ller, O., Quasi Double Precision in Floating-Point Addition,
{\it BIT\/} {\bf 5} (1965), 37--50.
The two papers that first presented the idea of recovering the
roundoff of a sum.
Dekker, T., A Floating-Point Technique for Extending the Available
Precision, {\it Numer.\ Math.} {\bf 18} (1971), 224--242.
The classic reference for DP algorithms for sum, product, quotient,
and square root.
Pichat, M., Correction d'une Somme en Arithmetique \`a Virgule
Flottante, {\it Numer.\ Math.} {\bf 19} (1972), 400--406.
An iterative algorithm for computing a protracted sum to working
precision by repeatedly applying the sum-and-roundoff method.
Linnainmaa, S., Analysis of Some Known Methods of Improving the Accuracy
of Floating-Point Sums, {\it BIT\/} {\bf 14} (1974), 167--202.
Comparison of Kahan and M{\o}ller algorithms with variations given
by Knuth.
Bohlender, G., Floating-Point Computation of Functions with Maximum
Accuracy, {\it IEEE Trans.\ Comput.} {\bf C-26} (1977), 621--632.
Extended the analysis of Pichat's algorithm to compute a multi-word
representation of the exact sum of n working precision numbers.
This is the algorithm Kahan has called "distillation".
Linnainmaa, S., Software for Doubled-Precision Floating-Point Computations,
{\it ACM Trans.\ Math.\ Soft.} {\bf 7} (1981), 272--283.
Generalized the hypotheses of Dekker and showed how to take advantage
of extended precision where available.
Leuprecht, H., and W.~Oberaigner, Parallel Algorithms for the Rounding-Exact
Summation of Floating-Point Numbers, {\it Computing} {\bf 28} (1982), 89--104.
Variations of distillation appropriate for parallel and vector
architectures.
Kahan, W., Paradoxes in Concepts of Accuracy, lecture notes from Joint
Seminar on Issues and Directions in Scientific Computation, Berkeley, 1989.
Gives the more accurate DP sum I've shown above, discusses some
examples.
Priest, D., Algorithms for Arbitrary Precision Floating Point Arithmetic,
in P.~Kornerup and D.~Matula, Eds., {\it Proc.\ 10th Symposium on Com-
puter Arithmetic}, IEEE Computer Society Press, Los Alamitos, Calif., 1991.
Extends from DP to arbitrary precision; gives portable algorithms and
general proofs.
Sorensen, D., and P.~Tang, On the Orthogonality of Eigenvectors Computed
by Divide-and-Conquer Techniques, {\it SIAM J.\ Num.\ Anal.} {\bf 28}
(1991), 1752--1775.
Uses some DP arithmetic to retain orthogonality of eigenvectors
computed by a parallel divide-and-conquer scheme.
Priest, D., On Properties of Floating Point Arithmetics: Numerical Stability
and the Cost of Accurate Computations, Ph.D. dissertation, University
of California at Berkeley, 1992.
More examples, organizes proofs in terms of common properties of fp
addition/subtraction, gives other summation algorithms.
If you're interested, I can provide a Postscript copy of my ARITH10 paper.
My Ph.D. dissertation is available via anonymous ftp from
ftp.icsi.berkeley.edu
If I can provide any additional information, please let me know.
Since I am interested in applications of these techniques, I would greatly
appreciate hearing of any work you do using them. In particular, please
let me know if you publish a discussion of your application.
Regards,
Douglas M. Priest
SunSoft Floating Point and Performance Group
Sun Microsystems, Inc.
(but only speaking for myself)
T.R | Title | User | Personal Name | Date | Lines
|
---|