128-Bit Arithmetic on DEC Platforms
Lynn Yarbrough 20-APR-1992
For 15 years at least, DEC has recognized the need to provide extended
precision arithmetic (up to 128 bits) on its systems and software
platforms. Recently, however, the market need for this capability has been
reassessed and our more recent systems and languages no longer provide 128-
bit capability. This situation has provoked a lot of discussion in several
DEC Notes conferences, notably the Alphanotes and FORTRAN conferences. My
purpose in writing this is to lay out the situation so that the DEC Math
community (including the Sales Support community) is fully aware of it.
Although the VAX architecture specifies how H-float operations are to be
performed, the implementations on various VAX models have differed
significantly. REAL*16 arithmetic, implemented as H-float, has been
available in hardware on certain VAXen, but more often than not it is
implemented by emulation. For example, H-float has been done in hardware on
the VAX 8650 and 8850 and early models of the VAX 9000, but not on any of
the 3000, 4000, or 6000 systems. Emulation is of necessity very much slower
- by factors of 40-80 - than hardware implementations.
This situation is somewhat mitigated by the fact that relatively few codes
actually NEED 128-bit arithmetic. On the other hand,
such programs are likely to be very important to the customer, and where
128-bit hardware is not available, the cost of running these programs may
become prohibitive. We have recently been involved in benchmark efforts in
which the lack of 128-bit hardware has caused our initial solutions to be
3-5 times as costly as existing solutions for this important class of
problems. In order to bid them we have had to come up with innovative
approaches to the 128-bit performance problem.
An important event in this history came with the most recent revision of
the VAX 9000 chipset. In this version, several changes were made to improve
the reliability of the 9000, at the cost of removing several H-float
instructions from the chip. Compounding the problem, the H-float emulator
on this class of 9000's behaved as though ALL of the H-float instructions
had been removed, emulating them even though they could have been executed
directly. Only recently has the emulator been corrected to execute H-float
instructions in the fastest available mode. In the interim, the most
cost-effective VAX for 128-bit applications was, I believe, the 8850, and
we have been forced into bidding that system for certain contracts where
the 9000 was too slow to do the job.
A few of our efforts to overcome the performance problems have been quite
successful. In one case we found that an IMSL library routine was
surreptitiously dropping into 128-bit mode (although the user specified 64-
bit precision) and negotiated with IMSL to provide a new routine which got
the same results using several 64-bit arithmetic operations in D-float
instead of emulating H-float. In another case, a careful numerical analysis
of the program showed that adequate results could be obtained reliably with
G-float arithmetic. Indeed, it appears that H-float is invariably used to
get more precision in the fractional part of results, not for its
extraordinary exponent range. We have found no user programs that need
decimal exponents beyond the roughly 300 that G-float already provides.
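This note does not record what the replacement IMSL routine actually did.
As one well-known illustration of the general idea (recovering lost
low-order bits with a few extra 64-bit operations rather than a wider
format), here is a minimal sketch of Neumaier's compensated summation. It
is written in Python, whose floats are IEEE 64-bit doubles rather than
D-float, and the function names and test data are illustrative only.

```python
def naive_sum(values):
    """Plain left-to-right summation in 64-bit arithmetic."""
    total = 0.0
    for v in values:
        total += v
    return total


def compensated_sum(values):
    """Neumaier's variant of Kahan summation.

    A second 64-bit variable accumulates the rounding error of each
    addition; it is folded back into the result at the end.
    """
    total = 0.0
    comp = 0.0  # running sum of the low-order bits lost so far
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            comp += (total - t) + v   # low-order bits of v were lost
        else:
            comp += (v - t) + total   # low-order bits of total were lost
        total = t
    return total + comp


# Illustrative data: the two 1.0 terms are absorbed by 1e100 in a naive sum.
data = [1.0, 1e100, 1.0, -1e100]
print(naive_sum(data), compensated_sum(data))
```

The cost is a handful of extra additions per term, which is far cheaper
than emulating every operation in a 128-bit format.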
The Alpha architecture is quite different from the VAX in its arithmetic
capabilities. While the IEEE standard S (32-bit) and T (64-bit) floats are
added to the F- and G-float repertoire, the extended F-, or D-, 64-bit
format appears in emulation only, and there is NO support for 128-bit
arithmetic in either the Alpha architecture or in any of the Alpha-
supported languages. Alpha FORTRAN, C, PASCAL, etc. do not acknowledge the
declaration of 128-bit REALs, nor provide any way of implementing 128-bit
arithmetic.
[It is worth noting parenthetically that our competition frequently does
worse in this regard than we. Some languages support REAL*16 declarations,
but never store anything other than zeroes in the least significant half!]
Migration of user programs, whether they come from our current installed
base or from elsewhere, now presents a couple of problems:
1) User programs that use D-float will run slower than with G-float. Simple
replacement of D with G is recommended but will not work in certain
situations:
   o Calling routines that expect F-float with D-float arguments works
     on VAX, since D is an extension of F; it will fail with G-float.
   o G-float has 3 fewer fraction bits than D-float; the loss of
     precision may be unacceptable.
   o F-float programs may be able to read D-float data files, but they
     cannot read G-float files.
2) Codes using 128-bit arithmetic will not even compile. If the VAX-Alpha
binary translators are used, the resulting code will run substantially
slower on Alpha than on, e.g., 8650's. Replacement of 128-bit by G-format
will frequently fail due to loss of precision.
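To give a feel for the precision differences behind both problems, the
following sketch rounds an exact rational value to the fraction width of
each VAX format and reports the relative error. It uses exact rational
arithmetic in Python, not real VAX arithmetic. The widths (counting the
hidden bit) are the commonly documented ones; the helper names are
hypothetical, and ties round to even here, which may differ from VAX
rounding.

```python
from fractions import Fraction

# Fraction widths in bits, including the hidden leading bit (assumed values).
FRACTION_BITS = {"F": 24, "D": 56, "G": 53, "H": 113}


def round_to_bits(value, bits):
    """Round a positive Fraction to `bits` significant binary digits."""
    scaled = Fraction(value)
    shift = 0
    # Normalise so that 2**(bits-1) <= scaled < 2**bits.
    while scaled >= 2 ** bits:
        scaled /= 2
        shift -= 1
    while scaled < 2 ** (bits - 1):
        scaled *= 2
        shift += 1
    return Fraction(round(scaled)) * Fraction(2) ** (-shift)


def relative_error(value, bits):
    """Exact relative error made by rounding `value` to `bits` bits."""
    return abs(round_to_bits(value, bits) - value) / value


x = Fraction(1, 3)
errors = {name: relative_error(x, bits) for name, bits in FRACTION_BITS.items()}
for name, err in errors.items():
    print(name, float(err))
```

For this particular value the G-float error comes out 8 times the D-float
error, which is the factor of 2**3 one would expect from three fewer
fraction bits. Other values will show other ratios, bounded by the same
widths.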
Some alternative plans to offer some kind of 128-bit support on Alpha have
been suggested. None of those that require the Alpha languages to accept
128-bit declarations could be implemented in the near term, when the
majority of VAX-Alpha migrations are likely to take place. In any event the
implementation will be slow. It seems most efficient to forgo H-float
entirely in favor of double-G (as opposed to extended G) format, since
algorithms already exist to take advantage of the fact that each half is a
viable Float.
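As an illustration of the kind of algorithm meant here, the following is a
minimal sketch of addition on values held as a pair of ordinary 64-bit
floats (high part plus low part), built on Knuth's error-free two-sum. It
is written in Python, whose floats are IEEE doubles standing in for
G-float, and it uses the simple ("sloppy") form of the addition. The names
two_sum, dd_add and dd_exact are illustrative, not those of any DEC
library.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, e) with s = round(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-length values, each a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in the low parts
    return two_sum(s, e)         # renormalise into a new (hi, lo) pair


def dd_exact(x):
    """Exact rational value of a (hi, lo) pair, for checking only."""
    return Fraction(x[0]) + Fraction(x[1])


one = (1.0, 0.0)
tiny = (2.0 ** -80, 0.0)
total = dd_add(one, tiny)

# A single 64-bit float cannot hold 1 + 2**-80; the pair can.
print(1.0 + 2.0 ** -80 == 1.0, dd_exact(total) == 1 + Fraction(2) ** -80)
```

Because each half is an ordinary float, every step runs in hardware
64-bit arithmetic. The price is several operations per addition and no
gain in exponent range.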
We foresee a substantial effort in migrating existing programs from other
architectures to Alpha (or even, in the short term, to VAX 9000's), for
those applications from the Scientific community that require high
precision. This is a critical application space for DEC. It is time now to
start assembling the tools, methods, and skills to overcome the problem.
The problem is already well-defined; solutions will come with training and
determination. We can already do the following:
1) Locate programs that use D- and H-float arithmetic and begin migrating
them to G-float where feasible. Perform regression tests to investigate the
effects of different precision calculations. Run performance tests to
evaluate the cost difference of running the programs.
2) In the event this process fails, or produces higher-cost solutions,
evaluate the market impact of this failure and escalate the issue to upper
management.
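The regression testing in step 1 can be sketched roughly as follows,
assuming the results of the original run and of the migrated run have both
been captured as lists of numbers. The function names and the sample
threshold are hypothetical; a real acceptance threshold would come from a
numerical analysis of the application.

```python
import math


def agreeing_bits(reference, candidate):
    """Approximate count of leading binary digits on which two results agree."""
    if reference == candidate:
        return math.inf
    scale = max(abs(reference), abs(candidate))
    return -math.log2(abs(reference - candidate) / scale)


def regression_report(reference_results, candidate_results, required_bits):
    """List (index, bits) for each result agreeing to fewer than required_bits."""
    if len(reference_results) != len(candidate_results):
        raise ValueError("result sets differ in length")
    failures = []
    for i, (ref, cand) in enumerate(zip(reference_results, candidate_results)):
        bits = agreeing_bits(ref, cand)
        if bits < required_bits:
            failures.append((i, bits))
    return failures


# Made-up sample data: the second result differs in about the 30th bit.
reference = [1.0, 2.0, 3.0]
candidate = [1.0, 2.0 * (1 + 2.0 ** -30), 3.0]
print(regression_report(reference, candidate, required_bits=40))
```

Reporting agreement in bits rather than as a pass/fail flag makes it
easier to see how close a failing result is to being acceptable.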