| I will try to discuss the `precision' problem here. So if you are
interested only in a direct answer to your question, I doubt you will find it
here.
As you know, a floating-point number is made of two parts: a mantissa
and an exponent.
The mantissa is the set of significant digits that represent the number.
Depending on the base chosen (2, 10, ...) there are rational numbers
that can be represented exactly only as a fraction, because their digit
expansion in that base never terminates.
E.g. 1/3. You cannot write this number exactly in any other way, unless you
choose a base in which the expansion terminates (3 in this case, or any
power of 3). Then (1/3) decimal = (0.1) base 3.
Some numbers are easily represented in base 10 and cannot be represented
exactly in base 2 (e.g. 0.1 decimal).
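To make the point about bases concrete, here is a minimal sketch in C that writes out the fractional digits of num/den in a chosen base by repeated multiplication. The function name expand_fraction and its parameters are my own invention for illustration, and it assumes 0 < num < den, a base from 2 to 10, and small enough values that rem * base does not overflow.

```c
#include <stdio.h>

/* Write up to max_digits fractional digits of num/den in the given base
   into out (which must hold max_digits + 1 chars).
   Assumes 0 < num < den and 2 <= base <= 10.
   Returns 1 if the expansion terminated, 0 if digits were still left over. */
int expand_fraction(unsigned num, unsigned den, unsigned base,
                    char *out, int max_digits)
{
    int i = 0;
    unsigned rem = num % den;

    while (rem != 0 && i < max_digits) {
        rem *= base;                        /* shift one digit to the left */
        out[i++] = (char)('0' + rem / den); /* integer part is the next digit */
        rem %= den;                         /* keep only the fractional part */
    }
    out[i] = '\0';
    return rem == 0;
}
```

With a buffer char buf[32], the call expand_fraction(1, 3, 3, buf, 20) returns 1 and leaves "1" in buf, i.e. (0.1) base 3. The call expand_fraction(1, 10, 2, buf, 12) returns 0 and leaves "000110011001", the start of a pattern that repeats forever, which is why 0.1 decimal has no exact binary form.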
A system is `precise' if the representation of the number is close to the real
value up to the very last digit. How `accurate' it is therefore depends on the
number of digits that are present in the mantissa. Consequently:
85.000 = all the numbers between 84.9995 and 85.0005
84.499 = all the numbers between 84.4985 and 84.4995
if you introduce a rounding factor of .5 on the last (not shown) digit.
But if you truncate the number (which normally happens) then
84.499 = all the numbers from 84.499 up to 84.500-delta (delta as small as you like).
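The difference between rounding and truncating can be sketched with plain integers. In this illustration (the function names are hypothetical, and I assume non-negative values held as a count of ten-thousandths, so 849995 stands for 84.9995) the last digit is dropped either after adding the .5 rounding factor or without it.

```c
/* Values are non-negative counts of ten-thousandths: 849995 means 84.9995.
   Results are counts of thousandths: 85000 means 85.000. */

/* Drop the last digit after adding half a unit of the kept digit. */
long to_thousandths_rounded(long ten_thousandths)
{
    return (ten_thousandths + 5) / 10;
}

/* Drop the last digit with no correction (truncation). */
long to_thousandths_truncated(long ten_thousandths)
{
    return ten_thousandths / 10;
}
```

With rounding, 84.9995 becomes 85.000 while 84.9994 becomes 84.999. With truncation, everything from 84.4990 through 84.4999 becomes 84.499, so the stored value can sit almost a full unit of the last digit below the real one.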
In numeric computation you have to choose. Either you keep track of how
accurate and precise your computation is, by extending the number of digits
(using single or double precision for different operations), or you
transform everything into integers and work that way. In this case you
must keep track of the exponent yourself. The latter method is a lot faster
to compute and easy to implement. For a good discussion of it, please refer
to Leo Brodie's book "Introduction to Forth", in which he discusses
whether it is necessary to introduce floating point in Forth or not.
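As a rough sketch of the integer approach, here is scaled arithmetic in C where the "exponent" is a fixed scale that the programmer carries in his head. SCALE, fx_add and fx_mul are names I made up for the example; it assumes non-negative operands, three decimal places, and a compiler that has long long for the wide intermediate product.

```c
/* Every value is an integer count of thousandths: 84499 means 84.499.
   The scale plays the role of the exponent and is tracked by hand. */
#define SCALE 1000L

/* Addition needs no adjustment: both operands share the same scale. */
long fx_add(long a, long b)
{
    return a + b;
}

/* A product carries the scale twice, so divide it out once.
   The wide intermediate avoids overflow; SCALE / 2 rounds the last digit.
   Assumes a and b are non-negative. */
long fx_mul(long a, long b)
{
    long long wide = (long long)a * b;
    return (long)((wide + SCALE / 2) / SCALE);
}
```

For instance fx_add(100, 200) gives 300, so 0.1 + 0.2 is exactly 0.3 here, and fx_mul(85000, 84499) gives 7182415, i.e. 85.000 * 84.499 = 7182.415. The price is that the range and the number of decimals are fixed in advance.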
Summing up: any conversion may lose a bit of information, hence do as few
conversions as you can!
|
| Thanks Marco, I tried integer computation, but still didn't get the
level of precision I needed.
After organizing the upgrade documentation to the Manx C compiler, I
found that compiling with the "+fi" option and linking in the mt.lib provides
IEEE double-precision floating point (8 bytes), about 16 significant digits.
It may be slow, but this is a calculator simulation, so it's fine for this
purpose.
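For anyone who wants to check what their own compiler's double provides, a small sketch using the standard <float.h> macros follows. The function name double_is_ieee64 is hypothetical, and it assumes a compiler that ships <float.h>, which an older one may not.

```c
#include <float.h>
#include <stdio.h>

/* Report the size and precision of double on this compiler.
   Returns 1 if it looks like an 8-byte IEEE double (53-bit mantissa). */
int double_is_ieee64(void)
{
    printf("sizeof(double) = %u bytes\n", (unsigned)sizeof(double));
    printf("DBL_MANT_DIG   = %d bits\n", DBL_MANT_DIG);
    printf("DBL_DIG        = %d decimal digits\n", DBL_DIG);
    return sizeof(double) == 8 && DBL_MANT_DIG == 53;
}
```

A 53-bit mantissa works out to just under 16 decimal digits, of which DBL_DIG (15 for an IEEE double) are guaranteed to survive a trip through decimal text and back.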
Thanks again,
John O.
|