Last active
June 19, 2021 21:06
-
-
Save rygorous/f5e11f6589088f42dddd2711582afb1e to your computer and use it in GitHub Desktop.
Single-precision FMA using double-precision mul/add ops
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We agreed beforehand that we don't care about tininess-before-or-after-rounding shenanigans. :) | |
With that out of the way, the idea is to do something like this: | |
fma(af, bf, cf): | |
# All arithmetic done with RN | |
ad = float_to_double(af) | |
bd = float_to_double(bf) | |
cd = float_to_double(cf) | |
# exact: double has sufficient exponent range and mantissa bits | |
prod = mul_double(ad, bd) | |
# not necessarily exact! | |
sum0 = add_double(prod, cd) | |
# round-to-odd fixup | |
# ultimately we just need to determine whether the sum above was | |
# exact or not | |
fixup0 = sub_double(sum0, prod) | |
if False: | |
# VERSION 1 (would be nicer, but alas... it doesn't work) | |
# take prod = 1.0, cd = 2^54 for a counterexample | |
fixup1 = cmpneq_double(fixup0, cd) | |
else: | |
# VERSION 2 (more work but this one I'm pretty confident about) | |
# this is the rest of two-sum(prod, cd) | |
err1 = sub_double(sum0, fixup0) | |
err2 = sub_double(prod, err1) | |
err3 = sub_double(cd, fixup0) | |
err4 = add_double(err2, err3) | |
# err4 is now the exact error in the sum above | |
# check whether it's nonzero | |
fixup1 = cmpneq_double(err4, 0.0) | |
fixup2 = bitwise_and(fixup1, 1) | |
sum1 = bitwise_or(sum0, fixup2) | |
# round result to float | |
# round-to-odd above makes double rounding give the correct result | |
return convert_to_float(sum1) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment