Arb mulhigh #1802

albinahlback · 2024-02-26T19:27:43Z

I don't think we should have different precisions on different systems (currently flint_mpn_mulhigh is only available on some x86-64 systems), but this is my very preliminary draft of how I want arb_mul to look.

Left to fix:

Make flint_mpn_mulhigh available on all systems.
Implement _arb_mul_special
Make sure that the mag has the correct bound -- arf_mul_rnd_sloppy rounds the inputs in order to be able to perform an n-by-n high multiplication. So one has to account for the rounding done before the multiplication as well as the fact that the multiplication itself is inexact.
Discuss whether we should account for the precision in the normal case (i.e. when arf_mul_rnd_sloppy is used) -- should we round the result or not?
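
To make the two error sources in the mag item concrete, here is a rough derivation. The notation is an assumption for illustration only: mantissas $a, b \in [1/2, 1)$, limb base $\beta = 2^{64}$, $\varepsilon = \beta^{-n}$, inputs truncated (not rounded to nearest) to $n$ limbs, and a high product $h$ that undershoots by at most $c\,\varepsilon$ for some implementation-dependent constant $c$.

$$
a = a' + \delta_a, \qquad b = b' + \delta_b, \qquad 0 \le \delta_a, \delta_b < \varepsilon
$$

$$
ab - a'b' = a'\delta_b + \delta_a b \in [0, 2\varepsilon) \quad \text{since } a' < 1,\ b < 1
$$

$$
0 \le a'b' - h \le c\,\varepsilon \quad \Longrightarrow \quad 0 \le ab - h < (c + 2)\,\varepsilon
$$

Under these assumptions the mag would have to absorb $(c + 2)\,\varepsilon$, scaled by the exponent of the product, plus whatever a final rounding of the result adds. If the inputs are rounded to nearest instead of truncated, the $\delta$ terms become two-sided and the bound has to be adapted.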

fredrik-johansson · 2024-02-27T16:37:49Z

Yeah, I wouldn't do this before having a generic C fallback for flint_mpn_mulhigh / flint_mpn_mulhigh_normalised and fixing their performance issues, so that these functions can be relied upon without conditional directives creeping into higher-level code.

It will certainly not work to just take the shorter of the input lengths as the precision in arf_mul_rnd_sloppy. In general, the inputs can be shorter or longer than the precision, or a mixture. A bit of logic is needed for this. It will even be optimal to zero-pad operands in some cases.

Note that it is also necessary to normalise the arf_t output by removing trailing zero limbs.
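
As a minimal sketch of that normalisation step, assuming GMP-style limb arrays with the least significant limb first, one could count the zero limbs at the low end and drop them. The helper name `count_low_zero_limbs` is hypothetical and not part of FLINT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a FLINT function).
   x[0] is the least significant limb of an n-limb mantissa.
   Returns how many zero limbs sit at the low end. */
size_t count_low_zero_limbs(const uint64_t *x, size_t n)
{
    size_t k = 0;

    assert(x != NULL || n == 0);

    while (k < n && x[k] == 0)
        k++;

    return k;
}
```

With `k = count_low_zero_limbs(x, n)`, the normalised mantissa would be the `n - k` limbs starting at `x + k`. If the mantissa is viewed as a fraction aligned at the top limb, dropping low zero limbs leaves the exponent unchanged.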

fredrik-johansson · 2024-02-28T10:14:54Z

I think this work would be aided by having a general flint_mpn_mulhigh which computes the top n (+ 1) limbs of an m x p product, m + p >= n (+ 1).

Such a method would be a bit complex, but it would offload complexity from the arf/arb side.
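
For reference, the semantics could be pinned down by a slow exact version that forms the full product and copies out the top limbs. This is only a sketch to test a fast implementation against: the names `ref_mul` and `ref_mulhigh_general` are hypothetical, and it uses 32-bit limbs so that the code stays portable C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t limb_t;   /* 32-bit limbs so that products fit in uint64_t */

/* Full schoolbook product: r[0 .. an+bn) = a * b. */
static void ref_mul(limb_t *r, const limb_t *a, size_t an,
                    const limb_t *b, size_t bn)
{
    size_t i, j;

    memset(r, 0, (an + bn) * sizeof(limb_t));

    for (i = 0; i < an; i++)
    {
        uint64_t carry = 0;

        for (j = 0; j < bn; j++)
        {
            /* cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1 */
            uint64_t t = (uint64_t) a[i] * b[j] + r[i + j] + carry;
            r[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        r[i + bn] = (limb_t) carry;
    }
}

/* Hypothetical reference: writes the top n limbs of the exact m x p
   product to res. Returns 0 on success, -1 if allocation fails. */
int ref_mulhigh_general(limb_t *res, size_t n,
                        const limb_t *a, size_t m,
                        const limb_t *b, size_t p)
{
    limb_t *t;

    assert(m >= 1 && p >= 1 && n <= m + p);

    t = malloc((m + p) * sizeof(limb_t));
    if (t == NULL)
        return -1;

    ref_mul(t, a, m, b, p);
    memcpy(res, t + (m + p - n), n * sizeof(limb_t));
    free(t);
    return 0;
}
```

A real implementation would avoid computing the low limbs at all and would return an approximation, so comparisons against this reference need to allow for the documented error.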

albinahlback · 2024-02-28T10:24:13Z

In general, isn't it true that an arb_t is of the form $x = a 2^{m} \pm b 2^{n}$ with $a, b, m, n \in \mathbb{Z}$ and $|b 2^{n}| > 2^{m}$? Or am I wrong in thinking that computing only a balanced high multiplication for unbalanced multiplicands will not give a much bigger error bound?

tthsqe12 · 2024-05-25T01:36:04Z

mulhi? I agree arbs would like this, but Newton's method likes mulmid more.
I have been experimenting with the function

ulong mpn_mulmid_approx(ulong* x, ulong xlo, ulong xhi, const ulong* a, ulong an, const ulong* b, ulong bn);

which requires space for all an+bn limbs in x but is only responsible for writing [xlo,xhi). If the return is e, then the true middle limbs lie somewhere inclusively between the produced limbs and the produced limbs + e. Usually the return is 0.
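
To illustrate the contract only (this is not the function above), here is a sketch of the special case xhi = an + bn with one guard limb. Everything in it is an assumption for illustration: the name `sketch_mulhigh_approx`, the 32-bit limbs, and the fact that it returns a worst-case a priori bound instead of a bound that is usually 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* 32-bit limbs so that products fit in uint64_t */

/* Hypothetical sketch (not part of FLINT). x needs room for an + bn limbs.
   On return x[xlo .. an+bn) holds the high limbs of a truncated product and
   the return value e satisfies: produced <= true high limbs <= produced + e.
   x[xlo - 1] is used as a scratch guard limb; lower limbs are not touched. */
size_t sketch_mulhigh_approx(limb_t *x, size_t xlo,
                             const limb_t *a, size_t an,
                             const limb_t *b, size_t bn)
{
    size_t n = an + bn;
    size_t g = (xlo == 0) ? 0 : xlo - 1;   /* lowest column that is computed */
    size_t i, j, k;

    assert(an >= 1 && bn >= 1 && xlo < n);

    for (k = g; k < n; k++)
        x[k] = 0;

    for (i = 0; i < an; i++)
    {
        size_t j0 = (g > i) ? g - i : 0;   /* skip columns with i + j < g */
        uint64_t carry = 0;

        if (j0 >= bn)
            continue;                      /* whole row lies below the guard */

        for (j = j0; j < bn; j++)
        {
            uint64_t t = (uint64_t) a[i] * b[j] + x[i + j] + carry;
            x[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        x[i + bn] = (limb_t) carry;
    }

    /* Skipped columns 0 .. g-1 sum to less than g * B^xlo (B = 2^32),
       so the high limbs are too small by at most g. */
    return g;
}
```

For example, squaring the two-limb value 2^64 - 1 with xlo = 2 produces high limbs that are one below the exact ones, and the returned bound is 1. A tuned implementation would presumably detect the common case where no correction is needed.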

fredrik-johansson · 2024-05-25T11:21:04Z

> mulhi? I agree arbs would like this, but Newton's method likes mulmid more. I have been experimenting with the function
> ulong mpn_mulmid_approx(ulong* x, ulong xlo, ulong xhi, const ulong* a, ulong an, const ulong* b, ulong bn);
> which requires space for all an+bn limbs in x but is only responsible for writing [xlo,xhi). If the return is e, then the true middle limbs lie somewhere inclusively between the produced limbs and the produced limbs + e. Usually the return is 0.

Yep, that should improve the division and square root code quite a bit.

mulhigh is already doing good things for nfloat though. I think I want to iron out all the basic algorithms there before trying to improve arf/arb.

albinahlback · 2024-05-25T12:38:38Z

> mulhi? I agree arbs would like this, but Newton's method likes mulmid more. I have been experimenting with the function
> ulong mpn_mulmid_approx(ulong* x, ulong xlo, ulong xhi, const ulong* a, ulong an, const ulong* b, ulong bn);
> which requires space for all an+bn limbs in x but is only responsible for writing [xlo,xhi). If the return is e, then the true middle limbs lie somewhere inclusively between the produced limbs and the produced limbs + e. Usually the return is 0.

Depends on which Newton's method you are talking about. I believe that precomputed inverses and reciprocal square roots via Newton's method really favor mulhi.

tthsqe12 · 2024-05-29T03:19:54Z

mulhi is a special case of mulmid, and some of your mulhis for inversion don't need all of the high bits. It is even in the TODO.md:

`nmod_poly`

Implement fast mulmid and use to improve Newton iteration

fmpzs are just nmod_polys with carries, so the same principles apply.

fredrik-johansson · 2024-05-29T07:35:36Z

You can see this in

flint/src/gr_poly/inv_series_newton.c

Line 57 in 80e9b24

/* should be mulmid */

where the low m coefficients of the mullow output are never used.

(Of course high/low Newton division are the same but reversed.)
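
To spell out why the low coefficients are never used, here is the standard Newton step for series inversion. The notation is mine, not taken from the FLINT source: $g_m$ is an inverse of $f$ modulo $x^m$, and the step lifts it to precision $n$ with $m < n \le 2m$.

$$
f g_m = 1 + x^m e(x)
$$

$$
g_n = g_m (2 - f g_m) = g_m - x^m \, g_m \, e \pmod{x^n}
$$

The low $m$ coefficients of $f g_m$ are known in advance to be $1, 0, \ldots, 0$, so only $e \bmod x^{n-m}$ is needed, i.e. coefficients $m, \ldots, n-1$ of $f g_m$. That is a middle product, and a mullow to length $n$ computes the low $m$ coefficients for nothing.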

tthsqe12 · 2024-07-16T19:47:07Z

For beyond 20M bits, I am seeing a ~30% improvement in arb_inv with this mulmid over what is currently in flint. (Still have to tune it in the medium range.)

fredrik-johansson · 2024-07-20T12:10:57Z

> For beyond 20M bits, I am seeing a ~30% improvement in arb_inv with this mulmid over what is currently in flint. (Still have to tune it in the medium range.)

That sounds excellent.

Of course, the current arb code is a placeholder implementation using arf arithmetic. For medium precision, having everything in mpn form should be a bit better.

albinahlback added 5 commits February 26, 2024 18:04

Add preprocessor constants if mpn_mulhigh exists

8e44235

Add ARB_IS_SPECIAL macro

919249f

Add arf_mul_rnd_sloppy

d575cdf

Change arf_mul_rnd_sloppy

c952c43

Draft on new arb_mul

a8baba0
