			Arbitrary precision numerical algorithms

*INTRO This chapter documents some numerical algorithms used in Yacas for exact
integer calculations as well as for multiple precision floating-point
calculations, gives brief descriptions of the non-trivial algorithms and
estimates of the computational cost. Most of the algorithms are taken from referenced literature; the remaining algorithms were developed by us.


		Basic arithmetic

Currently, Yacas uses either internal math (the {yacasnumbers} library) or the
GNU multiple precision library {gmp}. The algorithms for basic arithmetic in
the internal math mode are currently rather slow compared with {gmp}. If $P$ is
the number of digits of precision, then multiplication and division take
$M(P)=O(P^2)$ operations in the internal math. (Of course, multiplication and division by a short integer take time linear in $P$.) Much faster algorithms for long multiplication
(Karatsuba / Toom-Cook / FFT, Newton-Raphson division etc.) are
implemented in {gmp} where at large precision $M(P)=O(P*Ln(P))$. In the
computation cost estimations of this chapter we shall assume that $M(P)$ is at
least linear in $P$. 

Warning: calculations with internal math with precision exceeding 10,000 digits are currently impractically slow.

In some algorithms it is necessary to compute the integer parts of expressions such as $a*Ln(b)/Ln(10)$ or $a*Ln(10)/Ln(2)$ where $a$, $b$ are short integers of order $O(P)$. Such expressions are frequently needed to estimate the number of terms in the Taylor series or similar parameters of the algorithms. In these cases, it is important that the result is not underestimated but it would be wasteful to compute $Ln(10)/Ln(2)$ in floating point only to discard most of that information by taking the integer part of say $1000*Ln(10)/Ln(2)$. It is more efficient to approximate such constants from above by short rational numbers, for example, $Ln(10)/Ln(2) < 28738/8651$ and $Ln(2) < 7050/10171$. The error of such an approximation will be small enough for practical purposes. The function {NearRational} can be used to find optimal rational approximations. The function {IntLog} (see below) efficiently computes the integer part of a logarithm in integer base. If more precision is desired in calculating $Ln(a)/Ln(b)$ for integer $a$, $b$, one can compute $IntLog(a^k,b)$ for some integer $k$ and then divide by $k$.
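
As an illustration, the following Python sketch rounds $a*Ln(10)/Ln(2)$ up using only integer arithmetic and the rational bound quoted above, and shows a naive integer logarithm. The names {bits_upper_bound} and {int_log} are invented for this example; {int_log} only mimics what {IntLog} is described to return and is not the Yacas implementation.

```python
def bits_upper_bound(digits):
    """Upper bound on digits*Ln(10)/Ln(2), using Ln(10)/Ln(2) < 28738/8651."""
    # Ceiling division in pure integer arithmetic: ceil(x/y) == -((-x) // y).
    return -(-digits * 28738 // 8651)


def int_log(n, base):
    """Integer part of the logarithm of n in the given integer base (n >= 1)."""
    k = 0
    power = base
    while power <= n:
        power *= base
        k += 1
    return k


# Number of bits that is certainly enough to hold 1000 decimal digits.
print(bits_upper_bound(1000))

# More precision for Ln(10)/Ln(2): compute IntLog(10^k, 2) and divide by k.
k = 20
print(int_log(10**k, 2) / k)
```

Because the rational constant overestimates $Ln(10)/Ln(2)$, the result is never too small, which is the property required when estimating the number of terms of a series.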

		Prime numbers

Prime numbers are tested using the Miller-Rabin algorithm.

There is also a function {NextPrime(n)} that returns the smallest prime number larger than {n}. This function uses a sequence 5,7,11,13,... generated by the function {NextPseudoPrime} that contains numbers not divisible by 2 or 3 (but perhaps divisible by 5,7,...). {NextPseudoPrime} is very fast because it does not perform any primality tests.
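
The interplay of the two functions can be sketched in Python as follows. This is only an illustration under our own assumptions: the names are hypothetical, the Miller-Rabin test uses a fixed set of small bases (so it is a probable-prime test), and the actual Yacas code may choose bases and handle small arguments differently.

```python
def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin probable-prime test with a fixed set of bases."""
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def next_pseudo_prime(n):
    """Smallest integer larger than n that is divisible by neither 2 nor 3."""
    m = n + 1
    while m % 2 == 0 or m % 3 == 0:
        m += 1
    return m


def next_prime(n):
    """Smallest (probable) prime larger than n."""
    if n < 2:
        return 2
    if n < 3:
        return 3
    m = next_pseudo_prime(n)
    while not is_probable_prime(m):
        m = next_pseudo_prime(m)
    return m


print(next_prime(23))  # skips the pseudo-prime 25
```

Only one candidate out of every three integers reaches the comparatively expensive primality test.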

		Factorization of integers

Factorization of integers is implemented by functions {Factor} and {Factors}. Both functions try to find all prime factors of a given integer $n$. (Before doing this, the primality checking algorithm is used to detect whether $n$ is a prime number.)
Factorization consists of repeatedly finding a factor, i.e. an integer $f$ such that $Mod(n, f)=0$, and dividing $n$ by $f$.

For small prime factors the trial division algorithm is used: $n$ is divided by all prime numbers $p<=257$ until a factor is found. {NextPseudoPrime} is used to generate the sequence of candidate divisors $p$.
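
A minimal Python sketch of this stage is given below. The function name {strip_small_factors} is hypothetical, and the candidate sequence 2, 3, 5, 7, 11, 13, ... is generated inline rather than by a separate routine.

```python
def strip_small_factors(n, limit=257):
    """Divide out all prime factors of n not exceeding limit.

    Returns the list of small prime factors found and the remaining cofactor.
    """
    factors = []
    p = 2
    while p <= limit and n > 1:
        while n % p == 0:
            factors.append(p)
            n //= p
        # Next candidate: 2, 3, then numbers not divisible by 2 or 3.
        if p == 2:
            p = 3
        elif p == 3:
            p = 5
        else:
            p += 2 if p % 6 == 5 else 4
    return factors, n


print(strip_small_factors(12108))  # 12108 = 2*2*3*1009
```

Composite candidates such as 25 or 35 cost one extra division each but can never divide the remaining cofactor, because their prime factors have already been removed.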

After separating small prime factors, we test whether the number $n$ is an integer power of a prime number, i.e. whether $n=p^s$ for some prime number $p$ and an integer $s>=1$. This is tested by the following algorithm. We already know that $n$ is not prime and that $n$ does not contain any small prime factors up to 257. Therefore if $n=p^s$, then $p>257$ and $2<=s<s[0]=Ln(n)/Ln(257)$. In other words, we only need to look for powers not greater than $s[0]$. This number can be approximated by the "integer logarithm" of $n$ in base 257 (routine {IntLog (n, 257)}).

Now we need to check whether $n$ is of the form $p^s$ for $s=2$, 3, ..., $s[0]$. Note that if for example $n=p^24$ for some $p$, then the square root of $n$ will already be an integer, $n^(1/2)=p^12$. Therefore it is enough to test whether $n^(1/s)$ is an integer for all <i>prime</i> values of $s$ up to $s[0]$, and then we will definitely discover whether $n$ is a power of some other integer.
The testing is performed using the integer root function {IntNthRoot}, which quickly computes the integer part of the $s$-th root of an integer number. If we discover that $n$ has an integer root $p$ of order $s$, we have to check whether $p$ itself is a prime power (we use the same algorithm recursively): the number $n$ is a prime power if and only if $p$ is itself a prime power (possibly with exponent 1). If we find no integer roots of orders $s<=s[0]$, then $n$ is not a prime power.
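
The search over prime exponents can be sketched in Python as follows. The helper names are invented for this example, the integer root is found by plain bisection (the method used by {IntNthRoot} is not described here), and the final primality check of the returned base is left to the caller.

```python
def int_log(n, base):
    """Integer part of the logarithm of n in the given base (n >= 1)."""
    k, power = 0, base
    while power <= n:
        power *= base
        k += 1
    return k


def int_nth_root(n, s):
    """Integer part of the s-th root of n, found by bisection."""
    lo, hi = 0, 1 << (n.bit_length() // s + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** s <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def small_primes(limit):
    """Primes not exceeding limit (limit is tiny here, so trial division is fine)."""
    return [p for p in range(2, limit + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def power_base(n, bound=257):
    """Return (p, s) with n == p**s, where p has no integer root of prime order.

    Assumes n has no prime factors up to bound, so any root exceeds bound
    and the exponent cannot exceed the integer logarithm of n in base bound.
    """
    s0 = int_log(n, bound)
    for s in small_primes(s0):
        root = int_nth_root(n, s)
        if root ** s == n:
            base, exponent = power_base(root, bound)  # same test, recursively
            return base, exponent * s
    return n, 1


print(power_base(1009 ** 6))
```

The number $n$ is then a prime power exactly when the returned base passes the primality test.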

If the number $n$ is not a prime power, the Pollard "rho" algorithm is applied (J. Pollard, <i>Monte Carlo methods for index computation mod p</i>, Mathematics of Computation, volume 32, pages
918-924, 1978). The Pollard "rho" algorithm takes an irreducible polynomial, e.g. $p(x)=x^2+1$, and builds a sequence of integers $x[k+1]:=Mod(p(x[k]),n)$, starting from $x[0]=2$. For each $k$, the value $x[2*k]-x[k]$ is tested for a common factor with $n$: the GCD of $x[2*k]-x[k]$ with $n$ is computed, and if $Gcd(x[2*k]-x[k],n)>1$, then that GCD value divides $n$.

The Pollard "rho" algorithm may enter an infinite loop when the sequence $x[k]$ repeats itself without giving any factors of $n$. For example, the unmodified "rho" algorithm loops on the number 703. The loop is detected by comparing $x[2*k]$ and $x[k]$. When these two quantities become equal to each other for the first time, the loop may not yet have occurred so the value of GCD is set to 1 and the sequence is continued. But when the equality of $x[2*k]$ and $x[k]$ occurs many times, it indicates that the algorithm has entered a loop. A solution is to randomly choose a different starting number $x[0]$ when a loop occurs and try factoring again, and keep trying new random starting numbers between 1 and $n$ until a non-looping sequence is found. The current implementation stops after 100 restart attempts and prints an error message, "failed to factorize number".
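
The following Python sketch shows the iteration with $p(x)=x^2+1$ and the restart strategy. It is a simplified illustration, not the Yacas code: the function names are hypothetical, and the sketch gives up on a starting value as soon as $x[2*k]$ and $x[k]$ coincide modulo $n$, instead of waiting for several coincidences.

```python
import random
from math import gcd


def pollard_rho(n, x0=2, max_steps=100000):
    """One run of the rho iteration; returns a factor of n or None on a loop."""
    x = y = x0
    for _ in range(max_steps):
        x = (x * x + 1) % n          # x[k]
        y = (y * y + 1) % n
        y = (y * y + 1) % n          # x[2*k]
        g = gcd(abs(y - x), n)
        if g == n:
            return None              # x[2*k] == x[k] mod n: the sequence loops
        if g > 1:
            return g
    return None


def find_factor(n, max_restarts=100):
    """Try x0 = 2 first, then random starting values between 1 and n."""
    x0 = 2
    for _ in range(max_restarts):
        factor = pollard_rho(n, x0)
        if factor is not None:
            return factor
        x0 = random.randrange(1, n)
    raise ValueError("failed to factorize number")


print(pollard_rho(703, 2))   # None: the sequence 2, 5, 26, 677, 677, ... loops
print(pollard_rho(703, 3))   # 37
```

For $n=703$ the sequence started at 2 reaches the fixed point 677 after three steps without revealing a factor, while the starting value 3 yields the factor 37 at $k=2$.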

A better (and faster) integer factoring algorithm is needed.

		Adaptive plotting

The adaptive plotting routine {Plot2D'adaptive} uses a simple algorithm
to select the optimal grid to approximate a function $f(x)$. The same algorithm
for adaptive grid refinement could be used for numerical integration. The
idea is that plotting and numerical integration require the same kind of
detailed knowledge about the behavior of the function.

The algorithm first splits the interval into a specified initial number of
equal subintervals, and then repeatedly splits each subinterval in half
until the function is well enough approximated by the resulting grid. The
integer parameter {depth} gives the maximum number of binary splittings for
a given initial interval; thus, at most $2^depth$ additional grid points
will be generated. The function {Plot2D'adaptive} should return a list of
coordinate pairs {{{x1,y1}, {x2,y2}, ...}} to be used directly for plotting.

The recursive bisection algorithm goes like this:

*	 1.  Given an interval ($a$, $c$), we split
it in half, $b:=(a+c)/2$ and first compute $f(x)$ at five grid
points $a$, $a[1]:=(a+b)/2$, $b$, $b[1]:=(b+c)/2$, $c$. 
*	 2. If currently $depth <= 0$, return this list of 5 points and
values because we cannot refine the grid any more.
*	 3. Otherwise check that the function does not change sign too
rapidly on the interval. The formal criterion is that no two sign changes occur
one right after another among these 5 points. Checking is done by
the following procedure: Mark the sequence of signs of the values
$f(x)$ at the 5 grid points, e.g. "0, +, -, +, +". Here "0" stands
for zero. Then, for each pair of consecutive signs, write "1" if
the signs are different and "0" otherwise. If one of the signs is
0, then also write "0". E.g. we get the sequence of 4 bits "0, 1,
1, 0". Each "1" stands for a sign change. Then, for each pair of
consecutive bits, write the logical AND of these bits. E.g.: 0, 1,
0. Each "1" now signifies that two sign changes occurred, one
right after another. If we have all "0" now, then the sign changes
are "slow enough". Otherwise they are not "slow enough". We can
compute the logical OR of these 3 bits to test this. In our
example we get 1. This means that we have a sign change that is
too rapid.
If the sign is not changing "slow enough" within the interval
($a$, $c$), then we need to refine the grid; go to step 5. Otherwise,
go to step 4.
*	 4. Check that the function values are smooth enough through the
interval. Smoothness is controlled by a parameter $epsilon$. The
meaning of the parameter $epsilon$ is the (relative) error of the
numerical approximation of the integral of $f(x)$ by the grid. A good heuristic
value of $epsilon$ is 1/(the number of pixels on the screen)
because it means that no pixels will be missing in the area under
the graph. However, for this to work we need to make sure that we are actually computing the area <i>under</i> the graph; so we define $g(x):=f(x)-f[0]$ where $f[0]$ is the minimum of the values of $f(x)$ on the five grid points $a$, $a[1]$, $b$, $b[1]$, and $c$; the function $g(x)$ is nonnegative and has minimum value of 0. Then we compute two different Newton-Cotes quadratures
for $ Integrate(x,b,b[1]) g(x) $ using these five points. (Asymmetric
quadratures are chosen to avoid running into an accidental symmetry of the
function; the first quadrature uses points $a$, $a[1]$, $b$, $b[1]$ and the second
quadrature uses $b$, $b[1]$, $c$.) If the
absolute value of the difference between these quadratures is less
than $epsilon$ * (value of the second quadrature), then we
are done and we return the list of these five points and values.
*	 5. Otherwise we need to refine the grid. We compute
{Plot2D'adaptive} recursively for the two halves of the interval,
i.e. for ($a$,$b$) and ($b$,$c$); we pass the midpoint values as
necessary and decrease {depth} by 1. We also multiply $epsilon$ by 2 because we need to maintain constant <i>absolute</i> precision and this means that the relative error for the two subintervals can be twice as large. The resulting two lists for the two subintervals are
concatenated (excluding the double value at point $b$) and
returned.
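
The five steps can be sketched in Python as follows. This is an illustration under several assumptions, not the code of {Plot2D'adaptive}: the names are hypothetical, function values at shared points are recomputed instead of being passed down, non-numeric values are not handled, and the two quadratures use the subinterval coefficients ($5/12$, $2/3$, $-1/12$) and ($3/8$, $19/24$, $-5/24$, $1/24$), the latter reversed so that it applies to the last subinterval of its four points.

```python
import math


def sign(v):
    return (v > 0) - (v < 0)


def changes_too_rapidly(values):
    """Step 3: true if two sign changes occur one right after another."""
    signs = [sign(v) for v in values]
    # A pair counts as a sign change only if both signs are nonzero and differ.
    flips = [a * b < 0 for a, b in zip(signs, signs[1:])]
    return any(p and q for p, q in zip(flips, flips[1:]))


def plot_adaptive(f, a, c, depth, eps):
    """Return a list of (x, y) pairs approximating f on the interval (a, c)."""
    b = (a + c) / 2
    xs = [a, (a + b) / 2, b, (b + c) / 2, c]           # step 1
    ys = [f(x) for x in xs]
    points = list(zip(xs, ys))
    if depth <= 0:                                      # step 2
        return points
    if not changes_too_rapidly(ys):                     # step 3
        f0 = min(ys)                                    # step 4
        g = [y - f0 for y in ys]
        # Two estimates of the integral of g over (b, b1), in units of the step.
        q_four = (g[0] - 5 * g[1] + 19 * g[2] + 9 * g[3]) / 24
        q_three = (5 * g[2] + 8 * g[3] - g[4]) / 12
        if abs(q_four - q_three) <= eps * abs(q_three):
            return points
    left = plot_adaptive(f, a, b, depth - 1, 2 * eps)   # step 5
    right = plot_adaptive(f, b, c, depth - 1, 2 * eps)
    return left + right[1:]                             # drop the duplicate at b


points = plot_adaptive(math.sin, 0.0, 3.0, depth=4, eps=0.01)
print(len(points))
```

The comparison in step 4 uses "less than or equal" so that a function that is constant on the five points, for which both quadratures vanish, is accepted without refinement.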

This algorithm works well if the initial number of points and the {depth}
parameter are large enough.

Singularities in the function are handled by step 3. Namely, the algorithm checks whether the function returns a non-number (e.g. {Infinity}) and if so, the sign change is always considered to be "too rapid". Thus, the intervals immediately adjacent to the singularity will be plotted at the highest allowed refinement level. When plotting the resulting data, the singular points are simply not written to the data file, so the plotting programs do not have any problems.

The meaning of Newton-Cotes quadrature coefficients is that an integral is approximated as

$$(Integrate(x,a[0],a[n]) f(x)) <=> h*Sum(k,0,n,c[k]*f(a[k]))$$,
where $h:=a[1]-a[0]$ is the grid step, $a[k]$ are the grid points, and
$c[k]$ are the quadrature coefficients. These coefficients are independent
of the function $f(x)$ and can be precomputed
for any grid $a[k]$ (not necessarily with constant step
$h=a[k]-a[k-1]$).
The Newton-Cotes coefficients $c[k]$ for
grids with a constant step $h$ can be found, for example, by solving a system of equations,
$$Sum(k, 0, n, c[k]*k^p) = n^(p+1)/(p+1)$$
for $p=0$, 1, ..., $n$. This system of equations means that the coefficients $c[k]$ correctly approximate the integrals of functions $f(x)=x^p$ over the interval (0,$n$).

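The solution of this system always exists and gives quadrature coefficients as rational numbers. For example, the Simpson quadrature $c[0]=1/3$, $c[1]=4/3$, $c[2]=1/3$ is obtained with $n=2$ (the coefficients sum to $n$ because the sum is multiplied by the grid step $h$, not by the length of the whole interval).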

In the same way it is possible to find quadratures for the integral over a subinterval rather than over the whole interval of $x$. In the current implementation of the adaptive plotting algorithm, two quadratures are used: the 3-point quadrature ($n=2$) and the 4-point quadrature ($n=3$) for the integral over the first subinterval, $Integrate(x,a[0],a[1]) f(x)$. Their coefficients are ($5/12$, $2/3$, $-1/12$) and ($3/8$, $19/24$, $-5/24$, $1/24$). 
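
The coefficients quoted above can be reproduced with exact rational arithmetic. The Python sketch below solves the linear system for a grid with unit step by Gauss-Jordan elimination; the function name {newton_cotes} and its parameters {lo}, {hi} (the integration limits in units of the grid step) are chosen for this example only.

```python
from fractions import Fraction


def newton_cotes(n, lo=0, hi=None):
    """Coefficients c[0..n] such that Sum(c[k]*k^p) equals the integral of x^p
    over (lo, hi) for p = 0, 1, ..., n, on the grid 0, 1, ..., n."""
    if hi is None:
        hi = n
    size = n + 1
    # Augmented matrix: row p holds k^p for k = 0..n and the exact integral.
    rows = [[Fraction(k ** p) for k in range(size)]
            + [Fraction(hi ** (p + 1) - lo ** (p + 1), p + 1)]
            for p in range(size)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Quadratures for the integral over the first subinterval (0, 1):
print(newton_cotes(2, 0, 1))   # 5/12, 2/3, -1/12
print(newton_cotes(3, 0, 1))   # 3/8, 19/24, -5/24, 1/24
```

Since the grid points are distinct, the matrix of the system is a nonsingular Vandermonde matrix and a pivot is always found.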

*INCLUDE algorithms-elemfunc.chapt
*INCLUDE algorithms-specfunc.chapt


			Symbolic algebra algorithms

*INCLUDE algorithms-multivar.chapt

*INCLUDE algorithms-integration.chapt
