## On the integer factorization and Nullstellensatz linear algebra

In integer factorization we are interested in finding the nontrivial divisor of the positive integer $n$. Suppose we have factors $p$ and $q$. We are interested in representing factors in binary form – $x= \sum \limits_{k=0}^{\lceil \log_2 n \rceil -1 } 2^k x_k, y= \sum \limits_{k=0}^{\lceil \log_2 n \rceil -1 } 2^k y_k$, than we are interested in solving system of equations:

(1) $\left\{ \begin{array}{r} x_1(x_1-1)=0 \\ ... \\ x_K(x_K-1)=0 \\ y_1(y_1-1)=0 \\ ... \\ y_K(y_K-1)=0 \\ \left( \sum \limits_{k=0}^{K} 2^k x_k \right ) \left( \sum \limits_{m=0}^{ M } 2^m y_m \right) -n =0 \end{array} \right.$

We assume $x_k, y_k \in \mathbb{C}$. Each equation except the last one has exactly two solutions – $\{0,1\}$. The last equation (given the constraint of the first equation) is encoding product of binary representation of integer numbers $x, y$ with K and M bits respectively. Therefore, the ideal of the system of polynomial equations (1) is either empty set, or the sets of points representing the possible factorization (and thus zero-dimensional). For example, it has two solution if both $p$ and $q$ are primes – $x=p,y=q$ and $x=q, y=p$. We also can assume that both $p$ and $q$ are odd primes, than $x_0=1$ and $y_0=1$.

Suppose, that we can determine the feasibility of the system (1) in a reasonable time. Then finding factors would be equal to testing recurrently whether the system is feasible with the K-th bit in x representation is equal to one or zero, and then testing next bit in feasible configuration, until all bits are determined. That would require $\log n$ testings.

Now we are interested in finding the fast algorithm for testing feasibility of system

(2) $\left\{ \begin{array}{r} x_1(x_1-1)=0 \\ ... \\ x_K(x_K-1)=0 \\ y_1(y_1-1)=0 \\ ... \\ y_K(y_K-1)=0 \\ \left( 1 + \sum \limits_{k=1}^{K} 2^k x_k \right ) \left(1+ \sum \limits_{m=1}^{ M } 2^m y_m \right) -n =0 \end{array} \right. , K+M= \lceil \log_2 n \rceil .$

Here, we will apply Nullstellensatz linear algebra (J.A. De Loera et al. / Journal of Symbolic Computation 46 (2011) 1260–1283) algorithm for testing the feasibility of the system of polynomial equations (2). The algorithm basically is based on Hilbert’s Nullstellensatz, in particular, a system of polynomial equations $f_1(x)=0, ..., f_s(x)= 0$, where $f_i \in \mathbb{K}[x_1, ..., x_m]$, and $\mathbb{K}$ is algebraically closed field has no solution in $\mathbb{K}^m$ if and only if there exist polynomials $\beta_1, ..., \beta_s \in \mathbb{K}[x_1, ... , x_m]$ such that $1= \sum \beta_i f_i$. Moreover, $1= \sum \beta_i f_i$ is called Nullstellensatz certificate. The idea behind Nullstellensatz linear algebra algorithm is to fix the degree of the polynomials $\beta_i(x)$ to $d$ and write the polynomials in the general way – $\beta_i(x)= \sum c_{i,k} x^{\alpha_k}$, where $\alpha_k \in \mathbb{N}_0^m, c_{i,k} \in \mathbb{C}$, $x^{\alpha}= \prod x_i^{\alpha_i}$, and the sum is taken over all monomials up to degree d. Then expanding the Nulsstellensatz certificate and equating coefficients in the left and right side one is looking to solve the system of linear equations on the coefficients.

For example, suppose K=M=1, then we have the following system

$\left\{ \begin{array}{r} x(x-1)=0 \\ y(y-1)=0 \\ \left( 1 + 2 x \right ) \left(1+ 2 y \right) -n =0 \end{array} \right.$

Now suppose we fix the degree of $\beta_1, \beta_2, \beta_3$ to 2, so the final polynomial is of order 4. In general $\beta_k(x,y)= c_{k,1} x^2+ c_{k,2} xy+ c_{k,3} y^2 + c_{k,4} x + c_{k,5} y + c_{k,6}$, than Nullstellensatz certificate is

$1= (c_{1,1} x^2+ c_{1,2} xy+ c_{1,3} y^2 + c_{1,4} x + c_{1,5} y + c_{1,6} )(x^2-x) +$ $(c_{2,1} x^2+ c_{2,2} xy+ c_{2,3} y^2 + c_{2,4} x + c_{2,5} y + c_{2,6} )(y^2-y) +$ $(c_{3,1} x^2+ c_{3,2} xy+ c_{3,3} y^2 + c_{3,4} x + c_{3,5} y + c_{3,6} )(4xy+2x+2y+1-n) =$

$c_{1,1} x^4 + ( c_{1,2} + 4 c_{3,1} ) x^3 y + ( c_{1,3} + c_{2,1} + 4 c_{3,2} ) x^2 y^2 + ( c_{2,2} + 4 c_{3,3} ) x y^3 + c_{2,3} y^4 + ( - c_{1,1} + c_{1,4} + 2 c_{3,1} ) x^3 + ( -c_{1,2}+ c_{1,5} - c_{2,1} + 2 c_{3,1} + 2 c_{3,2} + 4 c_{3, 4} ) x^2 y + ( -c_{1,3} - c_{2,2} + c_{2,4} +2 c_{3,3} + 2 c_{3, 2 } + 4 c_{3,5} ) x y^2 + ( c_{2,5} - c_{2,3} + 2 c_{3,3} ) y^3 + ( c_{1,6} -c_{1,4} + c_{3,1} + 2 c_{3,4} ) x^2 + ( -c_{1,5} -c_{2,4} + c_{3,2} (1-n) + 4 c_{3,6} + 2 c_{3,5} + 2 c_{3,4} ) xy + ( c_{2,6} - c_{2,5} + 2 c_{3,5} + c_{3,3} ) y^2 + ( -c_{1,6} + 2 c_{3,6} + c_{3,4} (1-n) ) x + ( -c_{2,6} + 2 c_{3,6} + c_{3,5} (1-n) ) y + c_{3,6} (1-n)$.

which defines 15 equations in 18 variables.

$\left\{ \begin{array}{r} c_{1,1}=0 \\ c_{1,2} + 4 c_{3,1}=0 \\ c_{1,3} + c_{2,1} + 4 c_{3,2}= 0 \\ c_{2,2} + 4 c_{3,3}= 0 \\ c_{2,3}= 0 \\ - c_{1,1} + c_{1,4} + 2 c_{3,1} = 0 \\ -c_{1,2}+ c_{1,5} - c_{2,1} + 2 c_{3,1} + 2 c_{3,2} + 4 c_{3, 4}=0 \\ -c_{1,3} - c_{2,2} + c_{2,4} +2 c_{3,3} + 2 c_{3, 2 } + 4 c_{3,5}= 0 \\ c_{2,5} - c_{2,3} + 2 c_{3,3}= 0 \\ c_{1,6} -c_{1,4} + c_{3,1} + 2 c_{3,4} =0 \\ -c_{1,5} -c_{2,4} + c_{3,2} (1-n) + 4 c_{3,6} + 2 c_{3,5} + 2 c_{3,4} = 0 \\ c_{2,6} - c_{2,5} + 2 c_{3,5} + c_{3,3} = 0 \\ -c_{1,6} + 2 c_{3,6} + c_{3,4} (1-n)=0 \\ -c_{2,6} + 2 c_{3,6} + c_{3,5} (1-n) =0 \\ c_{3,6} (1-n)= 1 \end{array} \right.$

This system has infinitely many solutions unless n=1,3,9, which are the composite numbers represented by the 2-bit factors. The explicit certificate for equivalent system

$\left\{ \begin{array}{r} x(x-2)=0 \\ y(y-2)=0 \\ \left( 1 + x \right ) \left(1+ y \right) -n =0 \end{array} \right.$

is

$1= \left[ \left( \frac{9}{4(n-3)(n-9)} - \frac{1}{4(n-1)(n-3)} \right) y^2 + \frac{1}{(n-1)(n-3)} \right] x( x-2 ) +$ $\left[ -\left( \frac{1}{4(n-1)(n-3)}+ \frac{3}{4(n-3)(n-9)} \right) x^2 + \frac{6}{(n-3)(n-9)} x + \frac{1}{(n-1)(n-3)}\right] y (y-2) +$ $\left[ \left( \frac{1}{2(n-1)(n-3)}- \frac{3}{2(n-3)(n-9)} \right) xy -\frac{1}{(n-1)(n-3)} x - \frac{1}{(n-1)(n-3)} y -\frac{1}{n-1} \right] [(x+1)(y+1)-n]$.

We do not know the degree of the polynomials $\beta$ in advance. On the other hand, we can construct lower and upped bound. The upper bound can be computed using Corollary of Lazard lemma (p 1264 in De Loera paper)

Corollary 2.5. Given polynomials $f_1, ... , f_s \in \mathbb{K}[x_1, ... , x_n]$ where $\mathbb{K}$ is an algebraically closed field and d = max{ deg( $f_i$) }, if $f_1, ... , f_s$ have no common zeros and $f_1, ... , f_s$ have no common zeros at infinity, then $1 = \sum \limits^s_{ i=1} \beta_i f_i$ where deg($\beta_i$) ≤ n(d − 1).

The system (1) has no common zeros if infisible, and also has no common zeros at infinity, d=2, therefore deg$(\beta_i) \leq \lceil \log n \rceil$, which lead to the linear system in $(\lceil \log n \rceil +1 ) \left( \begin{array}{c} 2 \lceil \log n \rceil \\ \lceil \log n \rceil \end{array} \right) \approx n^2 \sqrt{\lceil \log n \rceil}$ coefficients.

On the other hand, the lower bound can be computed by equating the number of monomials in the certificate and the number of coefficients. Let (M+K)=m be the total number of variables. The number of coefficients is equal to the number of monomials times the number of equations in system (1), e.g. for degree d – $\left( \begin{array}{c} m+d \\ m \end{array} \right) (m+1)$, on the other hand the number of monomials in Nullstellensatz certificate correspond to the number of monomials for the polynomial with degree d+2, e.g $\left( \begin{array}{c} m+d+2 \\ m \end{array} \right)$. Than d should satisfy

$0= (m+1) \left( \begin{array}{c} m+d \\ m \end{array} \right)- \left( \begin{array}{c} m+d+2 \\ m \end{array} \right) = \frac{(m+1) (m+d)!}{m!d!} - \frac{(m+d+2)!}{m!(d+2)!}=$ $\frac{(m+d)!}{m!(d+2)!} ( m(d+1)(d+2) - (m+d+1)(m+d) ) \Rightarrow d^2+d -m- 1=0$ $\Rightarrow d= \sqrt{ \frac{5}{4}+m} - \frac{1}{2} \approx \sqrt{m}$. Therefore, the minimal number of monomials is $(m+1) \left( \begin{array}{c} m+\sqrt{m} \\ m \end{array} \right) \approx \sqrt{m} 2^{ \sqrt{m} }$, which is subexponential, and comparable to Quadratic sieve performance. To finish the work one needs to prove that the system of equations for coefficients is solvable.