## NP-complete problems and tautologies

Many NP complete problems can be encoded in system of polynomial equations. For example, in partition problem one is asking whether it is possible to divide a list of integers into two lists that the sum of element of each least are equal. Given the list of elements $a \in \mathbb{Z}^n$ one is looking for the element $x \in \{ \pm 1 \}^n$, such that $\sum \limits_{k=1}^n x_k a_k =0$.

To encode the problem into the system of polynomial equations over $\mathbb{C}$ one need $n$ equations of the form $x_k^2-1=0$ – that will ensure variables $x_k=\pm 1$, and the above mentioned equation $\sum \limits_{k=1}^n x_k a_k =0$. In total $n+1$ nonlinear equations. We can also change the variables that $x'_k= \frac{x_k+1}{2} \in \{0,1\}$. In some sense this variables may be easier for the analyses later.

One way to solve this equation is to linearise it, e.g. we define a new set of variables $y_{(k,m)}= x'_k x'_m,\ k,m=0..n$, and we set $x'_0=1$. Now in terms of $y_{(k,m)}$ we have system of linear equations. Unfortunately, we have lost relationship between terms. Linear system does not “know” that $y_{(2,3)}= y_2 y_3$, etc. On the other hand, if solution of the original system exists it should lay in the null space of the coefficient matrix. We will try use it. Suppose, we have null space $U=[u_1,..., u_K] \in \mathbb{C}^{ \frac{(n+1)(n+2)}{2} \times K }$, than if system of equations has the solution all monomials will be simultaneously expressed as a linear combination of null space vectors $y_{(k,m)}(c)= \sum \limits_{i=1}^K c_i u_{i,(k,m)}$. The dimensionality of null space $K$ is exactly the number of quadratic monomials minus number of equations.

So far, we were looking only at second order monomials. If we go to higher order monomials we can see that $x_{k_1} x_{k_2} x_{k_3} x_{k_4}= (x_{k_1} x_{k_2}) (x_{k_3} x_{k_4})=(x_{k_1} x_{k_3}) (x_{k_2} x_{k_4})=(x_{k_1} x_{k_4})( x_{k_2} x_{k_3})$. What we wrote is tautology – this is true independent of the values of $x_k$. This is also true for any $k_1,...,k_4$. So this tautologies can be written in terms of our new variables $y_{(k_1,k_2)}y_{(k_3,k_4)}-y_{(k_1,k_3)}y_{(k_2,k_4)}=0$, $y_{(k_1,k_2)}y_{(k_3,k_4)}-y_{(k_1,k_4)}y_{(k_2,k_3)}=0$. On the other hand, if we have $k_m = k_n \forall m \neq n$, than we does not have tautology. In any case, we can write down all tautologies for up to order 4, and their expression in terms of new variables $y$. Moreover, for each $y_{k,m}$ we have its linear expression in terms of unknowns $c_i$. So if we expand tautologies in terms of $c_i$ we will get a new set of quadratic equations, that must be satisfied, if original system has a solution.

Now we already know what to do. We need to linearise the system of quadratic equations, for which we need to introduce new variables $b_{k,m}= c_k c_m$, write down the system of linear equations, find the null space of coefficient matrix, encode $b_{m,k}$ as a linear combination of the null space basis, and finally encode forth-order monomials in terms of new coefficient of new null space basis. Ok, we can repeat this forever, just remember to write down all quadratic tautologies in terms of known monomials (e.g. $(x_1x_2x_3x_4)(x_5x_6x_7x_8)= (x_1x_2x_3x_5)(x_4x_6x_7x_8)=...$ many more tautologies). What is interesting here is how $K$ changes with iterations.

Ok, I will come back here later, when I cool down and wil have more time. Note that the number of tautologies are much more (seems to be exponentially more) than the number of monomials.

Each iteration we are squaring the number of variables which encode monomials in terms of null space. Now consider only monomials with $k_m \neq k_n, \forall m \neq n$. If we have monomial of degree $2k$ than only for that monomial we have $\frac{1}{2}\binom{2k}{k}$, since we want to choose half variables, and one half comes from the fact that multiplication is commutative. The number of monomials is $\binom{n}{2k}$. So, assuming tautologies are linearly independent, we have $K_0=n^2$, $K_{m+1}=K_m^2-\frac{1}{2}\binom{2m}{m}\binom{n}{2m} =$ $K_m^2-\frac{1}{2}\binom{n-m}{m}\binom{n}{m} \approx$ $K_m^2-\frac{1}{2} n^{2m}/m!$. So the lower bound is asymptotically …

Many NP-complete problems can be encoded in the system of polynomial equations. Some examples are in the introduction here. Basically the idea is to encode binary variables in the equations $f_k= x_k^2-1= 0$, or $f_k = x_k^2-x_k=0$, and add one or more quadratic equations ($g_k=0$) encoding decision problem. Since the system can have exponentially many solutions finding them is hard. On the other hand, we can try to figure out if the system has solutions at all. One of the approaches is to use Nulstellensatz certificate, e.g. find the set of polynomials $P_k, P'_k \in \mathbb{K}[x_1,...x_n]$, where $\mathbb{K}$ is an algebraically closed field, for example $\mathbb{C}$ such that the following equation is satisfied $1= \sum_k P_k f_k +\sum_m P'_m g_m$. The problem with this approach is that one need very high degree polynomials $P_k, P'_k$ and the certificate has exponentially many monomials. The intuition here is following. The problem is hard since all variables are influencing other variables, otherwise the problem can be split into independent subproblems. Since all of them are important the term with monomial $x_1 ... x_n$ has to appear in the certificate. That lead to the certificate of degree $n$, which has too many monomials. There is no clear way of selecting right monomials, and we need to use all of them. So the problem is that by increasing the degree of certificate the “mixing” of variables is too slow!!! And we need a way to speed-up mixing of the variables.
Here it is. We do usual trick of linearizing quadratic equations in terms of quadratic monomials (for 3-SAT we need cubic monomials), e.g. $y_{x_k,x_m}= x_k x_m,\ k, m = 0..n,\ x_0=1$. This linear system has $\approx n^2$ linear variables (monomials), and $\approx n$ equations, e.g. highly under-complete. The solution of this (homogenised) linear system is $\approx n^2$ dimensional projective space. The original system has a solution iff there is a point $c \in \mathbb{C}^{\approx n^2}$ in this projective space such that all corresponding equations on $y_{x_k,x_m} = y_{x_0,x_k} y_{x_0, x_m}, y_{0,0}=1$ are satisfied. These are quadratic equations on $c$, so we can write all the quadratic in $c$ constraints. For example, $y_{x_k,x_m} y_{x_p,x_q}= y_{x_k,x_p} y_{x_m,x_q}= y_{x_k,x_q} y_{x_p,x_m}$ (So we are using tautologies, instead of deriving them as done in propositional proof systems). Therefore, we have $\approx \frac{2}{3} n^4$ quadratic equations with $\approx n^2$ indeterminates $c_k$. Now we have equivalent, in terms of satisfiability, system of quadratic equations with $O(n^4)$ equations and $O(n^2)$ variables, and before we had $O(n)$ quadratic equations with $O(n)$ variables. In other words, now we have $O(1) > C$ equations in terms of the number of monomials, and before we had $o(1)$ equations in terms of monomials.
Now we can write Nulstellensatz certificate for the new system. The number of independent coefficients grow faster then the number of monomials with increasing the certificate degree. Therefore, there is a fixed (independent of $n$) degree of certificate that depend on $C$. If the above arguments are correct, it would be possible to check whether NP-complete problem has a solution in polynomial time, with all consequencies.