Annealers : Lowest Energy Not the Best Solution
This is just to quickly share something I experienced.
I am solving through QUBO problem a linear system : Mx=y with M ( non-singular matrix ) and y ( vector ) known and we are looking for x ( vector ). Let's write x_e the exact solution, x_l the vector corresponding to the lowest energy state and x_b the closets vector to the exact solution x_e in the current precision ( best solution possible ).
So we want to minimise g(x)=||x-x_e||. The objective function chosen is f(x)=||Mx-y||².
On paper, we have equivalence between minimising g and f because M in not singular. f is minimsed for x=x_e and its minimal value is 0.
Therefore, if x-components have an exact binary representation, we get x_e=x_b=x_l. Indeed, the closest vector to x_e is x_e because it as an exact binary representation in the current precision so x_b=x_e and the lowest f value possible ( so the lowest energy possible ) is 0, reached for x=x_e so x_l=x_e.
But now, let's assume that at least one the x-components has not an exact binary representation in the current precision. It means that the lowest value that f can get is a certain value over 0 strictly and is reached for x_l != x_e.
But we could assume that x_l=x_b. Indeed, we could thought that since x_l minimises f, it minimises g too and since x_g minimises g it minimises f too. Said otherwise, the best solution for f is the best for g so minimising f minimises g too and return the best solution possible to approximate x_e in the current precision. So we can be quite surprised when we check the best solution possible and the lowest in energy one and that there are not the same. Or when the lowest energy solution is not as close to x_e as it would be if it was the best, or even when it gets out of the computation new interval when we make them dynamic.
Indeed, there is no trivial equivalence between minimising f and minimising g if x_e can't be picked up through sampling. It depends a lot on the matrix.
For instance :
M = [ -65.6 266.4 ]
[ -266.4 1066.6 ]
y = [ 499.6 1999.9 ]
We have
x_e = [ 0.1 1.9 ]
There is no exact representation of this vector if we set the bit-precision up to 4 bits.
Therefore, the best solution possible with this precision is
x_b = [ 0.125 1.875 ].
Bu when we run the algorithm, the lowest energy solution is not x_b, it is
x_l = [ 0 1.875 ].
That can be surprising because
g(x_b) = ||x_b-x_e|| = 0.035 < 0.103 = ||x_l-x_e|| = g(x_l)
So we could think that the lowest in energy ( so the vector which minimises f ) should be x_b in this precision. But a simple calculation shows that we have
f(x_l) = ||Mx_l-y||² = (-0.1)²+(-0.025)² = 0.011 < 1,179.446 = (-8.3)²+33.325² = ||Mx_b-y||² = f(x_b).
Here we see well that x_l minimises f so it is the lowest energy solution but that it doesn't minimise g so it is not the best solution possible, and vice-versa for x_b : it minimises g so it is the closest solution to the exact solution x_e but that it doesn't minimise f so it is not the lowest in energy one.
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////
\\\\\\\\\\\\\\\\\SUMMARY HERE///////////////////
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////
To put it in a nutshell, we saw that to minimise g we try to minimise f. But in practice the lowest energy solution picked up by the quantum annealer is not the best solution ( closest vector to the exact solution ).
In fact, here, the issue is not really quantum related, it is more a mathematical issue. Each time we try to find the solution of Mx=y by minimising f, we get the same issue, no matter if the algorithm is quantum-computer purposed or not. Indeed, the matrix M chosen is ill-conditioned so it get x_l far from x_b ( in term of euclidean distance from the exact solution x_e ). Maybe y can have an influence too.
When we want to solve Ax=y, if we turn it into a minimisation problem, it is not minimising f(x)=||Mx-y||² but first minimising g(x)=||x-x_e||. In a continuous space, both are equivalent : minimising f minimises g and vice-versa. But in a discrete space, as we have in practice, it is not. Surely conditioning M should dive the error made by approximating x_b by x_l ( to check ). But the value of x_l seems quite random when it comes to compare it to x_b or x_e as it depends a lot on M.
By the way, we clearly see here the importance of choosing a ``good'' objective function to minimise calculations of the coefficients for weights and interactions and to mininmise the error of taking the lowest in energy solution as the best one despite it is not often the case. Hopefully, the distance between them seems to decrease when the bit precision increase ( to check ) but it takes definitely more qubits than what we have nowadays. So we have to deal with this error made. That can be tricky because x_l can pop out out the computation interval so we need to find a code to at least detect error.
Maybe a function can be found that minimises g when it gets minimised. But that doesn't seem trivial.
Comments
Please sign in to leave a comment.