# Limits of Simulated Annealing Sampler

I am experimenting with the D-Wave quantum annealer and benchmarking it against the classical solution in terms of runtime. I was also benchmarking the DWave sampler against the Neal SimulatedAnnealingSampler and noticed that its runtimes seem to be constant as a function of the problem size and the results seem to be mostly correct.

So I am wondering what the limits of the simolator are. Does it work as a reliable source of results only up to a certain problem size? Does it have a memory requirement that grows exponentially as a function of the problem size?

## Comments

Thomas P(Report)One of the Ocean tools developers could probably provide more insight into the specifics of the latest neal implementation.

For simulated annealing, in general, time complexity has much more impact than space complexity. Time complexity is not constant; I would expect it to be super-linear with the size of the problem. I have not kept up with specific algorithmic complexities, so I cannot comment on the state of the art.

In contrast, physical quantum annealing does indeed have a constant time complexity. (Caveat: There are details regarding the asymptotic-to-ideal annealing schedule and associated time complexity, but those details are mostly dwarfed by latency, time in queue, overhead, etc.) The most significant issues for quantum annealing on the D-Wave versus using the simulated annealer are topological constraints and the constraint of stoquastic Hamiltonians. (Both of these are being addressed, faster than I had expected, which is quite exciting.)

Let me know if this helps.

Tom

Roberts M(Report)Hey Tom,

thank you. That's some good information.

What I am really interested in is the practical applications for solving e.g. the structural imbalance in social networks problem. I was benchmarking the classical, exact solver against the quantum annealer and found the expected roughly constant and roughly exponential time complexity (in regards to wall clock time on my end). But I was quite surprised when I threw in the simulated annealer to find that it actually solved some larger problems (social network with 30-50 nodes) in quite a short amount of time and that time seemed to be far from exponential as a function of the problem size. I also gave it some test runs with the factoring problem to check if it actually produces correct results or just random noise. I was quite surprised to see it actually gave correct result for problems too large for the exact solver to solve before I ran out of patience.

So now I'm wondering how do the exact solver, simulated annealing, and physical annealing compare to each other when it comes to practical applications. On the surface simulated annealing seems to outperform the classical solution. And I don't know where the limit to the simulator's capabilities is. For gate model quantum computers I know that the simulation has exponential space complexity and the largest supercomputers can simulate around 50 qubit of gate model QC. But annealing obviously behaves differently and I don't know where the limiting factor making actual physical annealing necessary is.

Thomas P(Report)The nature of the data can make a big difference. Simulated annealing is probabilistic; in principle, you get to decide the tradeoff between speed and precision (or correctness). If you visualize the problem space as a landscape with hills and valleys, then simulated annealing (or any annealing, really) is essentially bouncing around at random, trying to settle into the lowest point in the terrain.

If your problem space has known characteristics, then you can make simplifying assumptions that can be exploited by a more specific algorithm. If your problem space is completely unexplored, then the correctness of a simulated annealing solution will depend on (1) the entropy (or randomness) of your problem space, (2) how long you are willing to wait, and (3) luck. For many problems, simulated annealing can be good enough.

Finding exact solutions to large, complex optimization problems using classical methods is essentially a computational dead end. The solution space for a large optimization problem quickly gets out of hand, and searching through all possible solutions becomes intractable.

One of the physicists can correct me if I am wrong, but the goodness of quantum annealing comes from the fact that a quantum system can exhibit tunneling behavior in an attempt to settle into a lower energy state. Whereas simulated annealing would need to "bounce" its way over the hills to find the minima, a quantum annealer can sort of teleport right through the hills. To optimally exploit this goodness on the D-Wave you need to be cognizant of the topology, coupling limits, and fit your problem formulation so it's a good match to the hardware. This fitting is done through an embedding process. The current toolchain does an okay job at this, certainly good enough for many experiments, but the results are not necessarily optimal. There is active research on this topic and I am certain we will see advances in the embedding tools. For now, though, if optimality is required, hand-fitting the problem to the hardware topology is the way to go.

Cathy M(Report)HI, I'm on the benchmarking team at D-Wave. Two reasons you might see no exponential increase in computation time with n for any given classical solver are:

(1) Maybe the inputs are too easy to force exponential growth -- NP hard problem classes contain lots of easy instances. For example, Factoring is famously hard, but more than half of all numbers can be factored in constant time (i.e. the numbers ending in 0,2,4,6,8, and 5). Just because the problem class is hard in the worst case doesn't mean the inputs you chose for testing are necessarily very hard.

(2) Most classical solvers have a bunch of parameters, with at least one being the ``work'' parameter -- for SA, it is the number of iterations in the anneal. As a general rule, the higher the work parameter, the better solutions. For a proper scaling analysis, the work parameter needs to be set for each n: if you use a fixed value that is too high for small problems, scaling will appear flat; if you use a fixed value that is too low for the big problems, scaling will appear exponential. Neither of these necessarily matches the true scaling of the algorithm, they're just artifacts of how you set up the experiment.

Tt is a hard research problem to design an experiment that tells you something fundamental about the algorithms involved rather than just reflecting your experimental design parameters. This holds for both quantum and classical. Don't let that discourage you, though, just be aware of the challenges.

Cathy M(Report)I guess another thing to point out is that, as you have observed, a heuristic like SA can be extremely fast on some types of inputs, and it will not work at all on other types of inputs (never converges to optimal). The research question is to figure out what combination of input properties and parameter settings are associated with good or bad performance.

Please sign in to leave a comment.