# Sampling from a given Boltzmann distribution using a D-Wave computer

Hello everyone,

I am currently researching Restricted Boltzmann Machines (RBMs), a kind of stochastic neural network whose training and use require sampling from a Boltzmann distribution whose energy function is a QUBO. Several publications have shown that D-Wave machines can be used for this, avoiding the exponential cost of a brute-force calculation of the probabilities of the exponentially many states.
However, as I am trying to implement D-Wave based RBMs myself, I am struggling with some problems and would love to have some insight on them. I already had a look at the documentation, yet I still have some questions.

1. There is a difference between the mathematical QUBO that I define in Python and the physical QUBO that is actually implemented on the machine. Such a difference must exist for several reasons:
- A priori, the QUBO I define can have arbitrarily large coefficients, while a physical system cannot have arbitrarily large energies. Thus, my QUBO must be rescaled at some point.
- The QUBO written in Python is dimensionless, while an actual energy has a physical dimension (ML^2T^-2). Thus, an energy scale must be involved.
- The physical Boltzmann distribution depends on temperature through a factor 1/(kB*T).
This can be summarized as follows: if H denotes the QUBO defined in Python, the machine implements and samples from the Boltzmann distribution of some \beta_eff*H. A positive \beta_eff leaves the ground states unchanged, so the ground state of my mathematical QUBO can still be found, but the Boltzmann distribution itself is changed by this factor. How can I prevent this from happening (i.e. set \beta_eff to 1), or efficiently deduce my target Boltzmann distribution from the sampled one? I am skeptical about the second option: there are as many probabilities as states, which is an exponential number, so such a deduction would require an exponentially large calculation.
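
To make the effect of \beta_eff concrete, here is a minimal sketch that enumerates a toy QUBO exactly. The QUBO coefficients and the value \beta_eff = 3 are arbitrary assumptions chosen for illustration, not values taken from any D-Wave machine, and the helper names `qubo_energy` and `boltzmann` are hypothetical.

```python
import itertools
import math

def qubo_energy(Q, x):
    """Energy sum_ij Q[i, j] * x_i * x_j for a binary tuple x."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())

def boltzmann(Q, n, beta):
    """Exact distribution p(x) proportional to exp(-beta * E(x)) over all 2^n states."""
    states = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(-beta * qubo_energy(Q, x)) for x in states]
    Z = sum(weights)  # partition function
    return {x: w / Z for x, w in zip(states, weights)}

# Hypothetical 2-variable QUBO, for illustration only.
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}

target = boltzmann(Q, 2, beta=1.0)   # the distribution I want
sampled = boltzmann(Q, 2, beta=3.0)  # what a device with an assumed beta_eff = 3 would give

for state in target:
    print(state, round(target[state], 4), round(sampled[state], 4))
```

The ground states are the same in both columns, but at the larger \beta they take a larger share of the probability mass, which is exactly the distortion described above.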

2. Estimating a probability distribution through statistical sampling requires enough samples for the law of large numbers to converge. As a first approximation, if N is the number of qubits involved in my QUBO, then there are 2^N possible states, and 100*2^N is a decent estimate of the number of samples required to reach that convergence. Now, when trying to sample from a distribution, I encountered an error stating that num_reads is at most 10,000, which (according to the previous estimate) allows for no more than 6 qubits. That is not enough for me. I thought of simply making multiple batches of 10,000 reads and concatenating them in Python; however, this forum thread implies that this won't be a solution. I've read the D-Wave documentation mentioned in the answer, but I still do not really understand the problem. Moreover, I do not see how to describe mathematically the difference between making, as the author said, one call with num_reads=1000 and 10 calls with num_reads=100. This is a problem for me, because otherwise I could, for example, make multiple calls with num_reads=10000 and apply some mathematical transformation to the samplesets before concatenating them to get my desired statistical distribution. So, why does this phenomenon occur, what is its impact from a mathematical point of view, and is there a way to counter it? If not, how can I use a D-Wave computer to get the statistical distribution I am interested in?
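
For reference, the arithmetic behind the 6-qubit figure can be sketched as follows. The factor of 100 samples per state is my own rule of thumb from above, and the function names are hypothetical; the batch count assumes that batches could simply be concatenated, which is precisely what the thread mentioned above calls into question.

```python
def samples_needed(n_qubits, per_state=100):
    """Rule-of-thumb sample budget: per_state samples for each of the 2^N states."""
    return per_state * 2 ** n_qubits

def max_qubits(max_reads, per_state=100):
    """Largest N whose sample budget still fits within a single call of max_reads."""
    n = 0
    while samples_needed(n + 1, per_state) <= max_reads:
        n += 1
    return n

def batches_needed(n_qubits, max_reads=10_000, per_state=100):
    """Number of calls of max_reads each needed to reach the budget (ceiling division)."""
    return -(-samples_needed(n_qubits, per_state) // max_reads)

print(max_qubits(10_000))   # 6, since 100 * 2**6 = 6400 but 100 * 2**7 = 12800
print(batches_needed(10))   # 11, since 100 * 2**10 = 102400 samples
```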

• Hello,

It is indeed difficult to get an exact value for the temperature of the system by calculating it directly. However, you could try the method outlined in this paper to find upper and lower bounds for the temperature:
https://arxiv.org/pdf/1606.00919.pdf

The basic idea is to sample a system with one qubit to determine a lower bound for the temperature, and then sample a configuration with a larger block of strongly coupled qubits to determine the upper bound.
This is referred to as single-qubit freeze-out in the paper.
If the qubits in the larger block are strongly coupled together, they will all take on the same value, which means that there are only two relevant states: all up or all down. This simplifies sampling considerably. In addition, all higher-energy states can essentially be ignored, because the strong coupling makes their probability close to zero.
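
As a rough illustration of why a two-state system is convenient, here is a minimal sketch of a generic two-level Boltzmann inversion. It is not necessarily the exact procedure of the paper: it assumes the samples follow p(s) proportional to exp(-beta * E(s)), that the energy gap `delta_e` is known in the units of the submitted problem, and it uses synthetic samples in place of hardware reads. The function name `estimate_beta` is hypothetical.

```python
import math
import random

def estimate_beta(samples, delta_e):
    """Estimate beta from two-state samples (+1 / -1).

    delta_e is E(+1) - E(-1). For a Boltzmann distribution,
    p(+1) / p(-1) = exp(-beta * delta_e), so beta = ln(n_down / n_up) / delta_e.
    """
    n_up = sum(1 for s in samples if s == +1)
    n_down = len(samples) - n_up
    if n_up == 0 or n_down == 0:
        raise ValueError("both states must appear in the samples")
    return math.log(n_down / n_up) / delta_e

# Synthetic stand-in for hardware samples (assumed values, for illustration only).
rng = random.Random(0)
beta_true = 2.0
h = 0.5                      # bias on a single spin, with E(s) = h * s
delta_e = 2 * h              # E(+1) - E(-1)
p_up = 1.0 / (1.0 + math.exp(beta_true * delta_e))
samples = [+1 if rng.random() < p_up else -1 for _ in range(10_000)]

beta_hat = estimate_beta(samples, delta_e)
print(f"estimated beta: {beta_hat:.3f} (true value {beta_true})")
```

For the strongly coupled block, the same formula would apply with `delta_e` taken as the energy difference between the all-up and all-down states.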

Using this method, you could obtain data such as that in the annealing schedules available on this page:
https://support.dwavesys.com/hc/en-us/articles/360005267253-QPU-Specific-Anneal-Schedules

These will help you better understand the energies and units of the system, which you asked about earlier.

https://docs.dwavesys.com/docs/latest/doc_physical_properties.html

This documentation might also be useful if you haven't yet come across it:
https://docs.dwavesys.com/docs/latest/c_qpu_1.html?highlight=ice

It outlines sources of error inherent to the QPU.

I hope this information was helpful.
Please let us know if you have any questions.

• Hello,
Thanks for your reply. I will look into those resources and let you know whether they solve my problem.

Have a nice day,
TP