Sampling from a given Boltzmann distribution using a D-Wave computer
I am currently researching Restricted Boltzmann Machines (RBMs), a kind of stochastic neural network whose training and use require sampling from a Boltzmann distribution whose energy function is a QUBO. Several publications have shown that D-Wave machines can be used for this, avoiding the exponential cost of a brute-force calculation of the probabilities of the exponentially many states.
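For concreteness, here is roughly how I build the QUBO from the RBM parameters (a sketch with made-up toy dimensions and weights; the variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 3, 2
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # coupling weights
a = rng.normal(scale=0.1, size=n_visible)              # visible biases
b = rng.normal(scale=0.1, size=n_hidden)               # hidden biases

# RBM energy E(v, h) = -a.v - b.h - v^T W h, which is a QUBO over the
# concatenated binary vector x = (v, h).
Q = {}
for i in range(n_visible):
    Q[(i, i)] = -a[i]                          # linear terms, visible units
for j in range(n_hidden):
    Q[(n_visible + j, n_visible + j)] = -b[j]  # linear terms, hidden units
for i in range(n_visible):
    for j in range(n_hidden):
        Q[(i, n_visible + j)] = -W[i, j]       # quadratic couplings

def energy(x, Q):
    """Evaluate the QUBO energy of a binary assignment x."""
    return sum(c * x[i] * x[j] for (i, j), c in Q.items())
```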
However, as I try to implement D-Wave-based RBMs myself, I am running into some problems and would appreciate some insight. I have already looked at the documentation, yet some questions remain.
1. There is a difference between the mathematical QUBO I define in Python and the physical QUBO that is actually implemented. Such a difference must exist for several reasons:
- A priori, the QUBO I define can have arbitrarily large coefficients, while a physical system cannot have arbitrarily large energies. Thus, my QUBO must be rescaled at some point.
- The QUBO written in Python is physically dimensionless, while an actual energy has a physical dimension (ML^2T^-2). Thus, an energy scale must be involved.
- The physical Boltzmann distribution depends on temperature through a factor 1/(kB*T).
To summarize: denoting by H the QUBO defined in Python, what the machine actually implements and samples from is some \beta_eff*H. A positive \beta_eff leaves the ground states unchanged, and therefore still allows the ground state of my mathematical QUBO to be found, but the Boltzmann distribution itself is altered by this factor. How can I prevent this from happening (i.e. force \beta_eff = 1), or efficiently deduce my target Boltzmann distribution from the sampled one? I am skeptical about the second option: there are as many probabilities as states, which is an exponential number, so such a deduction seems to come at the cost of an exponentially large computation.
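To make the second option concrete, my current idea (a sketch, with the hardware replaced by a numpy simulation of the tilted distribution, and only feasible here because the toy instance is small enough to enumerate) is to estimate \beta_eff from the sample mean energy, using the fact that the exact expectation <E>(\beta) is strictly decreasing in \beta, and then resubmit H/\beta_eff:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Toy 3-variable QUBO, enumerated exactly (only possible for small N).
n = 3
Q = {(0, 0): -1.0, (1, 1): 0.5, (2, 2): -0.3, (0, 1): 1.2, (1, 2): -0.8}
states = np.array(list(itertools.product([0, 1], repeat=n)))
E = np.array([sum(c * s[i] * s[j] for (i, j), c in Q.items()) for s in states])

def mean_energy(beta):
    """Exact <E> under the Boltzmann distribution exp(-beta * E) / Z."""
    w = np.exp(-beta * E)
    return (w * E).sum() / w.sum()

# Stand-in for the hardware: it samples from beta_eff * H, beta_eff unknown.
beta_eff_true = 2.0
p = np.exp(-beta_eff_true * E)
p /= p.sum()
samples = rng.choice(len(E), size=100_000, p=p)
emp_mean = E[samples].mean()

# <E>(beta) is strictly decreasing in beta (its derivative is -Var(E)),
# so bisection on the empirical mean energy recovers beta_eff.
lo, hi = 1e-3, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_energy(mid) > emp_mean:
        lo = mid
    else:
        hi = mid
beta_eff_est = 0.5 * (lo + hi)
# With beta_eff known, submitting H / beta_eff would make the machine sample
# from beta_eff * (H / beta_eff) = H, i.e. the target distribution.
```

The catch, of course, is that this calibration uses exact enumeration, which is exactly what I want to avoid for large N; hence my question about whether \beta_eff can be controlled directly instead.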
2. Estimating a probability distribution by statistical sampling requires enough samples for the law of large numbers to converge. As a first approximation, if N is the number of qubits in my QUBO, there are 2^N possible states, so 100*2^N is a reasonable estimate of the number of samples needed for that convergence. However, when trying to sample from a distribution, I hit an error stating that num_reads is capped at 10,000, which (by the previous estimate) allows no more than 6 qubits, not enough for me. I thought of simply making several batches of 10,000 reads and concatenating them together in Python; however, this forum thread implies that this won't work. I have read the D-Wave documentation mentioned in the answer, but I still do not really understand the problem. Moreover, I cannot characterize mathematically the difference between, as the author put it, one call at num_reads=1000 and 10 calls at num_reads=100; this is a problem for me, because it also means I cannot, for example, make several calls at num_reads=10000 and then apply some mathematical transformation to the samplesets before concatenating them in order to recover my desired statistical distribution. So: what causes this phenomenon, what is its impact from a mathematical point of view, and is there a way to counter it? If not, how can I use a D-Wave computer to get the statistical distribution I am interested in?
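For reference, here is what I mean by concatenating batches. In this numpy simulation (which stands in for the QPU and assumes perfectly independent, identically distributed reads, which is presumably exactly the assumption the hardware violates), 10 calls of 1,000 reads and one call of 10,000 reads give the same empirical distribution up to sampling noise:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the QPU: i.i.d. draws from a fixed 4-state distribution.
p = np.array([0.5, 0.25, 0.15, 0.10])

def fake_sampler(num_reads):
    """Simulated sampler returning `num_reads` state indices."""
    return rng.choice(len(p), size=num_reads, p=p)

# One call with num_reads=10,000 ...
one_call = fake_sampler(10_000)
# ... versus 10 calls with num_reads=1,000 each, concatenated in Python.
batched = np.concatenate([fake_sampler(1_000) for _ in range(10)])

freq_one = np.bincount(one_call, minlength=len(p)) / len(one_call)
freq_batched = np.bincount(batched, minlength=len(p)) / len(batched)
# Under the i.i.d. assumption, both empirical distributions agree with p up
# to O(1/sqrt(num_reads)) noise; my question is what breaks this equivalence
# on real hardware (drift between calls, correlations between anneals, ...).
```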
Thanks in advance for your help