Freezing script when using EmbeddingComposite

Oleksii Z

June 22, 2023 23:06

Hi there!

Dear community, I need help in a second step of investigating quantum computing for me.

I am trying to use DWAVE for middle-size problems since I had some success in a small toy problem :)

Now trying to solve

Ax = b

without any constraints, but when A has shape 500x900 and b has 500 size.

So actually trying to find a solution for 900 binary variables.

It works fine (with some success) using SimulatedAnnealingSampler, but when I try to use EmbeddingComposite sampler - running the script just freezes on the line ..

sampleset = sampler.sample(bqm = bqm,...

No error messages, just waiting.. and nothing happens.

when I run the same part of code but using SimulatedAnnealingSampler - everything works, but I don't find solution I am looking for even close.

I wanted to run it using QPU to see and compare speed, try to read about 100 times to get better results. But it freezes even with the small num of reads.

What am I doing wrong or where should I look to resolve this issue?

Here is a code to reproduce the issue:

import numpy as np
import time
import os
import json
import pandas as pd
import matplotlib.pyplot as plt
import dimod
from dwave.system import LeapHybridCQMSampler
from neal import SimulatedAnnealingSampler
from dwave.system import DWaveSampler, EmbeddingComposite
import dwave.inspector

# write your token to run on dwave..
token='DEV-your-code-here'

# cleaning the screen
os.system('clear')

# objective function for direct calculation
# chi2 for (Ax-b)
def obj_f(A, b, x):
    return (np.dot(x.T, np.dot(A.T, np.dot(A, x))) - 2 * np.dot(b.T, np.dot(A, x)) + np.dot(b.T, b))

# the same with the term using sum of x
# L(x, λ) = (x'A'Ax - 2b'Ax + b'b) + λ(sum(x) - K)^2
def obj_f_constrained(A, b, x, K, Lagrange):
    return (np.dot(x.T, np.dot(A.T, np.dot(A, x))) - 2 * np.dot(b.T, np.dot(A, x)) + np.dot(b.T, b) + Lagrange * (np.sum(x) - K)**2)

def plot_data_sol_cand(title, A, b, x_true, x_cand, mode=0):

    x = np.arange(len(b))
    fig, ax = plt.subplots()
    ax.plot(x, b, label='b = data')
    obj_true = obj_f(A, b, x_true)
    obj_cand = obj_f(A, b, x_cand)
    true_sol_res = np.dot(A, x_true)
    cand_sol_res = np.dot(A, x_cand)
    ax.plot(x, true_sol_res, linewidth = 2, label='true_sol, of = '+str(obj_true))
    ax.plot(x, cand_sol_res, linewidth = 2, label='cand_sol, of = '+str(obj_cand))
    if (mode==1):
        # showing each candidate separately
        cand_non_zero_indexes = np.nonzero(x_cand)[0]
        for i in range(0, len(cand_non_zero_indexes)):
            sec_y = A[:, cand_non_zero_indexes[i]]
            label = '$ A_{'+str(cand_non_zero_indexes[i])+'}$'
            ax.plot(x, sec_y, linewidth=0.7, linestyle='--', color='red', alpha=0.6, label = label)
    ax.set_title(title)
    ax.set_xlabel('Index')
    ax.set_ylabel('$ data, \sigma_{true} =  Ax_{true}, \sigma_{sol} = Ax_{cand} $')
    ax.grid(True)
    ax.legend()
    plt.show()
    return ax

###
### Input data to play with
###
A_link = 'https://drive.google.com/uc?export=download&id=1RcO5ge_8fEI7zBQlXQ-ucrDX8OqhSTEU'
b_link='https://drive.google.com/uc?export=download&id=1eEGXRj80YqjtEZohsTZj_Wp1NbMWZ-NO'
#A_df = pd.read_excel('./docs/oziv_ax_b/2_data_a_b/A.xlsx', header=None)
A_df = pd.read_excel(A_link, header=None)
#print(A_df)
print()
# to numpy
A = A_df.values
print('A')
print(A.shape)
print(A.nbytes / (1024 * 1024), ' Mb')

#b_df = pd.read_excel('./docs/oziv_ax_b/2_data_a_b/b.xlsx', header=None)
b_df = pd.read_excel(b_link, header=None)
print(b_df)
print()
b = b_df.values
print('b')
print(b.shape)
print(b.nbytes / (1024 * 1024), ' Mb')

#reshaping
b = b.flatten()
print(b)
print(b.shape)
print(b.nbytes / (1024 * 1024), ' Mb')

input1 = input("Press Enter to continue... ")

# true solution
true_sol_indexes = [288, 463, 809]
x_true = np.zeros(A.shape[1])
x_true[true_sol_indexes] = 1
x_cand = np.zeros(A.shape[1])
true_sol = np.dot(A, x_true)

###
### Solving unconstrained problem
###
print('Unconstrained problem...')
print()

print('Calculating Q, c, c_free...')
# calculate quadratic biases
Q = np.dot(A.T, A)
# print('Q')
# print(Q.shape)
# print(Q)
# calculate linear biases
c = -2 * np.dot(A.T, b)
# print('c')
# print(c.shape)
# print(c)
# constant that can be neglected but for clearness we need to add it
c_free = np.dot(b.T, b)
# print('c_free')
# print(c_free.shape)
# print(c_free)
num_reads = int(input('Enter the number of reads: '))
# create binary quadratic model for the objective function
print('Creating a BQM model...')
bqm = dimod.BinaryQuadraticModel(c, Q, c_free, dimod.BINARY)
input1 = input("Press Enter to continue... ")

# if we want to simulate
# sampler = SimulatedAnnealingSampler()

# using EmbeddingComposite
sampler = EmbeddingComposite(DWaveSampler(token = token))
print('Using sampler:', type(sampler).__name__)

print('Sampler parameters:')
print(sampler.parameters)
input1 = input("Press Enter to continue... ")
start_time = time.time()
# sample
sampleset = sampler.sample(bqm = bqm,
                           num_reads=num_reads,
                           chain_strength=10,
                           label = 'QUBO Unconstrained opt., chi2 with constant'
                           )
end_time = time.time()
calc_time = end_time - start_time

#dwave.inspector.show(sampleset)

print('calc time, s:', calc_time)
input1 = input("Press Enter to continue... ")
print()
print('Set of solutions:')
print(sampleset)
print()
print('Sampleset record:')
print(sampleset.record)
print()
input1 = input("Press Enter to continue... ")
print()
print('Sampleset data:')
print(sampleset.data)
print()
input1 = input("Press Enter to continue... ")
print()
print('Info:')
print(json.dumps(sampleset.info, indent=4))
print('Calc time (dep. on Sampler):')
if (type(sampler).__name__ == 'SimulatedAnnealingSampler'):
    print(calc_time, 's')
elif (type(sampler).__name__ =='EmbeddingComposite'):
    print("QPU access time:", sampleset.info['timing']['qpu_access_time'], "microseconds")
print()
print('Taking the first sample as a solution:')
x = sampleset.first
print('x is: ')
print(x)
x_dict = sampleset.first.sample
x_unc_np = np.array(list(x_dict.values()))
print()
print()
print('obj_f value for best solution: ', obj_f(A, b, x_unc_np))
print()
print('Solution:')
print(x_unc_np)
nonzero_cnt = np.count_nonzero(x_unc_np)
non_zero_indexes = np.nonzero(x_unc_np)
print(f'Solution has {nonzero_cnt} components:')
print(non_zero_indexes)
input1 = input("Plotting to compare... ")
plot_data_sol_cand(f'Unconstrained opt results, reads = {num_reads}, t = {calc_time} s', A, b, x_true, x_unc_np, mode = 1)
print()

name = input("Press Enter to proceed with constrained solve? ")

Comments

4 comments

Oleksii Z (Report)
June 26, 2023 22:37 (Edited June 26, 2023 22:38)

Maybe the problem is because of the problem sizing? so it just can't be embedded because of the size?

when I tried to run the same problem (or even added constraints, so in this case the size of the problem is bigger) but used hybrid solver - it works... the problem with 900 binary variables seems to big?

where should I look to solve that issue?

0

Comment actions Permalink
David J (Report)
June 29, 2023 05:05

Hello,

It sounds like it might be the case that the problem is too large to be embedded.

The embedding step can take quite some time and might run for a long time.

Do you get an error message of some kind?

You can also try running the find_embedding function directly to see how long it takes.
See this post for an example of how to do this.

If you are able to run the problem on HSS but not on QPU, you could be running into size limitations.

0

Comment actions Permalink
Oleksii Z (Report)
June 29, 2023 20:08
Yes, I have modified the code and it seems the problem is too big..

Here is what I have got for my data & bqm.

part of code:
...
import networkx as nx
import dwave_networkx as dnx

G = dimod.to_networkx_graph(bqm=bqm) #bqm.to_networkx_graph()

# using EmbeddingComposite
qpu = DWaveSampler(token = token)

# Create a target Pegasus graph (for D-Wave Advantage system)
T = dnx.pegasus_graph(16)

# size of the G & T respectively
# Number of nodes in G

num_nodes_G = G.number_of_nodes()
print(f'Number of nodes in G: {num_nodes_G}')
# Number of edges in G
num_edges_G = G.number_of_edges()
print(f'Number of edges in G: {num_edges_G}')
# Number of nodes in T
num_nodes_T = T.number_of_nodes()
print(f'Number of nodes in T: {num_nodes_T}')
# Number of edges in T
num_edges_T = T.number_of_edges()
print(f'Number of edges in T: {num_edges_T}')

input1 = input("Press Enter to continue... ")

embedding = minorminer.find_embedding(G.edges(), T.edges())
# embedding = find_clique_embedding(bqm.variables, qpu.to_networkx_graph())

num_qubits = sum(len(chain) for chain in embedding.values())
print('Approx. num of qubits:', num_qubits)
The problem is that it freezes on the line
```
embedding = minorminer.find_embedding(G.edges(), T.edges())
```
Before that I have output

Unconstrained problem...
Calculating Q, c, c_free...
Enter the number of reads: 10
Creating a BQM model...
Number of variables in BQM: 900
Number of interactions in BQM: 404550
Press Enter to continue...
Backend TkAgg is interactive backend. Turning interactive mode on.
Number of nodes in G: 900
Number of edges in G: 404550
Number of nodes in T: 5640
Number of edges in T: 40484

How to estimate if the problem is to large for solving on the QPU (even for embedding)?

Am I correct that now size of the problem is too large & I need to make it smaller?

Or - just use the Hybrid Solver for large-sized problems? What are the recommendations in this case?

I have tried to use CQM for the same data added some simple constraints, and it worked.

How in this case ConstrainedQuadraticModel & LeapHybridCQMSampler solve this problem? It seems
it breaks it into small parts and some of the parts are solved traditional way (using cpu & GPU) and some of the parts on the QPU? Because I see that it uses QPU small amount of time... Are there any requirements on the size of the problem, e.g. when you have created your QUBO/bqm?
0

Comment actions Permalink
David J (Report)
July 01, 2023 01:47

For information on solver-specific information you can refer to the properties and parameters section of the documentation. In your case you will want to look at the problem parameters section to see how many variables, constraints, etc. that are available for a given solver.
This is more relevant for HSS solvers, as with a QPU it will depend on the degree of interaction between variables.

The interaction between variables is what drives the embedding step. Since the degree to which variables interact is problem-dependent, it is difficult to say what the maximum number of variables available on a given QPU solver for a given problem type is.

You can get a lower bound, which would be the maximum fully connected problem that can be embedded on a given QPU solver. For this, you can use the DWaveCliqueSampler as a quick shortcut. You can check the largest_clique_size property for this information.

For the absolute maximum problem size, you can look at the number of qubits available on a given QPU, but the problem type will be dependent to the geometry of the QPU itself, so we would assume that no embedding happens but that the problem is the exact same shape as the QPU.

This is rarely the case, but to get a better idea of the QPU shape and what embedding looks like, check out the topologies and embedding sections of the D-Wave system documentation.

Unfortunately the inner workings of the HSS solvers is proprietary so we can not share anything about their inner workings. They do break the problem down and allocate quantum resources, which is described at a high level in our this Knowlege Base article alongside some information about what kinds of problems each HSS solver is used for.

For the HSS solvers, they are not restricted to a specific hardware topology in the same way as the QPUs, so the problem formulation can be more flexible, and the degree of connectedness and number of variables does not play as large of a role. In short, you do not need to embed against a specific topology with the HSS solvers.

In your case the problem is too big for the QPU solvers.
You can also go to the Leap Dashboard, scroll down to the Solvers section and click on the solver you are interested in. Solver properties, such as maximum number of biases and constraints are available in the summary modal.

Hopefully we have addressed all of your questions, but if you need any clarification or have follow-up inquiries, please feel free to post again and we will try to point you in the right direction.

0

Comment actions Permalink

Please sign in to leave a comment.

Didn't find what you were looking for?

New post