Troubleshooting poor results

Mike Z

June 11, 2019 00:53

Hi,

I am attempting to reproduce a Job-shop Scheduling Problem from a paper linked to in the Problem-Solving Handbook, https://arxiv.org/pdf/1506.08479.pdf

To start, I am using the small 3 x 3 instance defined in Figure 1-a. I have implemented the constraints as they are described in Section II, but I have not done any variable pruning (Section II-B) or any other refinements (Section III). These are your standard JSP constraints: each operation starts once and only once, there can only be one operation running on a machine at any time, and operations of a job cannot start until their preceding operations have completed.

To verify that the CSP was correct, I used the simulated annealing sampler, which returned a valid solution for every 1 in 2 anneals. I moved on to using the quantum annealer, and have not yet received a single valid solution to the CSP. The chain break fraction for the samples was typically within 0.5 - 0.6, so I tried increasing the chain_strength, and while that did decrease the average chain break fraction, all of the results were still invalid. I've also tried modifying some other parameters, such as the min_classical_gap, anneal_schedule, and anneal_offset, and still no dice.

Does anyone have advice on determining why I am not getting good results with the QA? Maybe there is some documentation on the debugging poor results that I haven't seen yet?

Comments

10 comments

David J (Report)
June 11, 2019 01:26 (Edited March 31, 2023 05:01)

Sometimes longer chains will freeze out before other qubits, effectively biasing their results.
One area with the anneal_offset that you can look at is shifting the anneal schedule of these longer chains to a later time.

Extending the anneal time might also help to give better results.

Also, pruning variables can drastically improve the results in some cases.

Take for example the case where coupler biases of zero are embedded onto the QPU.

In this case, potentially long chains are embedded onto the QPU, the problem biases are scaled down and take less effect, and many of the qubits might not be in use, even though they are embedded, because their coupler strength is zero.

When this happens, it is better to just include the qubits and couplers relevant to the problem in the input dictionary (h & j values).

There is also a sample job shop scheduler that one of our employees has been working on:
https://github.com/dwave-examples/job-shop-scheduling

This is a bit of a tricky problem to formulate, and I am not sure how well the problem runs just yet, but you might want to take a look at it for inspiration at the least.

I hope this was helpful for now, and I will try to update with more information as it becomes available.

0

Comment actions Permalink
Florian H (Report)
June 11, 2019 09:07

The Github link is not working. Maybe the repository is private. Could you provide another link / change access rights?

0

Comment actions Permalink
Alejo C (Report)
June 11, 2019 12:57

I think they moved it to the demos repository, this link has worked for me:

https://github.com/dwave-examples/job-shop-scheduling

2

Comment actions Permalink
Mike Z (Report)
June 11, 2019 15:30

David,

Regarding the coupler and qubit biases, I was under the impression that manually setting these values becomes impracticable with size. Is this something that is typically done programmatically for problems?

I also did some testing on my JSP formulation as well as the one you linked (I had the same issue as Florian, but Alejo's link worked - Thanks Alejo). They look similar, but I haven't gone that much in depth. One difference I can spot is it appears the linked JSP does some variable pruning. This results in the csp having 7 fewer logical variables, 10 fewer ancillary variables, and 117 fewer constraints than my implementation.

I just did a bit of testing with the anneal_offsets. I specified negative offset values and only used qubits whose offset_range could support the offset. For the arbitrarily picked offset values of -0.05, -0.1, and -0.12 (beyond which finding a valid embedding became difficult) the mean chain break fraction of 5000 samples each did decrease, from .59 to .52 to .50. There was no appreciable change in the mean energy. As I mentioned, I can't really go to a lower offset than -.12, so I was wondering if these were the improvements expected by changing the anneal offset?

I also tried the JSP in the repo and observed similar performance to my implementation. The min and mean energies (6 and 12) and chain_break_fractions (.26 and .34) were indeed lower, but still no valid schedule. However, using the SimulatedAnnealingSampler, I frequently got valid schedules.

I'm planning to get to variable pruning in the next couple of days and see how much of an improvement there is.

For setting the anneal schedule, are there any guidelines for choosing times, pauses, etc?

0

Comment actions Permalink
David J (Report)
June 12, 2019 02:38 (Edited March 31, 2023 05:14)

Unfortunately there are no general guidelines you can follow to get better results.
Each problem is slightly different and will require some tailoring to optimize it on the QPU.

Generally speaking, longer chains should be annealed later, and a longer anneal time will yield better results.
Also, having as few chains as possible, and having chains of equal length is preferable.
The last two points are a little harder to do programatically, but the first two points could compensate at least in part.

Here is a link to a community post with some related information:
https://support.dwavesys.com/hc/en-us/community/posts/360017676493

Pruning the variables will also yield much better results, as there will be unused variables that will increase overall chain length in order for all of the variables to embed properly.

One way to prune variables is to systematically remove impossible schedules.
In some cases, steps need to happen in a sequence, so ruling out sequences that are out of order would be one way of pruning.
This might be possible to do using time constraints, such as not allowing a later step to appear at the beginning of the schedule.

It might be a good idea to look at which constraints are always breaking and weight that constraint heavier.

Sometimes some formulations of the problem are more effective than others as well, due to energy distributions.
It might be a good idea to consider trying alternative or additional constraints to the ones you are currently using.

This is actually a hard problem to formulate for the QPU, since the schedule is highly restrictive, and requires a specific "one-hot" solution.

This paper talks about scheduling and one-hot problems:

https://www.dwavesys.com/media/gq0j44o5/25_partition-of-large-optimization-problems-with-one-hot-constraint.pdf

It might be interesting and useful. Part of their approach was to break the problem up into subproblems.

Another option is to use combinations in addition to dwave-hybrid.
dimod.generators.constraints.combinations()
Here is a link to combinations:

https://docs.ocean.dwavesys.com/en/stable/docs_dimod/reference/generated/dimod.generators.combinations.html

Here is a link to dwave-hybrid:
https://support.dwavesys.com/hc/en-us/community/posts/360029426154

The reason this would be effective is that combinations doesn't limit the qubit and coupler biases to the range of the QPU, in turn creating less auxiliary variables.

This is a new idea that has not been tested, however, and the results might vary.

In this case you would only want to use dwave-hybrid with the decomposer/composer and qpu solver.

You could use the tabu solver on smaller problems as a proof of concept, but since it is a classical computer solver, it will slow down dramatically on larger problems.

I would suggest moving forward with solutions on the QPU as you are doing, tweaking the anneal schedule, and pruning the variables where possible first before trying the dwave-hybrid solution.

I will update you with any additional information as it becomes available.

Thank you for your patience!

I hope this was helpful!

Feel free to reach out with any questions or for clarification.

0

Comment actions Permalink
Florian H (Report)
June 28, 2019 13:18

Hey Mike

I just found this paper:
https://arxiv.org/abs/1810.05881

They could heavily increase the quality of their results by inserting a pause in the schedule. (see Figure 5 and 6)

You mentioned that you already changed the anneal schedule but maybe not in the same way those guys did.

Maybe you can try something similar with your problem.

1

Comment actions Permalink
Mike Z (Report)
July 01, 2019 01:39

Thanks for the link.

I've made quite a bit of progress and now that I'm getting good results, I plan to compare the different parameters again. I'll add pausing to the list.

By the end of this week I plan to update with the effects that various pruning and parameters have had on my results, I'm curious if it matches up with expectations.

1

Comment actions Permalink
Mike Z (Report)
July 09, 2019 12:03

So as I mentioned, I've made a bit of progress since the first post. For now, I'd only like to show what I've found when varying embeddings. Each configuration has its own level of variable pruning (3 total), and embedding size (based only on total # of qubits used, 30 unique per pruning method).

Concerning variable pruning, I really underestimated the effect it would have. I compared three methods of pruning: none, full (a task considering its position in a job and other task's processing time), and a limited pruning approach which is similar to no pruning for my test case.

I only received valid schedules when using full pruning. I was surprised that some embeddings seem to have performed considerably better than others that were similar in the total number of qubits used for the embedding (I have not yet looked further into the specific embeddings).

I posted this in another thread, but I also did a small test on chain strength for some of the above embeddings.

I did mention testing anneal offsets previously, but I don't have anything to show for it now. However, I have a few questions concerning how to best apply the parameter. My understanding is that for longer chains, delaying (negative offset) the anneal will be beneficial as the tunneling energy will be higher at a given point when compared to a standard offset, resulting in longer chains less likely to freeze out.

First, am I interpreting this correctly?

Second, the following is taken from the description of anneal offsets (https://docs.dwavesys.com/docs/latest/c_qpu_annealing.html#anneal-offsets) "Anneal offsets can also be useful in embedded problems with varying chain length: longer chains may freeze out earlier than shorter ones—at an intermediate point in the anneal, some variables act as fixed constants while others remain undecided. If, however, you advance the anneal of the qubits in the shorter chains, they freeze out earlier than they otherwise would. The correct offset will synchronize the annealing trajectory of shorter chains with that of the longer ones." Does this mean that offsets should be set differently for each chain?

Also, what is the recommended way to use anneal_offset? Currently I call minorminer.find_embedding directly with the target graph being the sampler graph excluding any qubits where its offset range does not support the offset that I am trying to anneal with. Are there any tools in Ocean that do this?

Thanks for the help and advice so far!

0

Comment actions Permalink
David J (Report)
July 09, 2019 17:37

Hello,

So for longer chains, the offset should actually be delayed, rather than sooner.

As the excerpt you included says, you longer chains should be delayed later in the anneal schedule, while shorter chains should also be delayed, but not as much.

So, yes chains of different lengths should have different offsets.

This is something that can be adjusted and fine tuned.

It might require a little trial and error for each unique embedding.

As for embedding, the way you describe it at the end of your last comment here is currently the best way to do it.

Excluding the nodes that don't have the range you need is clever.

Short of creating the embedding manually yourself starting from the Chimera graph itself, I think this is among the best methods we have available.

0

Comment actions Permalink
Florian H (Report)
August 14, 2019 12:09

Hey Mike

I continued my search for valid solution and found it useful is to make a grid search over the parameters chain_strength and "lagrange". The lagrange parameter was in my case a factor that controls the energy gap between valid and invalid solutions. (e.g. used here

https://github.com/dwavesystems/dwave_networkx/blob/33dcab42909d0d0557f965dfbfbc5d7abe0f10b2/dwave_networkx/algorithms/tsp.py#L120 )

The plots shows the fraction of valid solutions found. In the bottom right corner the chain strength is too small, such that embedded problem will have a new minimum.

So actually it seems very unlikely that one finds the sweet spot of an embedding with a simple search through different chain strengths.

1

Comment actions Permalink

Please sign in to leave a comment.

Didn't find what you were looking for?

New post