How to prune variables for large-scale models?

I understand the concept of pruning variables, as described, for example, [here](https://arxiv.org/pdf/1506.08479.pdf) and in this question (https://support.dwavesys.com/hc/en-us/community/posts/360031592014). But I'm not quite sure how to implement it. The implementations I've seen simply keep all of the original variables and fix the pruned ones to 0 if they're infeasible (e.g. https://github.com/dwave-examples/job-shop-scheduling/blob/master/job_shop_scheduler.py).

But in my case there are too many variables (something like 10^6) to even add them to the model in the first place. If I run a list comprehension to add them, it runs for too long. What is the proper way to handle this? Is there either:

  1. A proper way to add 10^6 variables to a model?
  2. A better way to prune variables *beforehand*, so that we only add the necessary variables to the model?

The best solution I can think of is to precompute a list that stores the indices of all the variables that remain after pruning, then create the model with only those indices. When forming the constraints, we can check whether each variable is in that list and, if not, leave it out of the constraint.
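
For concreteness, here's a rough sketch of what I have in mind (using a set instead of a list so the membership check is O(1); `is_feasible` and the constraint loop are placeholders for the actual problem):

```python
import dimod

N = 10**6

def is_feasible(i):
    """Placeholder pruning test; the real check is problem-specific."""
    return i % 10 == 0

# Precompute the surviving indices once, before touching the model.
# A set makes the membership checks below O(1) instead of O(len(list)).
keep = {i for i in range(N) if is_feasible(i)}

linear = {i: 0.0 for i in keep}
quadratic = {}

# When forming constraints, skip any term involving a pruned variable.
for i in keep:
    if (i + 1) in keep:                     # toy pairwise constraint
        quadratic[(i, i + 1)] = 2.0

bqm = dimod.BinaryQuadraticModel(linear, quadratic, 0.0, dimod.BINARY)
```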

Is there another, better way to prune variables for large models?   


Comments

  • Hey Jordan

    I saw your comment on my other topic. To be honest, I don't have much experience with large models like the kind you have, so I don't know if I will be able to help much.

    If I recall correctly, my workflow was as you described: first I created the BQM with all variables, and then pruned some by fixing their values later. Since my problem size was small, I picked this order simply out of convenience. If you are already having trouble creating a BQM with that many variables, it's probably better to avoid creating the pruned variables in the first place than to create them and then prune them.

    I also remember using some libraries (probably D-Wave Binary CSP or PyQUBO) to create BQMs. The time it took those libraries to convert a problem description to a BQM could be pretty long, so I switched to directly calculating the linear and quadratic biases, storing the values in dicts, and then creating a BQM from those dictionaries. This was much faster and might be worth looking into if you are not already doing it.
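
    Roughly, the pattern was something like this (a minimal sketch; the loop is a stand-in for whatever actually computes your biases):

    ```python
    import dimod

    linear = {}      # variable -> linear bias
    quadratic = {}   # (u, v) -> quadratic bias

    # Accumulate biases directly in plain dicts; this avoids the overhead
    # of going through a symbolic modeling layer for every term.
    for i in range(1000):                     # placeholder problem terms
        linear[i] = linear.get(i, 0.0) + 1.0
        if i + 1 < 1000:
            quadratic[(i, i + 1)] = quadratic.get((i, i + 1), 0.0) - 2.0

    # One constructor call turns the dicts into a BQM.
    bqm = dimod.BinaryQuadraticModel(linear, quadratic, 0.0, dimod.BINARY)
    ```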

  • Thanks for your response. If I understand you correctly, your approach is essentially to compute the actual entries of the QUBO matrix and then use those to create the BQM (let's call that approach 1). From the job-shop scheduling example it seems you can also enter constraints algebraically, and `dwavebinarycsp` can update the constraints automatically when you fix a variable to a particular value (using the method `fix_variable`); call that approach 2. Are you saying that approach 1 was much faster than approach 2?
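
    For concreteness, approach 2 would be something like this (a toy equality constraint standing in for the real ones):

    ```python
    import operator
    import dwavebinarycsp

    csp = dwavebinarycsp.ConstraintSatisfactionProblem(dwavebinarycsp.BINARY)
    csp.add_constraint(operator.eq, ['a', 'b'])  # toy constraint: a == b

    # Fixing a variable updates the constraints that mention it.
    csp.fix_variable('a', 1)

    bqm = dwavebinarycsp.stitch(csp)  # convert the CSP to a BQM
    ```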

  • Yes, at that time I found approach 1 to be faster than approach 2. Since that was some time ago, and formulation time is always an important consideration for these libraries, I would encourage you to test both yourself in case there have been changes that affect formulation time.

    Also, I believe `fix_variable` is a method of a dimod BQM, so you can use it in approach 1 as well, in case you do end up creating variables that you later want to prune.
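
    For example (a toy BQM just to show the call):

    ```python
    import dimod

    bqm = dimod.BinaryQuadraticModel({'x': 1.0, 'y': -1.0},
                                     {('x', 'y'): 2.0},
                                     0.0, dimod.BINARY)

    # Fixing y removes it from the model and folds its contribution
    # into the remaining biases and the offset.
    bqm.fix_variable('y', 0)
    print(bqm.variables)  # only 'x' remains
    ```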

