I've been getting into linear programming in Python latetly, and I created my first optimization alrogithm with PuLP.
I am dealing with a scheduling problem for a production process. The goal is to minimize production cost per day, by creating an ideal production schedule for each hour of the day, and create this schedule for all days in the year.
The problem I am experiencing is that execution of the algorithm takes very long (several hours) and often gets stuck. Also, I have the feeling it gets slower as time progresses.
I am hoping to get advice on how I can improve the performance of my code.
My approach to the problem:
I am dealing with 3 production asset ('a', 'l' and 'o'), each of which have several production modes. I defined each asset-mode combination as an 'option', resulting in 14 options in total. Each option can vary every hour and has an integer value (amount of production) and a binary value (on/off), resulting in around 14 x 2 x 24 = 672 variables. The problem contains around 1250 constraints.
My code has 200+ lines so I am a bit hesitant to share all of it here, but I'll share the most important bits below.
Defining supply options:
def set_supplyoptions():
cols = ['option', 'min_capacity', 'max_capacity']
options_list = [{'option':'o', 'min_capacity': 0, 'max_capacity':146},
{'option':'l30', 'min_capacity': 0, 'max_capacity':30},
{'option':'l50', 'min_capacity': 31, 'max_capacity':50},
{'option':'l90', 'min_capacity': 51, 'max_capacity':90},
{'option':'l150', 'min_capacity': 91, 'max_capacity':150},
{'option':'l230', 'min_capacity': 151, 'max_capacity':230},
{'option':'a15', 'min_capacity': 0, 'max_capacity':15},
{'option':'a30', 'min_capacity': 0, 'max_capacity':30},
{'option':'a45', 'min_capacity': 0, 'max_capacity':45},
{'option':'a60_below53', 'min_capacity': 0, 'max_capacity':52},
{'option':'a_flow_below53', 'min_capacity': 0, 'max_capacity':52},
{'option':'a_flow_above53', 'min_capacity': 0, 'max_capacity':8},
{'option':'l_total', 'min_capacity': 0, 'max_capacity':230},
{'option':'a_total', 'min_capacity': 0, 'max_capacity':60}]
options = pd.DataFrame(options_list, columns=cols)
options = options.set_index('option')
Variables:
# production per supply option per hour
production = pulp.LpVariable.dicts("production",
((hour, option) for hour in hours for option in options.index),
lowBound=0,
cat='Integer')
# status of supply options per hour (in use or not in use)
options_status = pulp.LpVariable.dicts("options_status",
((hour, option) for hour in hours for option in options.index),
cat='Binary')
# use status of A tranches on day (used or not used)
a_status_15 = pulp.LpVariable('a_stat_15', cat='Binary')
a_status_30 = pulp.LpVariable('a_stat_30', cat='Binary')
a_status_45 = pulp.LpVariable('a_stat_45', cat='Binary')
a_status_60 = pulp.LpVariable('a_stat_60', cat='Binary')
Objective function:
opt_model = pulp.LpProblem("Optimizer", pulp.LpMinimize)
opt_model += pulp.lpSum(
#O variable
[0.2*demand.loc[hour, 'price']*production[hour, 'o'] +
#O fixed
3*demand.loc[hour, 'price'] +
#L30
( 12 )*options_status[hour, 'l30']*demand.loc[hour, 'price'] +
#L50
( 2*options_status[hour, 'l50']+0.13*production[hour, 'l50'] )*demand.loc[hour, 'price'] +
#L90
( 15*options_status[hour, 'l90']+0.13*production[hour, 'l90'] )*demand.loc[hour, 'price'] +
#L150
( 8*options_status[hour, 'l150']+0.2*production[hour, 'l150'] )*demand.loc[hour, 'price'] +
#L230
( -3*options_status[hour, 'l230']+0.3*production[hour, 'l230'] )*demand.loc[hour, 'price'] +
#L fixed
( 7*(1-options_status[hour, 'ltotal'])*demand.loc[hour, 'price'] ) +
#A base unit price
(10*production[hour, 'a_flow_below53']+8.88*production[hour, 'a_flow_above53'])*(c/20) for hour in hours]
#A daily fee
+ (a_status_15 * 1000 + a_status_30 * 2000 + a_status_45 * 3 + a_status_60 * 4000)*(c/20))
Constraints:
# Sum of production == Demand
for hour in hours:
opt_model += production[(hour, 'o')] + production[(hour, 'l_total')] + production[(hour, 'a_total')] == int(demand.loc[hour, 'demand'])
# Production <= Max capacity AND => Min capacity
for hour in hours:
for option in options.index:
opt_model += production[(hour, option)] >= options_status[(hour, option)]
opt_model += production[(hour, option)] >= options.loc[option, 'min_capacity'] * options_status[(hour, option)]
opt_model += production[(hour, option)] <= options.loc[option, 'max_capacity'] * options_status[(hour, option)]
# Constraints L
for hour in hours:
opt_model += production[(hour, 'l30')] + production[(hour, 'l0')] + production[(hour, 'l90')] + production[(hour, 'l150')] + production[(hour, 'l230')] == production[(hour, 'l_total')]
opt_model += options_status[(hour, 'l30')] + options_status[(hour, 'l50')] + options_status[(hour, 'l90')] + options_status[(hour, 'l150')] + options_status[(hour, 'l230')] <= 1
# Constraints A
M = 999
for hour in hours:
# total flow = sum of tranches
opt_model += production[(hour, 'a_flow_below53')] == production[(hour, 'a_15')] + production[(hour, 'ap_30')] + production[(hour, 'a_45')] + production[(hour, 'ap_60_below53')]
opt_model += production[(hour, 'a_total')] == production[(hour, 'a_flow_below53')] + production[(hour, 'a_flow_above53')]
# only 1 tranche can be active at the time
opt_model += production[(hour, 'a_15')] <= M * a_status_15
opt_model += production[(hour, 'a_30')] <= M * a_status_30
opt_model += production[(hour, 'a_45')] <= M * a_status_45
opt_model += production[(hour, 'a_60_below53')] + production[(hour, 'a_flow_above53')] <= M * a_status_60
# a_60_above53 can only be active if a_60_below53 == 52
opt_model += production[(hour, 'a_flow_above53')] <= M * options_status[(hour, 'a_flow_above53')]
opt_model += (52 - production[(hour, 'a_flow_below53')] ) <= M * (1-options_status[(hour, 'a_flow_above53')])
# only 1 'mode' can be used per day (corresponding to daily fee)
opt_model += a_status_15 + a_status_30 + a_status_45 + a_status_60 <= 1
LP problems are very fast to solve unless you have a very large number of variables. It is the integer and binary variables which slow you down.
Most likely it is your production variables that are killing you. I would recommend treating these as linear / continuous variables ("linear relaxation"), to get a solution quickly (can then round down/up to get a feasible solution).
The other option might be to increase the allowable optimality gap, or set a timeout limit. The default optimality gap will be quite low which will mean the branch-and-bound solver for the MILP will just keep going. Solvers often obtain good solutions relatively quickly and then spend a huge about of time trying to prove the optimality of the solution. - You can provide a timeout to the solver to get an approximate solution:
prob.solve(pulp.solvers.PULP_CBC_CMD(maxSeconds=120))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With