How to do product mix optimization in real time

Not everything needs to be machine learning! Classical operations research methods can be quite good for optimizing processes, product-mixes, and logistics if you understand the constraints you are operating under. You can even operate on streaming data within an Apache Beam pipeline to achieve real-time control and agility.


Imagine that you have a manufacturing facility that can produce 4 types of products (A, B, C, and D) from 4 ingredients (dye, labor, water, and concentrate). Moreover, the value of producing A (the profit, if you will) is $50, of B is $100, of C is $125, and of D is $40.

Every hour, you get new supplies:

{"dye": 5710, "labor": 870, "water": 22350, "concentrate": 5010}

We want to maximize the total profit:

50A + 100B + 125C + 40D

How much of A, B, C, and D should you manufacture?

Cost function and Constraints

It depends on how much of each ingredient you need to produce the products. Let’s say that you need 50 mg of dye, 5 person-hours, 300 liters of water, and 30 ml of concentrate to make product A. Similarly, you know what you need for the other products:

Product   Dye (mg)   Labor (person-hours)   Water (liters)   Concentrate (ml)
A         50         5                      300              30
B         60         25                     400              75
C         100        10                     800              50
D         50         5                      200              20

Moreover, you cannot ship more than 25 units of B per hour, nor more than 10 units of C per hour.

These are the constraints. Written out, the constraint for dye (since you received 5710 mg of it) takes the form:

50A + 60B + 100C + 50D <= 5710

The bounds for A are (0, inf), whereas for B they are (0, 25).
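To make the dye constraint concrete, here is the inequality evaluated for a candidate plan (the plan itself is made up purely for illustration):

```python
# dye required per unit of A, B, C, D (mg), from the ingredients table
dye_per_unit = [50, 60, 100, 50]
mix = [30, 20, 5, 40]  # a hypothetical production plan

dye_needed = sum(r * q for r, q in zip(dye_per_unit, mix))
print(dye_needed, dye_needed <= 5710)  # 5200 True: this plan fits the dye budget
```

The optimizer's job is to pick the mix that maximizes profit while keeping every such inequality true at once.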

Solving with linear programming

We can solve a constrained optimization problem like this using linear programming. scipy has a handy implementation:

def linopt(materials):
    import numpy as np
    from scipy.optimize import linprog
    # coefficients of the objective function; negate because linprog *minimizes*
    c = -1 * np.array([50, 100, 125, 40])
    # constraints A_ub @ x <= b_ub (could also use A_eq, b_eq, etc.)
    A_ub = [
        [50, 60, 100, 50],     # dye (mg)
        [5, 25, 10, 5],        # labor (person-hours)
        [300, 400, 800, 200],  # water (liters)
        [30, 75, 50, 20]       # concentrate (ml)
    ]
    b_ub = [materials['dye'], materials['labor'],
            materials['water'], materials['concentrate']]
    # per-product bounds: B and C have shipping limits
    bounds = [
        (0, np.inf),
        (0, 25),
        (0, 10),
        (0, np.inf)
    ]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return np.round(res.x)

Given the input materials:

{"dye": 5170, "labor": 700, "water": 29940, "concentrate": 5160}

the optimal product mix is:

[37. 10. 10. 35.]
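Before trusting the answer, it is worth checking the raw LP solution against the constraints. A minimal sanity check, repeating the data from linopt so the snippet is self-contained:

```python
import numpy as np
from scipy.optimize import linprog

# same objective and constraint data as in linopt()
c = -1 * np.array([50, 100, 125, 40])
A_ub = np.array([[50, 60, 100, 50],
                 [5, 25, 10, 5],
                 [300, 400, 800, 200],
                 [30, 75, 50, 20]])
b_ub = np.array([5170, 700, 29940, 5160])
bounds = [(0, np.inf), (0, 25), (0, 10), (0, np.inf)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
assert res.success
# the continuous LP solution never violates a constraint
assert np.all(A_ub @ res.x <= b_ub + 1e-6)
profit = -res.fun  # undo the sign flip to recover total profit
print(np.round(res.x), round(profit))
```

One caveat: rounding res.x to whole units can push a tight constraint slightly past its limit. If exact integer quantities matter, a mixed-integer solver such as scipy.optimize.milp (available in SciPy 1.9+) is the safer tool.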

Apache Beam pipeline

Wrap this in an Apache Beam pipeline, and you can solve the problem for every JSON message you receive on a Pub/Sub topic.

For trying it out, let’s use a text file:

p = beam.Pipeline(options=options)
(p
 | 'ingest' >> beam.io.ReadFromText('input.json')  # hypothetical input file
 | 'optimize' >> beam.Map(lambda x: linopt(json.loads(x)))
 | 'output' >> beam.io.WriteToText('output'))
result = p.run()

Carrying inventory

In reality, we’d want to carry some inventory. If we receive 30 mg of dye and use only 28 mg, we’d keep the remaining 2 mg for the next iteration. We can do that with Apache Beam by maintaining a stateful variable:

class Inventory:
    def __init__(self):
        # assume water and labor cannot be stored in inventory
        self.dye = 0
        self.concentrate = 0

    def update(self, leftover):
        # leftover is ordered [dye, labor, water, concentrate]
        self.dye = leftover[0]
        self.concentrate = leftover[3]

The optimization method is modified to use and update the inventory:

def linopt(materials, inventory):
    ...
    # add carried-over inventory to this hour's supplies
    b_ub = [
        materials['dye'] + inventory.dye,
        materials['labor'],
        materials['water'],
        materials['concentrate'] + inventory.concentrate
    ]
    ...
    qty = np.round(res.x)
    # unused materials carry over to the next timestep
    leftover = b_ub - np.matmul(A_ub, qty)
    inventory.update(leftover)
    return qty

The pipeline is modified to pass the inventory object into the optimization step:

inventory = Inventory()
(p
 | 'ingest' >> beam.io.ReadFromText('input.json')  # hypothetical input file
 | 'parse' >> beam.Map(lambda x: json.loads(x))
 | 'optimize' >> beam.Map(lambda x: linopt(x, inventory))
 | 'output' >> beam.io.WriteToText('output'))

Now, the remaining inventory from the previous timestep is used in addition to materials received:

inventory from previous step:
[1340.0, 5.0, 140.0, 278895.0]
material received:
{'dye': 5000, 'labor': 770, 'water': 20210, 'concentrate': 5300}
total available:
[6340.0, 770, 20210, 284195.0]
optimal product mix:
[ 0. 17. 0. 65.]
inventory for next step:
[2070.0, 20.0, 410.0, 281620.0]
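The leftover arithmetic above can be verified with plain numpy, using the same ingredient matrix as linopt:

```python
import numpy as np

# ingredient requirements per unit of A, B, C, D (same matrix as in linopt)
A_ub = np.array([[50, 60, 100, 50],
                 [5, 25, 10, 5],
                 [300, 400, 800, 200],
                 [30, 75, 50, 20]])

total = np.array([6340.0, 770, 20210, 284195.0])  # total available this hour
qty = np.array([0, 17, 0, 65])                    # the optimal product mix

leftover = total - A_ub @ qty
# 2070 dye, 20 labor, 410 water, 281620 concentrate left over,
# matching the "inventory for next step" line above
print(leftover)
```

Note that only the dye and concentrate entries actually carry forward, since labor and water cannot be stored.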

The full code is on GitHub. Enjoy!




Lak Lakshmanan is an Operating Executive at a technology investment firm; articles are personal observations and not investment advice.