I've got three related models: Process, Factor and Level. A Process has a many-to-many relationship with Factors and a Factor will have one or more Levels. I'm attempting to calculate all combinations of Levels involved with a Process. This is straightforward to implement with Python's itertools as a model method, but is somewhat slow in execution speed, so I'm trying to figure out how to use the Django ORM to perform this calculation in SQL.
Models:
class Process(models.Model):
factors = models.ManyToManyField(Factor, blank = True)
class Factor(models.Model):
...
class Level(models.Model):
factor = models.ForeignKey(Factor, on_delete=models.CASCADE)
Example: A process 'Running' involves three Factors ('Distance', 'Climb', 'Surface') each composed of a number of Levels ('Long'/'Short', 'Flat'/'Hilly', 'Road'/'Mixed'/'Trail'). Calculating the combinations in SQL would involve building a query by first determing how many Factors were involved (3 in this example) and performing a CROSS JOIN of all the levels that many times.
In SQL, this could be accomplished as such:
WITH foo AS
(SELECT * FROM Level
WHERE Level.factor_id IN
(SELECT ProcessFactors.factor_id FROM ProcessFactors WHERE process_id = 1)
)
SELECT a1.*, a2.*, a3.*
FROM foo a1
CROSS JOIN foo a2
CROSS JOIN foo a3
WHERE (a1.factor_id < a2.factor_id) AND (a2.factor_id < a3.factor_id)
Result:
a1.name | a2.name | a3.name
--------------------------
Long | Flat | Road
Long | Flat | Mixed
Long | Flat | Trail
Long | Hilly | Road
Long | Hilly | Mixed
Long | Hilly | Trail
Short | Flat | Road
Short | Flat | Mixed
Short | Flat | Trail
Short | Hilly | Road
Short | Hilly | Mixed
Short | Hilly | Trail
Currently, I have this implemented as a method on the Process model as:
def level_combinations(self):
levels = []
for factor in self.factors.all():
levels.append(Level.objects.filter(factor = factor))
combinations = []
for levels in itertools.product(*levels):
combination = {}
combination["levels"] = levels
combinations.append(combination)
return combinations
Is this possible using the Django ORM or is it complex enough that it should be implemented as a raw query to improve speed over the Python code implementation?
There was a similiar question about performing CROSS JOIN in Django ORM from several years ago (approx. Django v1.3 it looks like) didn't attract appear to attract much attention (the author punted to just using Python itertools).
from itertools import groupby, product
def level_combinations(self):
# We need order by factor_id for proper grouping
levels = Level.objects.filter(factor__process=self).order_by('factor_id')
# [{'name': 'Long', 'factor_id': 1, ...},
# {'name': 'Short', 'factor_id': 1, ...},
# {'name': 'Flat', 'factor_id': 2, ...},
# {'name': 'Hilly', 'factor_id': 2, ...}]
groups = [list(group) for _, group in groupby(levels, lambda l: l.factor_id)]
# [[{'name': 'Long', 'factor_id': 1, ...},
# {'name': 'Short', 'factor_id': 1, ...}],
# [{'name': 'Flat', 'factor_id': 2, ...},
# {'name': 'Hilly', 'factor_id': 2, ...}]]
# Note: don't forget, that product is iterator/generator, not list
return product(*groups)
If order doesn't matter, then:
def level_combinations(self):
levels = Level.objects.filter(factor__process=self)
groups = {}
for level in levels:
groups.setdefault(level.factor_id, []).append(level)
return product(*groups.values())
A few years late, this workaround does not actually use a CROSS JOIN, but it does produce the desired result in a single query.
Step 1: add a cross field to your Factor model
class Factor(models.Model):
cross = models.ForeignKey(
to='self', on_delete=models.CASCADE, null=True, blank=True)
...
Step 2: link 'Climb' to 'Surface', and link 'Distance' to 'Climb', using the new Factor.cross field
Step 3: query as follows
Level.objects.filter(factor__name='Distance').values_list(
'name', 'factor__cross__level__name', 'factor__cross__cross__level__name')
Result:
('Long', 'Flat', 'Road')
('Long', 'Flat', 'Mixed')
('Long', 'Flat', 'Trail')
('Long', 'Hilly', 'Road')
('Long', 'Hilly', 'Mixed')
('Long', 'Hilly', 'Trail')
('Short', 'Flat', 'Road')
('Short', 'Flat', 'Mixed')
('Short', 'Flat', 'Trail')
('Short', 'Hilly', 'Road')
('Short', 'Hilly', 'Mixed')
('Short', 'Hilly', 'Trail')
This is a simplified example. To make it more generic, instead of adding the Factor.cross field, you could add a new CrossedFactors model with two foreignkeys to Factor. This model can then be used to define various experimental designs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With