How to filter the latest m2m objects using QuerySet in Django
I have the following model:
class Customer(SomeInheritedModel):
name = models. CharField(max_length=50)
...
class Account(SomeInheritedModel):
customer = models. ForeignKey(Customer, related_name='accounts')
...
class Product(SomeInheritedModel):
name = models. CharField(max_length=50)
...
class License(SomeInheritedModel):
account = models. ForeignKey(Account)
product = models. ForeignKey(Product)
maintenance = models. ManyToManyField('Maintenance', related_name="maintenances")
class Maintenance(SomeInheritedModel):
start_date = models. DateTimeField(null=True)
expiration_date = models. DateTimeField(null=True)
After the license maintenance is updated, a new Maintenance
object is created. In this way, I can trace all the maintenance
owned by a particular license
.
Now I want to generate a report that shows me all the customers
whose licenses
are expiring based on their Maintenance
expiration date. I only want the latest Maintenance
object owned by the License
because it was newly sold. I don’t want anyone else.
I
know I can achieve this with a QuerySet and a for loop, but it’s a bit expensive for the server because there are so many entries.
Is there a way to filter via QuerySet? Like this:
Customer.objects.filter(accounts__licenses__maintenances__expiry_date__last__range=(now().date(), one_month_into_future().date()))
I
know I can use __last
in some cases, but if I have to specify something afterward, that won’t work.
Edit
I found the answer with @hynekcer’s advice. You can use annotations
.
License.objects.filter(foo=True)
.annotate(max_exp_date=models. Max('maintenances__expiration_date'))\
.filter(max_exp_date__gte=report.start_date, max_exp_date__lte=report.end_date)\
.select_related('account__customer')
Solution
tl; dr )
The important answer at the moment is EDIT at the end.
Yes, it is important to ignore all old related objects (maintenance), even those that are still valid, as there may be updated maintenance.
I think you simplified your real model too much to the point that it didn’t work well.
You have two chained ManyToMany relationships with anonymous (non-explicit) relational tables. This makes it impossible to write the correct set of queries.
Error:
1) You use the same name (“licenses” and “maintenances”) for the field and its related_name
. This is nonsense because: docs
related_name
The name to use for the relation from the related object back to this one. It’s also the default value for related_query_name (the name to use for the reverse filter name from the target model).
It is useless to see a reverse query set maintenances
on the object Maintenance
. Similarly, the query set “License” regarding customer licensing. You can easily rename the related_name because it does not change the database and does not cause migration.
2) Is the License
a public object or a separate object? If it is personal, then it does not need a many-to-many
relationship with the Customer object. If it’s common, then you can’t track paid maintenance to individual customers through it. (You also don’t mean that two customers are co-owners of one license!) Is it? You may be referring to a common licensedProduct
and a connected individual licensee
to use the product. I know users can buy one maintenance for more licenses, and many-to-many is fine here.
First I fixed the model (somehow I guessed before I asked you).
class Customer(SomeInheritedModel):
# "licenses" is the reverse query to License
# optionally you can enable many-to-many relation to licensed products
# lic_products = models. ManyToManyField(
class Product(models. Model):
pass # licensed product details
class License(SomeInheritedModel):
customer = models. ForeignKey(Customer, related_name='licenses')
product = models. ForeignKey(Product, related_name='+') # used '+' because not deeded
maintenances = models. ManyToManyField(
Maintenance,
through='LicenseMaintenance',
through_fields=('license', 'maintenance'),
related_name='licenses')
class Maintenance(SomeInheritedModel):
start_date = DateTimeField(null=True)
expiration_date = DateTimeField(null=True)
class LicenseMaintenance(models. Model):
license = models. ForeignKey(License, on_delete=models. CASCADE)
maintenance = models. ForeignKey(Maintenance, on_delete=models. CASCADE)
Querysets 🙁 can be simplified a lot by removing order_by and related fields).
remind_start = datetime.datetime.now(tz=TIMEZONE)
remind_end = remind_start + datetime.timedelta(days=30)
expiring_lic_maintenances = (
LicenseMaintenance.objects.values('license',
'license__customer',
'license__customer__name')
.annotate(max_exp_date=models. Max('maintenance__expiration_date'))
.filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
.order_by('license__customer__name', 'license__customer', 'license')
) # some small detail can be used like e.g. customer name in the example, not used later
expiring_licenses = (
License.objects.filter(
license__in=expiring_lic_maintenances.values_list('license', flat=True))
.select_related('customer', 'product')
.order_by('license__customer__name', 'license__customer', 'license')
) # that queryset with subquery is executed by one SQL command
No
more than two SQL requests are executed by running these query sets:
# but I prefer a simple map and queryset with subquery:
expiration_map = {x.license_id: x.max_exp_date for x in expiring_lic_maintenances}
for lic in expiring_licenses:
print("{name}, your maintenance for {lic_name} is expiring on {exp_date}".format(
name=lic.customer.name,
lic_name=lic.product.name,
exp_date=expiration_map[lic.id],
))
I hope this is a new project and you don’t need to migrate for the modified model yet. I’ve written similar code so many times that I’m not verifying it now. Errors may occur and you can easily notify me enough time before the bounty ends.
Edit after question:
Aggregate functions work fine on many-to-many fields that do not explicitly join table models in the current version of Django:
>>> expiring = (
... License.objects.values('id',
... 'account__customer',
... 'account__customer__name')
... . annotate(max_exp_date=models. Max('maintenance__expiration_date'))
... . filter(max_exp_date__gte=remind_start, max_exp_date__lte=remind_end)
... )
and look at the compiled SQL:
>>> str(expiring.query)
SELECT app_license.id, app_account.customer_id, app_customer.name, MAX(app_maintenance.expiration_date) AS max_exp_date
FROM app_license INNER JOIN app_account ON (app_license.account_id = app_account.id)
INNER JOIN app_customer ON (app_account.customer_id = app_customer.id)
LEFT OUTER JOIN app_license_maintenance ON (app_license.id = app_license_maintenance.license_id)
LEFT OUTER JOIN app_maintenance ON (app_license_maintenance.maintenance_id = app_maintenance.id)
GROUP BY app_license.id, app_account.customer_id, app_customer.name
HAVING (MAX(app_maintenance.expiration_date) >= 2017-04-07T13:45:35.485755 AND
MAX(app_maintenance.expiration_date) <= 2017-03-08T13:45:35.485755
)
Usually this is compiled by two external connections.
If you find a more complex case where it doesn’t work or the query is slow because for some database engines tuning with outer joins is more complex, you can get the implicit model every time and run queries on it because it’s the top-level model in the relational hierarchy:
We can explore the implicit intermediate model of tables:
>>> License.maintenance.through
app.models.License_maintenance
>>> LicenseMaintenance = License.maintenance.through
>>> LicenseMaintenance._meta.fields
(<django.db.models.fields.AutoField: id>,
<django.db.models.fields.related.ForeignKey: license>,
<django.db.models.fields.related.ForeignKey: maintenance>)
And use it :(all connections are automatically compiled as internal joins).
>>> expiring = (
... LicenseMaintenance.objects.values('license',
... 'license__account__customer',
... 'license__account__customer__name')
... . annotate(max_exp_date=models. Max('maintenance__expiration_date'))
... . filter(max_exp_date__lte=remind_start, max_exp_date__gte=remind_end)
... )