Python – Django reverse m2m queries

Django reverse m2m queries… here is a solution to the problem.

Django reverse m2m queries

Use > from model has been modified slightly:

from django.db import models

class Blog(models. Model):
    name = models. CharField(max_length=100)

class Author(models. Model):
    name = models. CharField(max_length=200)
    joined = models. DateField()

def __str__(self):
        return self.name

class Entry(models. Model):
    blog = models. ForeignKey(Blog, on_delete=models. CASCADE)
    headline = models. CharField(max_length=255)
    authors = models. ManyToManyField(Author)
    rating = models. IntegerField()

I want to create a dictionary from author to entry, the author joins this year, and the article has a rating of 4 or higher. The structure of the resulting dictionary should look like this:

author_entries = {author1: [set of entries], author2: [set of entries], etc.}

The database is accessed less than 3 times (or at least out of proportion to the number of authors or entries).

My first attempt (database hits == number of authors, 100 authors 100 database hits):

    res = {}
    authors = Author.objects.filter(joined__year=date.today().year)

for author in authors:
        res[author] = set(author.entry_set.filter(rating__gte=4))

Second attempt, try to read the entry once:

    res = {}
    authors = Author.objects.filter(joined__year=date.today().year)
    entries = Entry.objects.select_related().filter(rating__gte=4, authors__in=authors)

for author in authors:
        res[author] = {e for e in entries if e.authors.filter(pk=author.pk)}

This one is even worse, 100 authors, 198 db-hits (the original second attempt used {e for e in entries if author in e.authors}, but Django didn’t.

).

The only method I found involved raw-sql(4 db-hits):

    res = {}
    _authors = Author.objects.filter(joined__year=date.today().year)
    _entries = Entry.objects.select_related().filter(rating__gte=4, authors__in=_authors)
    authors = {a.id: a for a in _authors}
    entries = {e.id: e for e in _entries}
    c = connection.cursor()
    c.execute("""
        select entry_id, author_id 
        from sampleapp_entry_authors
        where author_id in (%s)
    """ % ','.join(str(v) for v in authors.keys()))

res = {a: set() for a in _authors}
    for eid, aid in c.fetchall():
        if eid in entries:
            res[authors[aid]].add(entries[eid])

(Sorry for the string substitution used in the c.execute(..) call – I can’t find the syntax required for the where in? call sqlite ).)

Is there a more Djangoesque way to do this?

I’ve created a git repository with the code I’m using ( https://github.com/thebjorn/revm2m ). https://github.com/thebjorn/revm2m/blob/master/revm2m/sampleapp/tests.py Medium

Solution

You can use Prefetch-object [Django-doc] for this:

from django.db.models import <b>Prefetch</b>

good_ratings = <b>Prefetch</b>(
    'entry_set',
    queryset=Entry.objects.filter(rating__gte=4),
    to_attr='good_ratings'
)

authors = Author.objects.filter(
    joined__year=date.today().year
).prefetch_related(
    good_ratings
)

The Author object in authors will now have an additional property good_ratings(to_attr value Prefetch object) is a preloaded QuerySet. It contains Entries with a rating greater than or equal to 4.

So you can post-process these like this:

res = {
    author: set(author.good_ratings)
    for author in authors
}

Although because the Author object (from this QuerySet, not the generic one) already carries properties, it may not be of much use.

Related Problems and Solutions