Decoupling Your Business Logic from the Django ORM
Where should I keep my business logic? This is a perennial topic in Django.
We can imagine a continuum of cases, with increasing complexity: I begin with my logic just in a view; as that grows I maybe move it into a form; from there into custom model or manager methods; finally I move my logic into some kind of service layer, independent of the ORM.
This was a flow we discussed on The Stack Report in Growing your business logic with Django. I argued there that:
The challenge is not, not really, where we put our logic, but rather how do we evolve that as our application grows? How do we make sure we can scale our code into the medium term?
Of the flow — from view to service layer — it’s the custom model and manager methods to service layer step that draws the most attention. There’s plenty to say about the earlier stages, where we’re exploring what our code is going to look like, but it’s easy to see that “just throw it in the view” is not a long-term equilibrium position for a growing app.
There was a recent Django Forum thread on the effort of switching between model and manager methods and a service layer type approach. My own take there was that it’s not so much about the switching effort as keeping your code clean as you go:
The key point that I focus on is that there should only be one key pathway where particular business rules are enforced. “When X happens, Y must also happen”. The big issue is not so much where you put your code, but whether you’re having to update multiple locations whenever X or Y change.
The difficulties we have are when changes start cascading out, because we’ve not ensured that logic is suitably encapsulated in a single place. If we have that discipline, adopting a higher abstraction level is generally OK; where we haven’t, it’s not.
I was pleased in that discussion to get to link again to one of my all time favourite posts on this topic: Django models, encapsulation and data integrity. That’s very much worth your time, and probably as clear a statement of Django’s Fat Models, Thin Views approach as you could care to read.
Django’s design philosophies recommend models follow what’s called the Active Record pattern. The model encapsulates access to a single row in the database table, and we add our domain logic on top of that in custom methods and properties. Model managers act as a Table Data Gateway for table-level logic and, together with custom QuerySets and lookups, enable a rich and expressive domain-specific API for, probably, the vast majority of applications.
When folks say why they love Django, it’s this aspect of the ORM that is most often cited.
§
The problem comes (again) when you start to get too much going on in one place.
We begin with a simple model for a Bookmark. We add a __str__ implementation, so that it displays nicely. Then we add some helpers: related_bookmarks(), get_summary(), etc. Manager methods for common queries. Properties for template convenience: a list_display, a detail_display, a preview. And so on.
The details aren’t so important. Rather, it’s that there’s a lot going on here, with various concerns in play. It becomes hard to see what’s what. Logic from one concern gets accidentally reused for a separate concern. We don’t want to repeat ourselves, after all, and we’re starting to feel uncomfortable somehow. But then those two concerns are coupled, and editing becomes difficult. Our tests are tied to the database, which is fine, but they maybe start to feel slow. We maybe find our methods querying over related objects — a bookmark’s parent user, say — and dealing with select or prefetch related calls that take some management. There’s maybe a nagging feeling that our domain logic isn’t mapping one-to-one down to database tables anymore. And all the time, the very complex job of handling the persistence down to the database — which the ORM does very well, no problem with that — gets increasingly obscured by the many other things that are happening.
If you’ve grown a Django application to any moderate size, I’m sure you’re familiar with the feeling here.
We might use Django’s proxy models to split domain logic by concern, and I think there’s good mileage in that, but ultimately that buys us only some additional space. At some point we need to tease our business logic apart from the ORM — we want to decouple what’s our application’s concern from the mapping to the database. Both of these are significant concerns, with significant complexity. As we scale it becomes essential to keep them apart.
I want to talk today about how I handle this point in my own applications. I hope to show that, yes, there’s no great barrier to moving logic from our ORM models, and, as such, reaffirm that it’s not something we need to worry about handling prematurely. It’s a step that’s readily available when we need it.
§
Let’s flesh out our Bookmark model:
from django.contrib.auth.models import User
from django.db import models


class Bookmark(models.Model):
    url = models.URLField(unique=True)
    title = models.CharField(max_length=255)
    note = models.TextField(blank=True)
    favourite = models.BooleanField(default=False)
    user = models.ForeignKey(
        User,
        related_name="bookmarks",
        on_delete=models.CASCADE,
    )
It’s got a url, a title, a note field, whether it’s a favourite or not, and a foreign key to our User model for the parent user.
It’s here that we add our helpers. And again, just to repeat, to begin with all is rosy. A single method or property, wonderful. A couple, still great. Even a few, no problem. But as our application grows, we reach a point where there’s just too much going on here. I’m not going to show it, but I want you to imagine it. You’ve seen it in your own projects, in your own models.
Beyond everything else, it’s probably this tendency for models to grow arbitrary logic from often overlapping, and competing, concerns that gives the complaints from folks who aren’t fans of the Django ORM the most teeth. Complexity management is the eternal challenge — regardless of your framework. This is just how and where it most commonly shows its face in Django.
Whilst we’re here it’s worth pausing to look at two related points about the ORM that draw criticism.
One is the default unrestricted nature of field fetches when querying. Say I query for a bookmark:
>>> b = Bookmark.objects.get(user__username='carlton')
The ORM will fetch the data for every field on the Bookmark model, regardless of whether I want to use it or not. The generated SQL will look like this:
SELECT "example_bookmark"."id",
       "example_bookmark"."url",
       "example_bookmark"."title",
       "example_bookmark"."note",
       "example_bookmark"."favourite",
       "example_bookmark"."user_id"
FROM "example_bookmark"
...
The ... there is for the inner join the ORM will add to filter by the related username. (Nothing wrong with that.)
The point is that, for a list view, say, I may want only the id to generate the URL, plus the title and whether it’s a favourite or not for the preview. The other fields are, in this case, just dead weight. We might imagine the note field being quite heavyweight. That it’s fetched from the database, sent over the wire, loaded into memory, possibly numerous times in a list view, just to be thrown away is overhead we could do without. The ORM lets you control this with only() and defer(), but these are tools which are often not used at all, and which can be fragile and difficult to use correctly even when they are.
The second complaint then concerns lazy related lookups. If I have my bookmark, I can straightforwardly access the related user:
>>> b = Bookmark.objects.get(user__username='carlton')
>>> b.user
<User: carlton>
This access causes the ORM to fetch data for the related user:
SELECT "auth_user"."id",
       "auth_user"."password",
       "auth_user"."last_login",
       "auth_user"."is_superuser",
       "auth_user"."username",
       "auth_user"."first_name",
       "auth_user"."last_name",
       "auth_user"."email",
       "auth_user"."is_staff",
       "auth_user"."is_active",
       "auth_user"."date_joined"
FROM "auth_user"
WHERE "auth_user"."id" = 1
LIMIT 21
Again, we’ve fetched all the fields here, when all I might be interested in is, say, the username. And, as convenient as it is, the fact that you can trigger an SQL query from a mere attribute access makes it hard to control when you’re hitting the database, which is the most performance-sensitive part of pretty much any web application.
In a list view we’ll often iterate all our bookmarks with the equivalent of this:
for bookmark in Bookmark.objects.all():
    print(bookmark.user.username)
Be it in a template, generating HTML, or a form, generating model choices, or wherever, this structure means we make one query for our bookmarks followed by one extra query per bookmark for the related user. This is the so-called N+1 query problem, and avoiding it is Django optimisation 101, but the ORM’s lazy related fetches make it an easy problem to fall into.
Now, again, the ORM gives us tools to deal with this, in select and prefetch related, and from Django 6.1 QuerySets will be able to specify a fetch mode to automatically fetch all peers when a related field is accessed, say, but these tools are still opt-in — they require careful use and can be fiddly to keep updated as your models evolve.
I’m largely assuming that this is familiar ground to you. We’ve got a model that’s growing more logic than we feel comfortable with — that’s the main problem — and then we’ve got these default performance sensitive behaviours of the ORM that can be awkward to deal with, and can (and do) trip us up as we go.
§
What we want is a separate class with (no more than) the exact data it needs for the task it has to deal with. For our list view, something like this:
from attrs import define


@define
class BookmarkData:
    id: int
    title: str
    favourite: bool
This is just an attrs class. Plain Python, fully typed, no Django in sight. (Again, I’m not going to show them, but we’d add our domain logic methods — those just for the list view — here.)
In order to map between our Bookmark model and our BookmarkData class, I want to show you Django Mantle.
Mantle:
- Allows you to move your business logic into type-safe Python classes, decoupled from the Django ORM.
- Provides automatic generation (with declarative overrides) of efficient ORM queries, including limited field fetches (only()/defer()) and prefetching of related objects, avoiding N+1 query problems.
- Uses a modern and performant approach to serialisation and validation.
- Provides a progressive API, with a minimal surface area by default, and depth when needed.
It’s your type-safe layer around Django’s liquid core.
Mantle looks like this. We instantiate a Query object with a Django QuerySet and an attrs shape that we want our data mapped to. We can then fetch a fully typed list of data:
from mantle import Query
# Query takes a Django QuerySet and a "shape class"
# to map the data to.
query = Query(Bookmark.objects.all(), BookmarkData)
# Fetch all
bookmarks = query.all()
# reveal_type(bookmarks) -> list[BookmarkData]
Same again if I want to fetch just one instance:
query = Query(
    Bookmark.objects.filter(user__username="carlton"),
    BookmarkData,
)
bookmark = query.get()
# reveal_type(bookmark) -> BookmarkData
In each case, the generated ORM query selects exactly the fields required by the BookmarkData class, and no more — there’s no unrestricted fetch of fields that won’t be used.
§
The same applies to nested data:
from attrs import define


@define
class UserData:
    username: str


@define
class BookmarkData:
    id: int
    title: str
    favourite: bool
    user: UserData  # Added the UserData here
Here, we want to add the username for the related user, so we define the nested shape class and add it to our BookmarkData.
When we Query for the data, the user.username is populated as we need:
query = Query(
    Bookmark.objects.filter(user__username="carlton"),
    BookmarkData,  # with nested user
)
bookmark: BookmarkData = query.get()
print(bookmark.user.username) # -> carlton
And the generated ORM query automatically prefetches the related data, so there’s a minimum of lookups. Even if we’re iterating over lists of objects, there are no lazy related object fetches, and so no N+1 query problems.
§
I’ll give you one more example. Under the hood, Mantle uses django-readers for interacting with the ORM. (I highly recommend checking out django-readers. It’s well worth your time!)
In order to do its work, readers uses what it calls a spec, which is a list of 2-tuples with what you need to do to prepare the QuerySet — include a field say — and how you fetch the value you need from the returned data. (In the simplest case, where you just want to include a field, and select its value, you can just use the field name instead of the 2-tuple.)
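For a flavour of the format, a spec for our list view might look like this. Plain fields appear by name, and a related object becomes a dict with its own nested spec (roughly what Mantle would generate from a BookmarkData with a nested UserData):

```python
# A django-readers spec: field names, with a dict for a relationship.
spec = [
    "id",
    "title",
    "favourite",
    {"user": ["username"]},  # nested spec for the related user
]
```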
What Mantle does is prepare the readers spec from your attrs class, prepare the queryset, get the data, and then map it back to attrs instances, using cattrs.
The example here just shows customising the readers spec generation. It’s the progressive API: in simple cases you don’t need it, but it opens up if you do:
from attrs import define
from django.db.models import F
from django_readers import producers, qs
from mantle import overrides, Query


@overrides({
    "username": (
        qs.annotate(username=F("user__username")),
        producers.attr("username"),
    )
})
@define
class BookmarkData:
    id: int
    title: str
    favourite: bool
    username: str
In the example here, say we don’t want the nested user data, but instead want to bring the username up onto the BookmarkData. We can provide an override to the spec generation that says, annotate the QuerySet with the username value, and then select that to include in our return data.
Then if we fetch our data again, you see the username is available directly as a top level attribute:
query = Query(
    Bookmark.objects.filter(user__username="carlton"),
    BookmarkData,  # annotated version
)
bookmark: BookmarkData = query.get()
print(bookmark.username) # -> carlton
There’s lots you can do with overrides. The Mantle docs have a section on Customising query generation, but annotation like this is maybe the 90% use-case.
§
Let me now show you writes, with validation.
If we have incoming data, then, following the “parse, don’t validate” approach, we structure it into attrs classes using cattrs.
That tells us that our data has the right shape: the right fields, the right data types, and so on.
But we might need an extra layer of domain validation on top of that: rules which go beyond what’s purely internal to the data.
In Django it’s common to have unique or unique together constraints for some field or other.
Here we define a bookmark_validator that ensures that the url field for our bookmark is unique:
from mantle import (
    compose_validators,
    create,
    update,
    unique_field,
)

# Optional domain validation for create/update.
bookmark_validator = compose_validators(
    unique_field("url"),
)
With that in place, we can use our BookmarkData to create a new Django bookmark instance, checking the URL uniqueness constraint beforehand:
created = create(
    Bookmark,
    BookmarkData(
        url="https://noumenal.es/mantle/",
        title="Mantle Docs",
        favourite=False,
    ),
    validator=bookmark_validator,
)
# reveal_type(created) -> Bookmark
In view code, you’d catch an error there, and report that to the user.
Same with updates. We realise we should have made the Mantle docs a favourite, so we take our existing instance and update it accordingly:
updated = update(
    created,
    BookmarkData(
        url="https://noumenal.es/mantle/",
        title="Mantle Docs",
        favourite=True,
    ),
    validator=bookmark_validator,
)
# reveal_type(updated) -> Bookmark
It works how you think it should.
§
Finally then, if you’re using Django REST Framework, there’s an integration there via django-mantle-drf.
Mantle DRF provides versions of DRF’s generic views, mixins, and viewsets, replacing DRF serialisers with the mantle flow.
Here we define a BookmarkDetail, which is a RetrieveAPIView, which we imported from mantle_drf.generics instead of rest_framework.generics:
from drf_spectacular.utils import extend_schema
from mantle_drf.generics import RetrieveAPIView

from .models import Bookmark
from .shapes import BookmarkShape


class BookmarkDetail(RetrieveAPIView):
    queryset = Bookmark.objects.all()
    shape_class = BookmarkShape

    @extend_schema(responses=BookmarkShape)
    def get(self, request, *args, **kwargs):
        return self.retrieve(request, *args, **kwargs)
We define our shape_class which is our attrs class for our data shape, and we’re done.
You’ll see the extend_schema decorator from drf-spectacular there, because there’s integration with that for OpenAPI schema generation. Mantle DRF means you can begin working with Mantle in your existing DRF application.
§
And that’s about it as a quick(ish) overview.
What Mantle gives you is a declarative way of teasing apart your business logic from the ORM, case by case. For our Bookmark we had multiple roles that the one model had to play. For each of those roles we define a separate shape class, and pull only the logic that’s relevant into it. With less to see, it’s easier to reason about. Free from the ORM, it’s easier to test. And being built around an attrs class, it’s fully typed from the start.
There’s lots more to say and do here, but Mantle gives you those few next steps for when your app is becoming that little bit more complex. I’m having great fun with it. Do have a play. And let me know how you get on.