Buttondown is an application for writing, sending, and growing newsletters, though that part isn't quite applicable to this essay! The more important part: Buttondown has a Python & Django backend of around 38,000 lines and has existed for about four years, so it has a good-but-not-excessive amount of cruft and abstractions that I wish I could take back.
The short answer: python : mypy :: JavaScript : TypeScript. [^1] The longer answer, as provided by the official mypy docs:
Mypy is an optional static type checker for Python that aims to combine the benefits of dynamic (or "duck") typing and static typing. Mypy combines the expressive power and convenience of Python with a powerful type system and compile-time type checking. Mypy type checks standard Python programs; run them using any Python VM with basically no runtime overhead.
I've actually been using something that's, like, mypy-adjacent for a while now. PyCharm was for a long time my Python IDE of choice [^2] and it had really strong support for type hints, meaning that a declaration like:
def count_words(input: str) -> int:
    return len(input.split())

would be enough to provide PyCharm with information for callers of count_words, such that I would get a very angry red squiggle if I tried to write something like:

arbitrary_string = "hello world"
if "hello" in count_words(arbitrary_string):
    print("Greetings found!")
This was net useful in and of itself, and even if you have no plans to integrate with mypy I would recommend getting in the habit of using type hints! (Much electronic ink has been spilled about the niceties of writing type signatures as an exercise in thinking more deeply about your interfaces & contracts. I won't rehash those arguments, but rest assured I agree with them.) However, that was the depth of my investment. Back when I did this in ~2019 or so, I looked into actually providing a typecheck step for the Python codebase and was stymied by lack of third-party "stub" support. [^3] That was around eighteen months or so ago, and things have improved significantly since then! On a lark, I decided to pick back up the branch (aptly named mypy-world-domination) and saw that both django-stubs and boto3-stubs have progressed significantly, to the point where the majority of issues flagged by mypy were not "hey, I have no idea what django-rest-framework is" but "hey, you're not handling this Optional correctly."
After some configuration futzing, I was greeted with a veritable wall of mypy errors:

$ poetry run invoke typecheck
utils/functional.py:58: error: Incompatible return value type (got "Dict[Any, Any]", expected "Stream")
markdown_rendering/extensions/newtab.py:28: error: "handleMatch" undefined in superclass
emails/views/utils.py:99: error: "BulkActionView" has no attribute "model"
# ...truncated for concision...
api/tests/test_emails.py:31: error: "Client" has no attribute "credentials"
api/tests/test_emails.py:34: error: "HttpResponse" has no attribute "data"
Found 822 errors in 281 files (checked 887 source files)

822 instances of mypy telling me something was amiss. It was time to get to work!
Optional unwrapping
The first stumbling block is perhaps the most stereotypical: dealing with optionals. My code now has a lot of the following pattern:
def cancel_premium_subscription(subscriber: Subscriber) -> None:
    stripe_subscription: Optional[StripeSubscription] = subscriber.stripe_subscription
    if not stripe_subscription:
        return
    stripe_subscription.cancel()
It's fair to say, too, that this is perhaps working as intended. Nulls are the worst thing or whatever that quote is, and I think my codebase suffers from "None-as-control-flow" syndrome a good deal. Part of this has been ameliorated by borrowing some concepts like Result types from other, more mature static environments, but I would love to see a bit more syntactic sugar, not unlike the constructs offered by TypeScript or Swift:
const person = {
  name: "Jane",
  pets: [{
    type: "dog",
    name: "Spike"
  }, {
    type: "cat",
    name: "Jet",
    breed: {
      type: "calico",
      confidence: 0.8
    },
  }]
};
// Returns [undefined, "calico"]
console.log(person.pets.map(pet => pet.breed?.type));
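Python has no ?. operator, so the closest equivalent today is a small helper. What follows is only a minimal sketch: map_optional is a hypothetical name (it is not part of typing or mypy), and the Pet and Breed dataclasses simply mirror the JavaScript object above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_optional(value: Optional[T], fn: Callable[[T], U]) -> Optional[U]:
    """Apply fn only when value is not None (hypothetical helper)."""
    return None if value is None else fn(value)


@dataclass
class Breed:
    type: str
    confidence: float


@dataclass
class Pet:
    type: str
    name: str
    breed: Optional[Breed] = None


pets: List[Pet] = [
    Pet(type="dog", name="Spike"),
    Pet(type="cat", name="Jet", breed=Breed(type="calico", confidence=0.8)),
]

# Rough equivalent of person.pets.map(pet => pet.breed?.type)
breed_types: List[Optional[str]] = [
    map_optional(pet.breed, lambda breed: breed.type) for pet in pets
]
print(breed_types)  # [None, 'calico']
```

The helper keeps the None check in one place, and the declared return type still forces callers to handle the Optional result.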
Mutable global payloads
In what is probably one of many regrettable architectural decisions, I rely on Django middlewares to handle a lot of things that happen within the lifespan of an API request. This might look like this:

class CustomDomainRoutingMiddleware(MiddlewareMixin):
    def process_request(self, request: HttpRequest) -> None:
        custom_domain: Optional[str] = extract_custom_domain(request)
        if custom_domain:
            newsletter = Newsletter.objects.get(domain=custom_domain)
            request.newsletter_for_subdomain = newsletter
This approach is valid Python, and mostly recommended within Django documentation, but mypy is not a fan for two reasons:
1. The middleware assigns to an HttpRequest, which has no concept of a newsletter_for_subdomain attribute.
2. Downstream code still receives a plain HttpRequest object; any subsequent access of request.newsletter_for_subdomain will also raise warning signs.
My suspicion is that the right approach here is to declare an omnibus HttpRequest subclass with all potential global payloads:
class ButtondownRequest(HttpRequest):
    newsletter_for_subdomain: Optional[Newsletter]
    # ... and so on
But when I go through such a process, I run into lots of violations of the Liskov substitution principle. Of the three stumbling blocks listed, this is the one that I would bet has the most obvious (or at least “obvious in retrospect”) solution. One of the trickinesses of migrating to mypy in 2022 is that, while it's easier and more worthwhile than it was in 2020, documentation & war stories are still somewhat scarce.
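For what it's worth, here is a minimal sketch of how the subclass approach might look. It uses stand-in classes rather than Django's so that it runs on its own: HttpRequest and Newsletter below are simplified placeholders, and PRIMARY_HOST, process_request, and newsletter_domain are hypothetical names, not Buttondown code. The cast is a promise to the type checker and not a runtime conversion, which is part of why this pattern feels uneasy.

```python
from dataclasses import dataclass
from typing import Optional, cast

PRIMARY_HOST = "app.example.com"  # hypothetical primary host, for illustration


class HttpRequest:
    """Stand-in for django.http.HttpRequest so the sketch runs without Django."""

    def __init__(self, host: str) -> None:
        self.host = host


@dataclass
class Newsletter:
    """Stand-in for the Newsletter model."""

    domain: str


class ButtondownRequest(HttpRequest):
    # Class-level default, so reads are safe even if no middleware set it.
    newsletter_for_subdomain: Optional[Newsletter] = None


def process_request(request: HttpRequest) -> None:
    """Middleware-style hook: attach the newsletter for a custom domain."""
    # cast() only informs the type checker; the object is unchanged at runtime.
    typed_request = cast(ButtondownRequest, request)
    if request.host != PRIMARY_HOST:
        typed_request.newsletter_for_subdomain = Newsletter(domain=request.host)


def newsletter_domain(request: ButtondownRequest) -> Optional[str]:
    """Downstream code declares the richer request type and must handle None."""
    newsletter = request.newsletter_for_subdomain
    return newsletter.domain if newsletter else None
```

Views that declare ButtondownRequest as their parameter type get the attribute checked, at the cost of trusting that the middleware actually ran.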
Type refinement
Django (and thus Buttondown) expresses nullable foreign key relationships as optionals. For example, I have a Subscriber model that represents a single email address subscribed to a newsletter. A simplified version of this model looks something like the following:
class Subscriber(models.Model):
    email_address = EmailField()
    creation_date = DateTimeField()
    import_date = DateTimeField(null=True)  # blank if was not imported

    # Every subscriber corresponds to a single newsletter
    newsletter = ForeignKey(Newsletter)

    # Premium subscribers also exist in Stripe
    stripe_subscription = ForeignKey(StripeSubscription, null=True)
This is all django-stubs and mypy need to get a pretty useful understanding of what a Subscriber entails; it's got a handful of non-optional fields (such as email_address and creation_date and newsletter) and some optional fields (import_date and stripe_subscription). The tricky part here is when you want to express an invariant upon all Subscribers. Let's say I have a cron that filters through all premium subscribers and checks to make sure the backing Stripe subscription isn't cancelled:
def check_premium_subscriptions() -> Iterable[Subscriber]:
    premium_subscribers = Subscriber.objects.exclude(stripe_subscription=None)
    for subscriber in premium_subscribers:
        if subscriber.stripe_subscription.status == 'cancelled':
            yield subscriber
Sadly, mypy is not a huge fan of this: subscriber.stripe_subscription is an Optional[StripeSubscription] and calling .status on it is therefore dangerous. You could, I think, persuasively argue that this is solved with something like a Result type (there's a very interesting Pythonic one here). A more elegant solution, though, and one that closely maps onto TypeScript's approach to nuanced type refinement, would be being able to declare a version of Subscriber that has a StripeSubscription. This issue in and of itself is still interesting, though, because it suggests a better way to structure this cron and avoid the refinement entirely by iterating on the subscription rather than the subscriber:
def check_premium_subscriptions() -> Iterable[Subscriber]:
    subscriptions = StripeSubscription.objects.filter(status='cancelled')
    for subscription in subscriptions:
        if subscription.subscriber:
            yield subscription.subscriber
This kind of forced re-examination of cross-object relationships was a very useful byproduct of driving down my mypy errors, in much the same way that the act of expressing type signatures forces you to think a little more deeply about the contracts & interfaces you're reifying.
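When restructuring the query isn't an option, one way to approximate the refinement in today's Python is to hand back the already-narrowed value alongside the object. This is only a sketch under stated assumptions: the dataclasses are stand-ins for the Django models, and premium_pairs and cancelled_premium_subscribers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class StripeSubscription:
    status: str


@dataclass
class Subscriber:
    email_address: str
    stripe_subscription: Optional[StripeSubscription] = None


def premium_pairs(
    subscribers: Iterable[Subscriber],
) -> Iterator[Tuple[Subscriber, StripeSubscription]]:
    """Yield each premium subscriber with its non-optional subscription."""
    for subscriber in subscribers:
        subscription = subscriber.stripe_subscription
        if subscription is not None:
            # mypy narrows subscription to StripeSubscription in this branch.
            yield subscriber, subscription


def cancelled_premium_subscribers(
    subscribers: Iterable[Subscriber],
) -> List[Subscriber]:
    """Callers never touch the Optional; the None check lives in one place."""
    return [
        subscriber
        for subscriber, subscription in premium_pairs(subscribers)
        if subscription.status == "cancelled"
    ]
```

The tuple carries the invariant ("this subscriber has a subscription") in the type, so the check is written once and not repeated at every call site.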
Buttondown's Python codebase currently sits at 38,103 lines. Upon the initial run of poetry run invoke typecheck [^4], mypy reported 822 errors. Ouch. Resolving those errors in full took me approximately eleven hours. I pulled out the metaphorical banhammer, annotating a file with # mypy: ignore-errors, only thrice:
mypy would endorse I ignored the entire thing.
That was around two solid engineer-days spread across two weeks (I was doing this whilst traveling, so around an hour or two every day). The work was an even 80/20 split:
Optional[float] and not a float!"; "oh, I need to express this as an Iterator and not an Iterable!");
mypy start as early as possible. Even if you need to litter your codebase with Any and # type: ignore annotations, the sooner you start the better.
Yes! As mentioned above, I don't think I'd advise folks to try a "big-bang"-style migration in the manner I did unless your codebase is sufficiently small; because I was working on this branch alongside other feature branches, churn was non-trivial and it would have made more sense to go package-by-package, starting with smaller and more reified interfaces and moving onward. One of the more common cliches about shifting towards type safety, as alluded to earlier, is the concept of "forcing you to think in types". An example of this is something like the below method that I had kicking around:
def send_draft(email, recipient): ...
"Recipient" is not a proper noun in Buttondown's codebase, and once I started adding types it became obvious that it was a bit of a chimera:
def send_draft(email: Email, recipient: Union[Subscriber, Account, SyntheticSubscriber]) -> None: ...
This need to declare an interface for something that "looks like a person with an email address" led to a number of arcane issues and duct-tape over the years: keeping audit logs of emails Buttondown’s sent to subscribers versus accounts is different, for instance. Just the act of writing out the contract made it much more obvious what the right behavior should be: rather than having an omnibus "send draft" method that tries to handle things differently, I refactored the logic to decompose the 'recipient' into a single email address, giving me something much simpler to reason about:
def send_draft(email: Email, email_address: str) -> None: ...
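A rough sketch of what that decomposition might look like at the call site. Everything here is assumed for illustration: the dataclasses are stand-ins for the real models, the email_address attribute on each of them is a guess, and send_draft_to and the sent list are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Subscriber:
    email_address: str


@dataclass
class Account:
    email_address: str


@dataclass
class SyntheticSubscriber:
    email_address: str


@dataclass
class Email:
    subject: str


Recipient = Union[Subscriber, Account, SyntheticSubscriber]

# Stand-in for an outbox so the sketch has an observable effect.
sent: List[str] = []


def send_draft(email: Email, email_address: str) -> None:
    """The narrow contract: a draft goes to one plain address."""
    sent.append(f"{email.subject} -> {email_address}")


def send_draft_to(email: Email, recipient: Recipient) -> None:
    """Callers resolve the recipient to an address at the edge."""
    send_draft(email, recipient.email_address)
```

The sending function no longer needs to know which kind of recipient it was handed; anything that differs per type (audit logs, say) can stay with the caller.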
That being said, "thinking about types" and reifying your interfaces are caviar problems. I like those things, but I (and likely you) am in a position where elegant abstractions are a luxury compared to the value proposition of writing safer code. To that end, I thought I'd end by talking about some specific, real-world (albeit silly!) bugs that mypy revealed for me:
1. AdminNewScheduledEmailNotifier pings me in Slack whenever a new email is scheduled. I pass in a ScheduledEmail but mistakenly declared the type as an Email, which is an object with slightly different properties. Notably, ScheduledEmail has schedule_date whereas Email has publish_date. mypy detected this, and found a code branch where I was not getting notified about newly scheduled emails for certain newsletters.
2. A dataclass field was declared as List[List[float]] whereas in fact it was a List[List[Optional[float]]]; this meant that while the Python side of things was fine (dataclasses do not throw if you pass in malformed data) my frontend assumptions of the returned data were not, and as a result mypy actually helped me fix a frontend bug that I had been nigh-unable to reproduce for months.
3. Some code called Subscriber.objects.create(user=user), where user is not actually an attribute on Subscriber. While this isn't a bug, it's certainly confusing, and can lead to serious issues down the line when I programmatically modify the codebase.
I'm writing this post a few weeks after I actually completed and shipped the migration, so as to provide space for a bit of a coda: now that I've actually done the dang thing, what does day-to-day development feel like? The answer is: more of the same, but with an additional guard rail. I'm writing code with very, very few optionals now unless a foreign key is involved, and precommit lets me know when I've missed an off-ramp somewhere. Plus, I get to write functional pipelines like the following:
# A function that pulls in archived emails from an external source such as WordPress, Hey World, or Tinyletter
def execute_online_import(
    retrieve_urls: Callable[[ArchiveImport], Iterable[str]],
    convert_response: Callable[[requests.Response], Email],
    archive_import: ArchiveImport,
) -> List[Email]:
    urls = retrieve_urls(archive_import)
    extant_email_subjects = typing.cast(
        List[str],
        Email.objects.filter(
            newsletter=archive_import.newsletter,
        ).values_list("subject", flat=True),
    )
    pipeline = pipe(
        requests.get,
        convert_response,
        partial(filter_extant_emails, extant_email_subjects=extant_email_subjects),
        partial(maybe_finalize_email, archive_import=archive_import),
    )
    emails = [pipeline(url) for url in urls]
    return [email.unwrap() for email in emails if email != Nothing]
Whereas before, Python made it a dangerous proposition to deal with partials and composition in this manner (what if convert_response doesn't map cleanly onto the arguments of filter_extant_emails!?), it's now safe.
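To make the "it's now safe" claim a little more concrete, here is a minimal sketch of a typed two-step composition. It is not the pipe used above, which comes from a library the post doesn't name; pipe2, count_words, and is_longer_than are hypothetical stand-ins written for this example.

```python
from functools import partial
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def pipe2(first: Callable[[A], B], second: Callable[[B], C]) -> Callable[[A], C]:
    """Compose two single-argument functions, left to right."""

    def composed(value: A) -> C:
        return second(first(value))

    return composed


def count_words(text: str) -> int:
    return len(text.split())


def is_longer_than(count: int, minimum: int) -> bool:
    return count > minimum


# partial() fixes the keyword argument, leaving a one-argument function.
is_long_post = pipe2(count_words, partial(is_longer_than, minimum=3))

print(is_long_post("hello world"))  # False
print(is_long_post("a much longer draft email"))  # True
```

Because the shared type variable B ties the output of the first step to the input of the second, a type checker should flag a pipeline whose steps are in the wrong order (for instance, passing count_words second, after a step that returns a bool).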
mypy, it certainly was an accelerating factor.
[^1]: Perhaps a more accurate comparison here would be with Sorbet, a type checker that sits on top of Ruby. But I am surmising that more people are familiar with TypeScript than with Sorbet, so there you go.
[^2]: I've since replaced PyCharm with VSCode. This is for two reasons, neither of which is PyCharm’s fault! VSCode's Vue ecosystem is really robust compared to JetBrains', and I use VSCode at my day job (in the rare occurrences when I code these days), so context-switching is minimal. Still, I heartily recommend PyCharm if you're interested in very, very strong integration with the Python ecosystem.
[^3]: "stubs" are a silly name for a useful concept that ideally should not exist. They refer to separately-published sets of type signatures for packages that themselves do not have type signatures. For instance, Django has made a conscious choice to not yet include type information in their package, so a stubs package, aptly titled django-stubs, consists solely of type signatures for Django itself.
[^4]: This is an incantation that may not look familiar. I use poetry for Python dependency management and Invoke for task execution.
Thank you to Sumana Harihareswara for proofreading this essay!