John's Data and Analytics Weekly

April 3, 2022

Will Airflow Win, SQL vs Everything, 20 years of programming

  • My favorite post of the week is Alex Ewerlöf’s guiding principles after 20 years of programming. Some of the most resonant:

    • Write code not for the machines but for colleagues, including the junior ones.
    • Deprecate yourself. Don’t be the go-to person for the code.
    • Any significant and rewarding piece of software is the result of collaboration.
    • Criticizing is fast and easy. Creating is slow and difficult.

  • A provocative article on why Deep Learning on Electronic Medical Records is doomed to fail. (We’re still doing it.)

  • Data warehouses vs. data lakes and SQL vs. everything else, both debated on this excellent a16z podcast, The Great Data Debate (originally aired in 2020).

  • Another post from a16z, Emerging Architectures for Modern Data Infrastructure, is especially relevant to the work our team is doing right now. What I found most thought-provoking:

    • The emerging concept of a “metrics layer, a system providing a standard set of definitions on top of the data warehouse”
    • Formalizing the concept of data platforms
    • What was missing from their discussion: security and API/access layers
  • Seattle Data Guy took a poll asking Will Airflow Win The Orchestration Race? Spoiler: our favorite, Prefect, came in a distant second.

  • From our neighbors at UNC, Project Oasis is an interactive and downloadable database of “locally focused digital news publications in the U.S. and Canada.”

  • Fifteen simple language-agnostic, actionable tips on REST API design.

  • What’s the right way to handle missing geolocation data? Soul Buoy on Null Island.

  • A nice discussion of pointers in Python. Turns out growing up on C has its benefits.
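
For anyone who did not grow up on C, here is a minimal sketch of the idea behind that last item. It is my own illustration, not taken from the linked post, and the function names are made up for the example. Python names behave loosely like pointers: assignment binds a second name to the same object, so mutating through one name shows up through the other, while rebinding a name leaves the original object alone.

```python
def append_in_place(items):
    """Mutates the object the caller's name refers to."""
    items.append(99)


def rebind_locally(items):
    """Rebinds only the local name; the caller's list is untouched."""
    items = items + [99]  # builds a new list object
    return items


a = [1, 2, 3]
b = a              # b is a second name for the same list, not a copy
b.append(4)        # visible through a as well

c = list(a)        # shallow copy: a new list object with equal contents
c.append(5)        # does not affect a

append_in_place(a)        # a is now [1, 2, 3, 4, 99]
d = rebind_locally(a)     # a is unchanged; d is a new list

print(a is b, a is c, a is d)
print(a)
print(d)
```

The `is` operator compares object identity, which is the closest everyday Python gets to comparing two pointers.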
