One Shot Learning
#8: This one is sincere and kind of meta.
Careers play out over very long arcs. I'm not thrilled with the "kids these days" rhetoric -- does the chapter close with a note on avocado toast? -- but in a chapter from Web Operations, Theo Schlossnagle writes:
Generation X (and even more so generation Y) are cultures of immediate gratification. I’ve worked with a staggering number of engineers that expect the “career path” to take them to the highest ranks of the engineering group inside 5 years just because they are smart. This is simply impossible in the staggering numbers I’ve witnessed. Not everyone can be senior. If, after five years, you are senior, are you at the peak of your game? After five more years will you not have accrued more invaluable experience? What then? “Super engineer”? Five more years? “Super-duper engineer.” I blame the youth of our discipline for this affliction. The truth is that there are very few engineers that have been in the field of web operations for fifteen years. Given the dynamics of our industry many elected to move on to managerial positions or risk an entrepreneurial run at things.
It feels as though our little discipline of data science is in a similar place, 9 years after these words were first written. Sure, statisticians and physicists and Peter Norvig all existed long before the 21st century, and Xerox PARC and Bell Labs employed many famous researchers, from Alan Kay to Yoshua Bengio. These folks exemplify the exception, though -- most of us are not industrial researchers, instead spending our days engaged in data cleaning, product development, project management, analytics, and software engineering, with brief respites of interesting applied scientific work mixed in. I suspect that this work did not exist before Marc Andreessen informed us that software was eating the world a little under a decade ago.
In a previous issue I alluded to my first role at a technology company. It had nothing to do with data work, at least superficially.
One of my first roles in 2011 carried responsibilities closer to “junior site reliability engineer.” I mean, I maintained a Graphite cluster.
Sure, I worked as an analyst for several years before this, and before joining full-time I interned as a junior MLE for four months. Interning was a fascinating experience! You can read about it in my project-recapping post from 2012! (Also, I got my dates flipped in that newsletter. I started my devops work in the summer of 2012. I regret the error, dear reader.)
But the idea that an employer would perpetually fund the research projects of a 26-year-old guy changing careers while studying part-time was a pipe dream. I wasn't employed by Google Brain; I had not completed years of research at a top academic institution. Enlightened managers know to task their interns with interesting and stimulating work, but that advice does not extend to interns who have just been hired on full-time. So that's how I ended up maintaining the Graphite cluster at my first tech startup.
To be fair, maintaining this cluster was an opportunity to provide some value to this company. We had recently signed deals with strict real-time service level agreements, and meeting those SLAs required understanding the performance characteristics of our services. So Graphite it was, deployed on a cluster managed by Ganglia (eventually migrated to AWS CloudFormation), persisting data in Whisper's RRD-style files, and reporting metrics in a little dashboard that makes matplotlib look Tufte-esque. Eventually I moved on to data work again: some software, a few prototyped models, just trying to deliver wherever I could.
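For flavor, here is roughly what that metric reporting amounted to -- a minimal sketch using Graphite's plaintext protocol, assuming a Carbon listener on the default port. The host and metric names below are made up; the real deployment details are long gone.

```python
import socket
import time

# Hypothetical host; Carbon's plaintext listener defaults to port 2003.
CARBON_HOST = "graphite.internal.example.com"
CARBON_PORT = 2003

def send_metric(path, value, timestamp=None):
    """Send one data point using Graphite's plaintext protocol:
    '<metric.path> <value> <unix-timestamp>\\n'."""
    timestamp = int(timestamp or time.time())
    line = f"{path} {value} {timestamp}\n"
    with socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# e.g. record a request latency so the SLA dashboard has something to draw
send_metric("api.search.latency_ms", 182.0)
```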
Reading tweets from Justin Gage and Ethan Rosenthal a few weeks ago reminded me of this experience:
Multiple Data Scientist friends of mine are unhappy at their jobs because they're unable to meaningfully impact the business, or they report into people who don't understand their path to impact https://t.co/NWq2Tck4X5
— Justin Gage (@jGage718) June 4, 2019
A thought I've had lately: companies hire a bunch of DS, they build the first version of critical DS things (recommendations, forecasting, etc...), but then the ROI on improving those things is too small due to small scale of the company, so the DS are left lacking impact.
— Ethan Rosenthal (@eprosenthal) June 4, 2019
The companies Ethan and Justin describe do not need data scientists beyond a certain point. Technical projects end, while businesses stay bottlenecked by non-technical problems: they stop growing their user base, they hire managers who underperform, their marketing funnel leaks, their sales teams miss their targets, or a major player consumes their addressable market.
If software cannot solve these problems, what the hell can data science or analytics do? Recommend new sales targets? Provide insight into operational inefficiencies? In some cases, sure! But it's hard to imagine a fulfilled hierarchy of needs for data work at a business exhibiting these fundamental issues.
The fact of the matter is, data science is a fraction of software, which is a fraction of business. Before a company launches a product driven by machine learning algorithms, it usually launches a product driven by business rules. This holds for both a tiny, ambitious startup and Google:
Machine learning is cool, but it requires data. Theoretically, you can take data from a different problem and then tweak the model for a new product, but this will likely underperform basic heuristics. If you think that machine learning will give you a 100% boost, then a heuristic will get you 50% of the way there.
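To make the heuristic-first point concrete, here is a toy sketch of the kind of business-rule baseline that often ships before any model does. The product, data shape, and function name are invented for illustration; nothing here comes from the Rules of Machine Learning themselves.

```python
from collections import Counter

def popular_items(purchase_log, k=5):
    """Business-rule 'recommender': just surface the k best sellers.
    No model, no features -- a heuristic a first launch can ship with."""
    counts = Counter(item for _, item in purchase_log)
    return [item for item, _ in counts.most_common(k)]

# purchase_log is (user_id, item_id) pairs from whatever event stream exists
log = [("u1", "widget"), ("u2", "gizmo"), ("u3", "widget"), ("u1", "gadget")]
print(popular_items(log, k=2))  # ['widget', 'gizmo']
```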
Before the company builds a product, it makes operational decisions, or targets business partners, or raises money to fund the venture's initial activities. It takes a long time before the company starts translating business logic into software, and longer still before it translates that into data-driven algorithms -- if the business even needs them. Back to the Rules of Machine Learning:
Before formalizing what your machine learning system will do, track as much as possible in your current system. Do this for the following reasons: (...) You will notice what things change and what stays the same. For instance, suppose you want to directly optimize one-day active users. However, during your early manipulations of the system, you may notice that dramatic alterations of the user experience don’t noticeably change this metric.
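As a toy illustration of that "track first" advice, here is roughly what counting one-day active users from an existing event log might look like, long before any model enters the picture. The event format is an assumption made up for the example.

```python
from datetime import datetime, timedelta

def one_day_active_users(events, as_of):
    """Count distinct users with any event in the 24 hours before `as_of`.
    Measuring this from raw logs is the instrumentation the rule describes."""
    cutoff = as_of - timedelta(days=1)
    return len({user for user, ts in events if cutoff <= ts <= as_of})

# events are (user_id, timestamp) pairs pulled from whatever logging already exists
now = datetime(2019, 6, 4, 12, 0)
events = [("u1", now - timedelta(hours=2)), ("u2", now - timedelta(hours=30))]
print(one_day_active_users(events, as_of=now))  # 1
```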
To this point, and to Ethan's earlier tweet, the first version of your model may be good enough. You may have coded yourself out of data work, if not out of a job altogether! In that case, moving on to a non-data project may be the most valuable action you can take at this company.
Writing this week's issue has been a relief, maybe even a bit therapeutic. I can articulate a fact about our work, and I can admit that I cannot propose great solutions to these situations. I'll instead end with another story.
At work these days I am especially engaged in the 90% of typical data science work that does not involve applied science or analytics. Someone asked me about this recently, roughly as follows:
"So you're just writing software and managing an integration?"
I try not to fall prey to the "affliction" Theo described earlier. Careers are long, and I'm lucky my past experience has positioned me to try and tackle some valuable non-data work right now. I do not know if that's ideal, but at least it's useful.