Hi there, it's me, Matt.
Before I dive into the essay, some personal news: today is my first day as director of design at Simple Health. Simple Health is a direct-to-consumer health company focused on affordable and accessible products. I'm really excited to put my time and energy into an important mission.
Designers need to take a hard look at our relationship with metrics. Are we measuring the right thing? What happens when we measure the wrong thing? In this essay, some thoughts on the latter question.
As usual, you can also read this essay on my website.
The attention economy is booming. While advertising, e-commerce, and social media are dominated by monopolies, the market for our attention is flush with competition. New movie studios, VR platforms, streaming services, and merchandisable cinematic universes are launched every day. They win Emmys. They win Oscars. They win Grammys and Golden Globes. But tech’s push into entertainment isn’t about awards. It’s about engagement.
As a word, engagement is meaningless. It’s a stand-in for any number of measurements collected by the tracking technology embedded into every website, app, and TV. For Google, engagement might mean the number of times an ad is seen. To Instagram, engagement is probably a complex formula comprising likes, comments, and follows. Netflix could count the number of hours the average person spends watching Friends as engagement.
A meaningless word, but a meaningful measurement; engagement drives algorithms. So engagement is closely monitored. Right now, product managers, designers, and engineers are planning, building, and shipping to drive more engagement. Those measurements feed the formulae that guide what shows we watch, what ads we see, and what products we buy.
The cycle of engagement feeds itself at a user’s expense. YouTube’s engagement numbers (hours watched, logged-in users) are skyrocketing. In May 2019, CEO Susan Wojcicki proudly announced that people watch more than 250 million hours of YouTube every day on their TV screens alone.1 A month later, Kevin Roose reported in the New York Times on a college dropout radicalized by YouTube’s recommendation algorithm. He was “pulled into a far-right universe, watching thousands of videos filled with conspiracy theories, misogyny and racism.”2
In 2017, Netflix flexed its muscle as an independent content producer. It debuted award-winning shows like Glow, Dark, 13 Reasons Why, Atypical, and Mindhunter. It delivered new seasons of critically acclaimed series like House of Cards, Orange Is the New Black, and Stranger Things. As of April 1st, 2017, Netflix was collecting over 27 million dollars in subscription fees every single day. To grow more, CEO Reed Hastings said, Netflix needed people to sleep less:
The market is just so vast. You know, think about it, when you watch a show from Netflix and you get addicted to it, you stay up late at night. You really — we’re competing with sleep, on the margin.3
Netflix got its wish. An analysis of National Health Interview Survey data from 2010 to 2018 shows that sleep deficiency among working American adults increased “significantly,” from 30.9% in 2010 to 35.6% in 2018.4
In 2016, Facebook was caught in its own engagement spiral. A group of small advertisers sued, claiming Facebook had given them misleading video metrics. Indeed, Facebook was overstating the amount of time people spent watching videos on its apps. The magnitude varies widely depending on who’s counting: Facebook says it inflated the numbers by only 60% to 80%, while the plaintiffs put the inflation at 150% to 900%. As the lawsuit progressed, internal documents showed that Facebook knew about the reporting “error” and did nothing to fix it.5 Facebook settled the lawsuit in 2019 for $40 million. Its advertising revenue in the second quarter of that year alone was $16.6 billion.6
A pattern is a common approach to solving a problem that is proven to work in practice. Conversely, an antipattern is a common approach to solving a problem that leaves us worse off than when we started. — Scott Ambler, Reuse Patterns and Antipatterns7
An antipattern is a solution to a common problem that seems obvious at first, but turns out to be wrong.
For example: premature optimization is an antipattern. When building a new product, it’s easy to imagine all the potential use cases and future challenges. Solving those problems and preparing your product for success at scale isn’t necessarily a bad thing. But doing too much too soon can slow you down.
An antimetric is a common measurement that leads us to make things worse than they already are.
Engagement is an antimetric. Measuring engagement has led Facebook, Netflix, and YouTube into the moral hazards outlined above. Each risked the health, well-being, and trust of their users and customers to increase engagement. In each case, engagement turned out to be a poor proxy for user value.
Just because someone watches more videos on YouTube doesn’t mean they are getting more value out of their interactions with the website. Just because users scroll further down their newsfeeds doesn’t mean they are making more meaningful connections with their friends and families.
Google wanted people to use Gmail more, so it measured engagement:
The Gmail team wanted to understand more about the level of engagement of their users. With the reasoning that engaged users should check their email account regularly … our chosen metric was the percentage of active users who visited the product on five or more days during the last week. We also found that this was strongly predictive of longer-term retention, and therefore could be used as a bellwether for that metric.8
What Google wanted to measure was long-term retention, a sign that users trust Gmail with their private communication. But long-term retention can’t be measured directly or tracked neatly on a dashboard, so Google measured their users’ habits instead.
An email habit might be a bellwether for Gmail’s retention numbers, but it’s also a predictor of stress and anxiety: a 2015 study showed that reducing email usage reduced stress, and in turn led to greater well-being.9 When a measurement is at odds with your users’ happiness, it’s an antimetric.
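For concreteness, here is a minimal sketch of how a five-of-seven-days measurement like the one Google describes could be computed. It is not Google's implementation. The log format (pairs of user ID and visit date), the function name, and the choice to count anyone with at least one visit in the window as an "active user" are all assumptions made for illustration.

```python
from datetime import date, timedelta


def five_of_seven_rate(visits, week_end):
    """Share of the week's active users who visited on 5+ distinct days.

    visits: iterable of (user_id, visit_date) pairs (hypothetical log format).
    week_end: last day of the 7-day window, inclusive.
    Assumption: an "active user" is anyone with at least one visit in the window.
    """
    week_start = week_end - timedelta(days=6)
    days_by_user = {}
    for user_id, day in visits:
        if week_start <= day <= week_end:
            # A set collapses repeat visits on the same day into one.
            days_by_user.setdefault(user_id, set()).add(day)
    if not days_by_user:
        return 0.0
    engaged = sum(1 for days in days_by_user.values() if len(days) >= 5)
    return engaged / len(days_by_user)


# Made-up data: "ana" visits on five days, "ben" twice on one day,
# and "cy" only before the window opens.
week_end = date(2020, 2, 9)
visits = [("ana", week_end - timedelta(days=i)) for i in range(5)]
visits += [("ben", week_end), ("ben", week_end)]
visits += [("cy", week_end - timedelta(days=10))]

rate = five_of_seven_rate(visits, week_end)
print(rate)  # 0.5: one of the two active users visited on 5+ days
```

Nothing in this calculation knows whether a visit left the user better or worse off. It counts days, and that is all it can do.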
There’s an application that’s been on my phone ever since I first installed it nearly eight years ago. I use it every day, although the company’s metrics probably don’t reflect that. It makes my life better even by sitting, unopened, in a folder. I’ve never paid for it, and it’s never shown me an ad.
It’s called Dark Sky. It’s a weather app.
It works quietly in the background, diligently checking the forecast. Every once in a while — once a week, if I’m lucky — it sends a notification. “Light rain starting in 10 minutes.” I usually don’t even tap on the notification. But I’ll grab an umbrella on my way out the door. “Rain stopping in 5 minutes.” I’ll wait to catch the next train. “Drizzle starting soon!” I’ll hurry up and walk the dog.
Dark Sky has the highest value-to-engagement ratio of any product I’ve ever used. I rely on it. There’s a correlation between that ratio and reliance, and I don’t think it’s a fluke.
When tech companies stop chasing engagement and start focusing on value, we’ll be able to forge a healthy co-existence with our apps, devices, games, and screens. Until then, we’re at the mercy of an antimetric.
1. Todd Spangler, “YouTube Now Has 2 Billion Monthly Users, Who Watch 250 Million Hours on TV Screens Daily,” Variety, published May 3, 2019, https://variety.com/2019/digital/news/youtube-2-billion-users-tv-screen-watch-time-hours-1203204267/.
2. Kevin Roose, “The Making of a YouTube Radical,” The New York Times, published June 8, 2019, https://www.nytimes.com/interactive/2019/06/08/technology/youtube-radical.html.
3. Peter Kafka, “Amazon? HBO? Netflix thinks its real competitor is... sleep,” Vox, published April 17, 2017, https://www.vox.com/2017/4/17/15334122/netflix-sleep-competitor-amazon-hbo.
4. Jagdish Khubchandani and James H. Price, “Short Sleep Duration in Working American Adults, 2010–2018,” Journal of Community Health, published September 5, 2019, https://link.springer.com/article/10.1007/s10900-019-00731-9.
5. Suzanne Vranica, “Advertisers Allege Facebook Failed to Disclose Key Metric Error for More Than a Year,” The Wall Street Journal, last updated October 16, 2018, https://www.wsj.com/articles/advertisers-allege-facebook-failed-to-disclose-key-metric-error-for-more-than-a-year-1539720524.
6. “Facebook Reports Second Quarter 2019 Results,” Facebook, published July 24, 2019, https://investor.fb.com/investor-news/press-release-details/2019/Facebook-Reports-Second-Quarter-2019-Results/default.aspx.
7. Scott Ambler, “Reuse Patterns and Antipatterns,” Dr. Dobb’s, published February 1, 2000, https://www.drdobbs.com/reuse-patterns-and-antipatterns/184414576?cid=Ambysoft.
8. Kerry Rodden, Hilary Hutchinson, and Xin Fu, “Measuring the User Experience on a Large Scale: User-Centered Metrics for Web Applications,” Google, accessed February 5, 2020, https://storage.googleapis.com/pub-tools-public-publication-data/pdf/36299.pdf.
9. Kostadin Kushlev and Elizabeth W. Dunn, “Checking email less frequently reduces stress,” Computers in Human Behavior, Volume 43, February 2015, https://www.sciencedirect.com/science/article/pii/S0747563214005810.