In Praise of the Ephemeral
Piles of Data Don't Add Up to AI, But They Do Add Up to Trouble
By Emily
The past couple of weeks have brought a few stories that drive home the toxicity of passive data collection, whether the accumulated swamp of data lives in corporate hands or stays local to the data subject's own devices. I want to highlight and reflect on two of them here.
Fast Company reported on CollegeVine, a company founded as a professional network connecting prospective college students with admissions offices, which is now launching an "AI recruiter". Apparently, the problem CollegeVine is now seeking to solve is that college admissions offices are overwhelmed with student queries. Wouldn't it be nice if some automated system could take care of some of them?
Fast Company asked Zack Perkins (founder of CollegeVine): "What were some of the most challenging questions that came up when you were building the recruiter?" and Perkins replied:
There were a number. On the student side, some of the big ones were: Should we say this is an AI? Should it feel like an AI? We landed on: Let’s be very upfront that this is an AI—and students actually do prefer the AI recruiter.
(It's surprising and telling to me that this is framed as a challenging question. I wonder what was in the "cons" column regarding transparency here. Also, while it's important to be upfront about the system being artificial, to let the students know they're not communicating directly with a person, I maintain it's misleading to call these things "AI"...)
There are many issues with this whole setup (Fast Company noted that the assistant they tested refused to answer questions "about campus administrators’ stance on Palestine and Israel" and that Perkins said it's because the system is "not allowed to discuss politics"), but the one I want to focus on here is what happens as data accumulates in these systems.
Perkins states that students are more comfortable in some circumstances with the automated system:
What we found is that early in the journey, students prefer the AI instead of a human because it’s an impartial third party that’s not judging them. They ask way more honest and genuine questions. One of the most common is something like: I have a 2.7 GPA—can I get in?
And that they're doing some kind of emotion detection (this is always a red flag):
There’s also layers of AI: We have fact-checking to make sure all the information is correct, and if the student seems stressed or concerned, that’s flagged and goes to the admissions team.
When Fast Company asked "What does the AI recruiter keep track of? Could a bad interaction with the AI impact whether or not a student gets in?" Perkins replied:
All the information the AI recruiter learns about a student can go into a student’s profile that the college has, but colleges today have a lot of restrictions on what they can actually use in the admissions decision, especially coming out of the recent Supreme Court case.
So in other words, we've got a setup where students are sharing more with an automated system than they'd feel comfortable sharing with a person; the extent to which their interactions are preserved (and subjected to automated analysis) is almost certainly not adequately conveyed; and the company's position on the question of how this might impact student admissions is that we should rely on admissions officers to follow the law. First off, that assumes that all relevant cases are covered under the laws alluded to. Secondly, the best way to ensure that things you're not allowed to consider don't impact decisions is to avoid (to the extent possible) gathering that information in the first place--and lulling students into sharing more than they otherwise would is the opposite of that.
The other example I wanted to cover quickly in this newsletter is Microsoft's recent "Recall" disaster. As reported in WIRED, this system (to launch on Windows 11 devices on June 18, according to June 8 reporting) takes a screenshot of the user's screen every five seconds and saves that data locally. The ostensible purpose of this massive surveillance is to allow users to do things like "search for recipes you’ve looked at online but whose websites you’ve forgotten."
This so-called "AI-assisted" feature of course isn't AI. Like everything else sold as such these days, it's statistical processing of lots and lots of data. But to do that data processing, you need to collect the data. And once the data is collected, it becomes an enormous liability. WIRED reports:
Included in what the database captures are screenshots of whatever is on your desktop—a potential gold mine for criminal hackers or domestic abusers who may physically access their victim’s device. Images include captures of messages sent on encrypted messaging apps Signal and WhatsApp, and remain in the captures regardless of whether disappearing messages are turned on in the apps.
This disaster of a feature was initially going to be turned on by default, though *SecurityWeek* reports that the backlash from security and privacy experts led Microsoft to reverse that decision. It's great that the backlash was swift and loud. It would be nice if we didn't have to constantly be on guard for nightmarish security decisions by Big Tech companies whose products are virtually impossible to avoid. Microsoft's blog post about this "feature" ends with some boilerplate about their Responsible AI Standard. It seems like if the folks developing and shipping this feature had actually thought it through for like five minutes, they'd see the issues with the "Reliability and Safety" and "Privacy and Security" of such a system.
For the rest of us, this serves as another important object lesson about resisting data collection. One of the costs of today's so-called "AI systems" is the sheer amount of data that we are asked to surrender. Even when it doesn't cost us anything (in this case, other than disk space) to create and store the data, we need to be sensitive to the costs of the data persisting, whether on our own hard drives or in the cloud.
To be ephemeral is to be lost almost immediately, but what do we call it when we lose the ability to let things be ephemeral?