May 23, 2025, 9 a.m.

The robots are not taking over: why AI is not replacing data analysts

The Pivot Pages

Last week, I tried to replace myself with AI. I faced a long list of data analysis projects and decided to do my best to get them all done with AI instead of writing the SQL and Python myself. Here’s how it went.

A promising start

The project: updating an existing dashboard, which is built in SQL and Python. I had two main tasks:

  1. Transform a column containing dates to instead display the day(s) of the week (e.g., swap “5/6/25, 5/7/25” for “Tuesday, Wednesday”)

  2. Parse two lists of aggregate text into separate columns for sorting and filtering (e.g., transform the text block “45% female, 35% male, 5% nonbinary, 15% unknown” into 4 columns with each percentage as a number).

I opened ChatGPT and started with: “Write a Python function that takes multiple dates and returns a list of the weekdays represented.” This gave me a handy function that did just that. Yay!

The first hurdle

Except… it’s never that simple. After applying that function to my column, I realized the logic for selecting those dates needed updating — we needed to grab not just the most recent date, but all dates that fell in the last week, or if there were none, the latest date before that. I asked ChatGPT to write me a SQL function that did this. It first suggested a SQL function (array_agg) that didn’t work with my database. I asked it to help debug the error and it suggested a different function (listagg), which I successfully swapped in.
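To make that selection rule concrete, here is a rough sketch of the logic in Python rather than SQL. It is not the query from my dashboard: the function name select_dates is made up for illustration, and I’m assuming “the last week” means a rolling 7-day window ending today.

```python
from datetime import date, timedelta


def select_dates(dates, today):
    """Return every date in the 7 days up to `today`.

    If there are none, fall back to the single latest earlier date.
    Assumption: "last week" means a rolling 7-day window.
    """
    cutoff = today - timedelta(days=7)
    recent = sorted(d for d in dates if cutoff < d <= today)
    if recent:
        return recent
    # Nothing in the window: take the most recent date before it, if any.
    earlier = [d for d in dates if d <= cutoff]
    return [max(earlier)] if earlier else []


sample = [date(2025, 4, 1), date(2025, 5, 6), date(2025, 5, 7)]
print(select_dates(sample, today=date(2025, 5, 9)))
print(select_dates(sample, today=date(2025, 6, 1)))
```

The first call returns both May dates because they fall inside the window; the second returns only the latest earlier date because the window is empty.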

Next, I realized my function worked great for individual dates (e.g., 5/6/25 to “Tuesday”) but for lists of multiple dates, it returned “invalid date”. I started typing out a prompt to explain what it was doing wrong and what the correct result would be. Halfway through writing that prompt, I realized it would be faster to update the function manually, to split up lists of dates and nest a for-loop to handle them neatly.
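For anyone curious, a manual fix along these lines might look something like the sketch below. This is a reconstruction, not the exact code from my dashboard: the name weekdays_from_dates is hypothetical, and I’m assuming the dates arrive as a comma-separated string in month/day/two-digit-year format.

```python
from datetime import datetime

# Fixed English names so the output does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekdays_from_dates(text, fmt="%m/%d/%y"):
    """Turn a string like "5/6/25, 5/7/25" into "Tuesday, Wednesday"."""
    names = []
    # Split the list first, then convert each date on its own.
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        name = DAY_NAMES[datetime.strptime(part, fmt).weekday()]
        if name not in names:  # keep each weekday once, in first-seen order
            names.append(name)
    return ", ".join(names)


print(weekdays_from_dates("5/6/25, 5/7/25"))  # Tuesday, Wednesday
```

Splitting before parsing is the whole fix: a single date and a list of dates go through the same loop, so neither case produces “invalid date”.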

The bigger challenge

Next, I asked ChatGPT to write a Python function that takes the aggregate text block and returns a table with each piece of information in its own column. It wrote a very helpful function that used a regular expression to parse the text and did just that. This part definitely saved me some time. I made a few tweaks to save this information to the same dataframe instead of creating a new one, and boom, done.
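Here is a minimal sketch of that kind of parser, assuming the text always follows the “value% label” pattern from my example. It is not the function ChatGPT produced; the names VALUE_FIRST and parse_value_first are hypothetical, and it returns a plain dictionary that could then be expanded into dataframe columns.

```python
import re

# A number, a percent sign, then a one-word label, e.g. "45% female".
VALUE_FIRST = re.compile(r"(\d+(?:\.\d+)?)\s*%\s*([A-Za-z]+)")


def parse_value_first(text):
    """Map each label to its percentage as a number."""
    return {label.lower(): float(value)
            for value, label in VALUE_FIRST.findall(text)}


print(parse_value_first("45% female, 35% male, 5% nonbinary, 15% unknown"))
```

Each key in the result becomes a candidate column name, and each value is a number that can be sorted and filtered.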

Next, I had a similar task. In this column, the information was flipped: the categories were first, and the values were second, like this: “Female: 45%, Male: 35%, Nonbinary: 5%, Unknown 15%”. I asked ChatGPT to write a similar function to handle this column. It did so, but the values in the resulting split columns were all zeroes. After reading the code it produced, I noticed it had kept the key and value order from the previous function instead of flipping it. This was a simple manual fix in the code, and another case where pinpointing the issue well enough to describe it in a prompt takes more time, more background knowledge, or both.
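A sketch of the flipped version, under the same caveats: the names LABEL_FIRST and parse_label_first are hypothetical, and this is an illustration rather than the code from my dashboard. The only real change is that the label is captured first and the value second. I’ve also assumed the colon can be missing, as it is for “Unknown” in the example.

```python
import re

# A one-word label, an optional colon, then a number and a percent sign.
LABEL_FIRST = re.compile(r"([A-Za-z]+)\s*:?\s*(\d+(?:\.\d+)?)\s*%")


def parse_label_first(text):
    """Map each label to its percentage when the label comes first."""
    # Note the unpacking order: label first, value second.
    return {label.lower(): float(value)
            for label, value in LABEL_FIRST.findall(text)}


print(parse_label_first("Female: 45%, Male: 35%, Nonbinary: 5%, Unknown 15%"))
```

Getting the unpacking order wrong here is exactly the kind of bug that is quick to spot when reading the code and slow to describe in a prompt.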

Are robots taking over?

At the end of all of this, I cleaned up the dashboard to make the columns easy to read, reduce scrolling, and remove unnecessary columns, because AI couldn’t do that part for me.

I estimate that AI made me about 5-10% faster at this set of tasks, with most of the time savings coming from the function to parse the text blocks correctly. Which, as a person with a long to-do list, I appreciate! That’s probably similar to the performance boost you get from learning to use Excel formulas instead of using Excel without them. It’s great to have AI as another tool, but this experience reinforced for me that new skills are needed to use it effectively:

  • Provide a clear prompt, along with sample input and an example of the desired output

  • Give sufficient context: if I’d done this in my second prompt, I wouldn’t have gotten a result that didn’t work with my specific database

  • Understand error messages enough to either resolve the error or craft additional prompts to describe the error effectively enough to get a resolution

For anyone learning new data skills right now, will your learning journey be different from that of someone who learned before AI? Absolutely. You’ll need to do a little less memorizing of exact syntax and function names. But in order to analyze data with the help of AI tools, you’ll need to do just as much, if not more, learning about how a database works, how to troubleshoot errors, and how to think systematically. Thankfully, those topics are a lot more interesting, but we need learning materials focused more deeply on them. More to come on this, and if you know of great resources for learning these topics, send ‘em my way.

You just read issue #3 of The Pivot Pages. You can also browse the full archives of this newsletter.
