OpenAI released its latest model, GPT-5, last week. Beth spent a few days checking what has (or hasn't) improved when it comes to using it for code development.
She says:
I spent the weekend testing out GPT-5 with Cursor. The long and short of it: I was not impressed.
Previous versions were able to parse images more efficiently. I provided a new colour palette for my app, but at least 40% of the time the tool could not tell whether it was using the right colour. I finally asked it if the icon colours were consistent and it found the issue. When its fix didn’t work, it gave me the new code so I could write a more precise prompt.
It successfully added a sidebar without issue but still can’t sort out tabs. Seriously, why are tabs hard? When I provide a specific framework and colour scheme, it still deviates, which means I have to use additional tokens to get it back in check. From a coding point of view, it was pretty lackluster.
The last thing I’ll say is this: GPT-5 appeared to time out on me a few times over the weekend, although it’s hard to tell if it was Cursor or GPT-5. The lack of messaging when the tools go down or lag is starting to feel unacceptable to me. We are on version 5.
I would expect the product teams to have a sense of usage, to have implemented appropriate messaging and incident response, and to understand that concurrent use increases the first weekend after a big announcement. Maybe I’ve worked in DevOps for too long, but as the prices go up there is an expectation that scalability and reliability will also increase.
All that to say, I didn’t find it a more effective use of tokens; the same issues exist from previous versions. As Public Enemy says, “Don’t believe the hype.”
Here's what things looked like before and after.
Have you tried GPT-5 yet? Let us know your thoughts.
See you next time!