Evaluation & AI Reality Check: Newsletter #3
This is the third instalment of my newsletter. I have been intensely focused on evaluating models for various tasks. There are loads of evaluation models out there, and I have gone through several of them. It is an evolving field, and let’s just say we are not there yet.
Hong Kong Fintech
The plan was to attend the Hong Kong Fintech Festival from today, Monday, 28/10/2024.
Because a tropical storm is hitting Da Nang, I will only arrive at the event on Monday afternoon.
I hope to catch up on the second day, at least.
You can connect via Telegram to meet or talk to me.
Evaluating LLMs
Benchmark tests help to rank models next to each other. But if you dive deep into the tests, they don't mean that much. Let's say, "Knowing facts is not the same as understanding them." Here is my first article on the topic. I am working on evaluation models for programming next.
Some news that caught my eye
I read an excellent article comparing the dot-com bubble and the AI bubble. Lots of people are cautioning against an overly optimistic outlook for AI. In the end, will these massive investments in nuclear power plants and big data centres pay off?
As you can see, I am talking about investments. AI will have use cases and benefits in our daily lives. But be smart about where to spend your money and consider the “realistic” ROI.
Let’s put our tinfoil hats on. The internet boomed with Google, Facebook, Twitter, LinkedIn, and other social media companies. Their business model is selling your data to advertisers. Companies like OpenAI use copyrighted data and scrape the internet and social media to train their models. What would they be excellent at? Surveillance, of course! At least in China, you know this is happening. Will it be as transparent elsewhere?