Throwback Thursday for 2025-02-06
Throwback to January 2025: the costly drama of distillation vs. hyperscale AI models!
Full archives available at https://fudge.org
Thank you NexusTek for sponsoring!
And a big thanks to all our sponsors!
(Scroll to the end…)
One headline for a Thursday Throwback caught my eye.
Techmeme: Stanford and University of Washington AI researchers claim they trained AI reasoning model s1, distilled from a Gemini 2.0 model, for under $50 in cloud compute (Maxwell Zeff/TechCrunch)
Back in the ancient AI times of January 2025, the distillation meme was causing a lot of hand-wringing, though it probably cost more than $50 to unseat models trained at hyperscaler scale.
Techmeme: The Allen Institute for AI releases Tulu 3 405B, an open source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks (Kyle Wiggers/TechCrunch)