🔙 Throwback Thursday for 2025-02-06

        February 8, 2025

Throwback to January 2025: the costly drama of distillation vs. hyperscale AI models!

                🆕 Full archives available at https://fudge.org
Thank you NexusTek for sponsoring!
🙏 And a big thanks to all our sponsors! 🙏
(Scroll to the end…)
One headline for a Thursday Throwback caught my eye.

Techmeme: Stanford and University of Washington AI researchers claim they trained AI reasoning model s1, distilled from a Gemini 2.0 model, for under $50 in cloud compute (Maxwell Zeff/TechCrunch)
By Maxwell Zeff / TechCrunch. View the full context on Techmeme.

Back in the ancient AI times of January 2025, the distillation meme was causing lots of hand-wringing, but unseating hyperscaler-size trained models probably cost more than $50.

Techmeme: The Allen Institute for AI releases Tulu 3 405B, an open source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks (Kyle Wiggers/TechCrunch)
By Kyle Wiggers / TechCrunch. View the full context on Techmeme.


Hot Fudge Daily
