April-May 2023 Updates
Hi everyone,
New posts:
- After years of dismissing AI alignment as a bs field, I've come to think that it's finally turning into real science (and we should really start making progress on it soon): AI Alignment Is Turning from Alchemy Into Chemistry
- A simple two-sentence prompt can't possibly jailbreak both GPT-4 and Claude, can it? And even if it did, surely it would be easy to patch the vulnerability that made it work, right? Well… A Two-Sentence Jailbreak for GPT-4 and Claude & Why Nobody Knows How to Fix It
Links:
- Statement on AI Risk by Dan Hendrycks's Center for AI Safety, signed by Demis Hassabis, Sam Altman, Dario Amodei, Bill Gates, etc.: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
- Yudkowsky abandons alignment research 🙀
- Sam Altman's Productivity
- “Widlar…engaged in office-destroying outbursts so prolonged and explosive that Valentine describes him as ‘certifiable.’…[when] calm, he designed analog circuits that…accounted for ~75% of the worldwide market.” “Bob steeled me from trying to homogenize great technical genius”
Have a great rest of your month & as always feel free to reach out!
Stay frosty,
Alexey (...and a note from the 19yo Alexey on what's most important long-term)