April-May 2023 Updates
Hi everyone,
New posts:
- After years of dismissing AI alignment as a bs field, I've come to think that it's finally turning into real science (and we should really start making progress on it soon): AI Alignment Is Turning from Alchemy Into Chemistry
- A simple two-sentence prompt can't possibly jailbreak both GPT-4 and Claude, can it? And even if it did, surely it would be easy to patch the vulnerability that made it work, right? Well… A Two-Sentence Jailbreak for GPT-4 and Claude & Why Nobody Knows How to Fix It
Links:
- Statement on AI Risk by Dan Hendrycks's Center for AI Safety, signed by Demis Hassabis, Sam Altman, Dario Amodei, Bill Gates, etc.: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
- Yudkowsky abandons alignment research 🙀
- Sam Altman's Productivity
- “Widlar…engaged in office-destroying outbursts so prolonged and explosive that Valentine describes him as ‘certifiable.’…[when] calm, he designed analog circuits that…accounted for ~75% of the worldwide market.” “Bob steeled me from trying to homogenize great technical genius”
Have a great rest of your month & as always feel free to reach out!
Stay frosty,
Alexey (...and a note from the 19yo Alexey on what's most important long-term)