This Week in Responsible AI

December 28, 2023

This Week in Responsible AI: Dec 28, 2023

Trustworthy AI

  • 'Safeguards like filters, system prompts, and rejection sampling mechanisms can often be disabled by changing a few lines of code. Fine-tuning is also relatively trivial to reverse; one study showed that fine-tuning for various Llama models could be undone for under $200.' (A minimal illustrative sketch follows this list.)
  • Evaluation & Hallucination Detection for Abstractive Summaries
  • On Specifying for Trustworthiness
  • "a plausible explainer, i.e, one that predicts accurately, does not necessarily produce faithful explanations, and vice versa."

Law / Policy

  • 'In the case of LLM outputs, there is neither a speaker, nor communication of any message, nor any meaning that is not supplied by the text recipient. I conclude that LLM texts cannot be considered protected speech, which vastly simplifies their status under defamation law.'
  • UN Interim Report: Governing AI for Humanity
  • New York Times sues Microsoft, ChatGPT maker OpenAI over copyright infringement

Bias

  • Data Bias Management
  • A Call For A Systemic Dismantling: These Women Refuse To Be Hidden Figures In The Development Of AI

Other

  • The Year That A.I. Came for Culture
  • Why Are Lawyers Afraid of AI?

Compiled by Leif Hancox-Li
