This Week in Responsible AI

December 28, 2023

This Week in Responsible AI: Dec 28, 2023

Trustworthy AI

  • 'Safeguards like filters, system prompts, and rejection sampling mechanisms can often be disabled by changing a few lines of code. Fine-tuning is also relatively trivial to reverse; one study showed that fine-tuning for various Llama models could be undone for under $200.' (A minimal illustrative sketch follows this list.)
  • Evaluation & Hallucination Detection for Abstractive Summaries
  • On Specifying for Trustworthiness
  • "a plausible explainer, i.e, one that predicts accurately, does not necessarily produce faithful explanations, and vice versa."

Law / Policy

  • 'In the case of LLM outputs, there is neither a speaker, nor communication of any message, nor any meaning that is not supplied by the text recipient. I conclude that LLM texts cannot be considered protected speech, which vastly simplifies their status under defamation law.'
  • UN Interim Report: Governing AI for Humanity
  • New York Times sues Microsoft, ChatGPT maker OpenAI over copyright infringement

Bias

  • Data Bias Management
  • A Call For A Systemic Dismantling: These Women Refuse To Be Hidden Figures In The Development Of AI

Other

  • The Year That A.I. Came for Culture
  • Why Are Lawyers Afraid of AI?

Compiled by Leif Hancox-Li
