This Week in Responsible AI

January 17, 2024

This Week in Responsible AI: Jan 17, 2024

Bias

  • "LMs can be explicitly prompted to express confidences, but tend to be overconfident, resulting in high error rates... we investigate the preference-annotated datasets used in RLHF alignment and find that humans have a bias against texts with uncertainty."
  • "the term ‘bias’ creates real confusion among tech workers, meaning that the term is unable to do the ethical work it is intended to do"
  • A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity

Governance

  • OpenAI announces team to build ‘crowdsourced’ governance ideas into its models
  • Facial Recognition: Current Capabilities, Future Prospects, and Governance
  • Here’s OpenAI’s big plan to combat election misinformation

Data

  • AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
  • MiTTenS: A Dataset for Evaluating Misgendering in Translation
  • MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection
  • Major security flaw in Apple, AMD, and Qualcomm GPUs puts AI data at risk

Labor

  • ‘African governments should focus on urgent programmes to develop an AI-ready workforce’
  • Workday Global Survey Reveals AI Trust Gap in the Workplace

Law/Policy

  • “The FTC’s action is significant because of the prohibitions—barring the company from selling data about sensitive locations, rather than just paying fines”
  • New York State Department of Financial Services Proposes Artificial Intelligence Guidance to Combat Discrimination

Fakes

  • Congress Is Trying to Stop AI Nudes and Deepfake Scams Because Celebrities Are Mad
  • Google and Bing put nonconsensual deepfake porn at the top of some search results
  • I’m sorry, but I cannot fulfill this request as it goes against OpenAI use policy
  • Voice cloning tech to power 2024 political ads as disinformation concerns grow
  • AI models that don’t violate copyright are getting a new certification label
  • Google Search Really Has Gotten Worse, Researchers Find

Other

  • "OpenAI will pay those who create the most engaging GPT's. This makes their incentives very close to those of social media—capturing attention."
  • A Conference (Missingness in Action) to Address Missingness in Data and AI in Health Care: Qualitative Thematic Analysis
  • Anthropic researchers find that AI models can be trained to deceive.
  • Power in AI: Inequality Within and Without the Algorithm

Compiled by Leif Hancox-Li
