January 13, 2024
Our AppSec team is sharing things they wish they'd known before using Semgrep. Follow our seven-step plan for bootstrapping this static analysis tool, which is supported in over 30 programming languages. https://t.co/XXJE1QzBJX
— Trail of Bits (@trailofbits) January 12, 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
— Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr) January 12, 2024
abs: https://t.co/wNneQxSSeF
This @AnthropicAI paper demonstrates the training of backdoored models that, when triggered, involve switching from writing safe code to inserting code vulnerabilities.… pic.twitter.com/ApzY95BTCC
Why does putting this invisible Unicode garbage into the LLM even work? Tokenizers.
— Rich Harang (@rharang) January 12, 2024
When the LLM gets it, the tokenizer splits the mangled text into the 'tag' characters and the original character. You end up with a sequence of 'tags-token-tags-token-tags-token' token ids. 1/ https://t.co/nlQtCCUiVy
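For readers who want to see what the "invisible Unicode garbage" looks like in practice, here is a minimal sketch. It assumes the hidden text uses the Unicode Tags block (U+E0000 to U+E007F), where U+E0020 to U+E007E mirror printable ASCII; the function names `to_tags`, `from_tags`, and `strip_tags` are hypothetical helpers for illustration, not part of any tool mentioned in these tweets. It does not reproduce tokenizer behavior, only the character mapping and one possible way to filter it out.

```python
# Sketch only: assumes the hidden payload is built from the Unicode Tags block.
TAG_BASE = 0xE0000   # start of the Tags block
TAG_END = 0xE007F    # end of the Tags block


def to_tags(text: str) -> str:
    """Map each printable ASCII character to its (normally invisible) tag character."""
    return "".join(chr(TAG_BASE + ord(c)) for c in text if 0x20 <= ord(c) <= 0x7E)


def from_tags(text: str) -> str:
    """Recover the ASCII text carried by tag characters, ignoring everything else."""
    return "".join(
        chr(ord(c) - TAG_BASE)
        for c in text
        if TAG_BASE + 0x20 <= ord(c) <= TAG_BASE + 0x7E
    )


def strip_tags(text: str) -> str:
    """Drop every character in the Tags block, keeping the visible text."""
    return "".join(c for c in text if not (TAG_BASE <= ord(c) <= TAG_END))


if __name__ == "__main__":
    visible = "What is this?"
    hidden = to_tags("ignore this")
    pasted = visible + hidden          # usually renders the same as `visible`
    print(len(visible), len(pasted))   # lengths differ even if the text looks identical
    print(from_tags(pasted))           # the hidden payload
    print(strip_tags(pasted))          # the sanitized text
```

Stripping the whole block before text reaches a model is one possible mitigation under these assumptions; it would also remove legitimate uses of tag characters, such as emoji flag tag sequences.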
I revise my earlier tweet about this being the best prompt injection content in a few weeks. This is the biggest breakthrough and security issue since prompt injection itself.
— Joseph Thacker (@rez0__) January 11, 2024
Here's why:
- It's invisible
- It's near impossible to fix
That's only 2 things but the fact it is… https://t.co/cdHDa9ZIcw
PoC: LLM prompt injection via invisible instructions in pasted text pic.twitter.com/AY9HLzT2zB
— Riley Goodside (@goodside) January 11, 2024
Thread by @goodside on Thread Reader App
@goodside: PoC: LLM prompt injection via invisible instructions in pasted text Each prompt contains three sections: 1. An arbitrary question from the user about a pasted text (“What is this?”) 2. User-visible pasted...
👀
— Jayson E. Street 💙 🤗💛 Hacker - Helper - Human (@jaysonstreet) January 13, 2024
"Juniper warns of critical RCE bug in its firewalls and switches" https://t.co/UmPE1sV6wp
I refuse to believe this is real. pic.twitter.com/G9jJrgNqOI
— RandomSprint (@RandomSprint) January 12, 2024
Decoding a ROM From a Picture of the Chip https://t.co/vutNkAGRkP
— hackaday (@hackaday) January 12, 2024
The problem with this study is that it may be obsolete. The problem is propaganda, which can be made with disinformation, misinformation, but also true information, not included in the study. This may undermine the relevance of this chart. #davos #wef24 @wef pic.twitter.com/3LVIj5sLzt
— Lukasz Olejnik, Ph.D, LL.M (@lukOlejnik) January 12, 2024
UK-Ukraine agreement on security cooperation. "work on ensuring a sustainable force capable of defending Ukraine ... through ... provision of security assistance and modern military equipment, across the land, air and sea, space and cyber domains" https://t.co/TXKFU2KxI9 pic.twitter.com/l1GhiX8dM8
— Lukasz Olejnik, Ph.D, LL.M (@lukOlejnik) January 12, 2024