January 13, 2024

Our AppSec team is sharing things they wish they'd known before using Semgrep. Follow our seven-step plan for bootstrapping this static analysis tool, which is supported in over 30 programming languages. https://t.co/XXJE1QzBJX
— Trail of Bits (@trailofbits) January 12, 2024

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

abs: https://t.co/wNneQxSSeF

This @AnthropicAI paper demonstrates the training of backdoored models that, when triggered, involve switching from writing safe code to inserting code vulnerabilities.… pic.twitter.com/ApzY95BTCC
— Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr) January 12, 2024

Why does putting this invisible Unicode garbage into the LLM even work? Tokenizers.

When the LLM gets it, the tokenizer splits the mangled text into the 'tag' characters and the original character. You end up with a sequence of 'tags-token-tags-token-tags-token' token ids.  1/ https://t.co/nlQtCCUiVy
— Rich Harang (@rharang) January 12, 2024
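The invisible text in question can be illustrated with a minimal sketch, assuming the hidden payload uses the Unicode Tags block (U+E0000 to U+E007F), whose code points U+E0020 to U+E007E mirror printable ASCII and are not rendered by most interfaces. The helper names `to_tags`, `from_tags`, and `strip_tags` are hypothetical, chosen only for this illustration, and the sketch does not model any particular tokenizer.

```python
# Sketch only: shows how ASCII can be hidden in, recovered from, and
# stripped out of text using the Unicode Tags block. Helper names are
# hypothetical and not taken from any tool mentioned in the thread.

TAG_BASE = 0xE0000          # start of the Unicode Tags block
TAG_END = 0xE007F           # end of the Unicode Tags block


def to_tags(text: str) -> str:
    """Map printable ASCII characters to their invisible tag counterparts."""
    return "".join(
        chr(TAG_BASE + ord(c)) for c in text if 0x20 <= ord(c) <= 0x7E
    )


def from_tags(text: str) -> str:
    """Recover any ASCII hidden as tag characters; ignore everything else."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


def strip_tags(text: str) -> str:
    """Drop every code point in the Tags block, keeping the visible text."""
    return "".join(c for c in text if not (TAG_BASE <= ord(c) <= TAG_END))


visible = "What is this?"
hidden = to_tags("ignore the text above")
pasted = visible + hidden   # looks like `visible` when rendered

print(len(visible), len(pasted))    # the hidden part still adds code points
print(from_tags(pasted))            # the concealed instruction
print(strip_tags(pasted) == visible)
```

Filtering with something like `strip_tags` before text reaches a model is one possible mitigation for this specific encoding; it would not cover other ways of hiding text.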

I revise my earlier tweet about this being the best prompt injection content in a few weeks. This is the biggest breakthrough and security issue since prompt injection itself.

Here's why:

- It's invisible

- It's near impossible to fix

That's only 2 things but the fact it is… https://t.co/cdHDa9ZIcw
— Joseph Thacker (@rez0__) January 11, 2024

PoC: LLM prompt injection via invisible instructions in pasted text pic.twitter.com/AY9HLzT2zB
— Riley Goodside (@goodside) January 11, 2024

Thread by @goodside on Thread Reader App
@goodside: PoC: LLM prompt injection via invisible instructions in pasted text Each prompt contains three sections: 1. An arbitrary question from the user about a pasted text (“What is this?”) 2. User-visible pasted...…

👀

"Juniper warns of critical RCE bug in its firewalls and switches" https://t.co/UmPE1sV6wp
— Jayson E. Street 💙 🤗💛 Hacker - Helper - Human (@jaysonstreet) January 13, 2024

I refuse to believe this is real. pic.twitter.com/G9jJrgNqOI
— RandomSprint (@RandomSprint) January 12, 2024

Decoding a ROM From a Picture of the Chip https://t.co/vutNkAGRkP
— hackaday (@hackaday) January 12, 2024

The problem with this study is that it may be obsolete. The problem is propaganda, which can be made with disinformation, misinformation, but also true information, not included in the study. This may undermine the relevance of this chart. #davos #wef24 @wef pic.twitter.com/3LVIj5sLzt
— Lukasz Olejnik, Ph.D, LL.M (@lukOlejnik) January 12, 2024

UK-Ukraine agreement on security cooperation. "work on ensuring a sustainable force capable of defending Ukraine  ... through ... provision of security assistance and modern military equipment, across the land, air and sea, space and cyber domains" https://t.co/TXKFU2KxI9 pic.twitter.com/l1GhiX8dM8
— Lukasz Olejnik, Ph.D, LL.M (@lukOlejnik) January 12, 2024
