Berlin Bassline Brief #2: Apple MIE, VMware, NPM & GitHub, MTP, Alignment gap, Honeytokens
"Exploring an Apple Memory Integrity Enforcement vulnerability claim, NPM and GitHub exploitation, language model reasoning gaps, and MTP-related updates"
"What does it mean, say the words, that the earth is so beautiful? And what shall I do about it?" - Mary Oliver
Security, Apple platforms, and the weekly "is Mythos real?"
SV security research firm Calif says it used Mythos to find an Apple vulnerability that allows a Memory Integrity Enforcement bypass (skepticism is reasonable until something is published): https://www.heise.de/en/news/Security-Firm-Claude-Mythos-Discovers-macOS-Exploit-11295068.html
Bonus Apple Security
If you use VMware Fusion on your systems, please patch: https://nvd.nist.gov/vuln/detail/CVE-2026-41702
Security, General
Generally, discussing NPM drama as someone whose focus isn't supply chain security feels to me like tipping everyone off to a great songwriter I just discovered named Taylor Swift – the phenomenon is sufficiently covered, and who could still be in the dark? And yet: look what TeamPCP compromising a maintainer with over 500 packages made me do.
https://opensourcemalware.com/blog/teampcp-compromises-npm-maintainer-with-over-540-packages (h/t @mccartypaul)
That was followed by a compromise of GitHub itself, presumably via one of the malicious VS Code extensions reported on last week (I knew they were trouble): https://www.bleepingcomputer.com/news/security/github-confirms-breach-of-3-800-repos-via-malicious-vscode-extension/
Interesting Paper
I usually choose something published in a peer-reviewed journal, not because journals are the only source of good papers, but because we share a stake in demonstrating good judgement and supporting serious work. I enjoyed this one, though, so here comes an arXiv-only publication entitled “Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions” by Sushrita Rakshit, Hanwen Zhang, and Hua Shen, about their (perhaps slightly corny-name-having) frameworks for auditing the well-measured gap between what an LLM states about its values and the actions it actually takes:
[2605.09893] Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions
Large language models (LLMs) are often evaluated based on their stated values, yet these do not reliably translate into their actions, a discrepancy termed "value-action gap." In this work, we argue that this gap persists even under explicit reasoning, revealing a deeper failure mode we call "Pseudo-Deliberation": the appearance of principled reasoning without corresponding behavioral alignment. To study this systematically, we introduce VALDI, a framework for measuring alignment between stated ...
Interesting Tool (GitHub edition)
llama.cpp https://github.com/ggml-org/llama.cpp and oMLX https://github.com/jundot/omlx had releases this week supporting multi-token prediction (https://arxiv.org/pdf/2507.11851v1) for significantly faster inference on local large language models that support it. Real-world tokens/s gains sound highly variable depending on model, model species, and task; I’ve heard everything from 10% faster to more than twice as fast. For offline workflows, I wouldn’t turn down even a 10% speed improvement.
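To put the 10%-to-2x range in wall-clock terms, here is a back-of-the-envelope sketch. The job size and baseline throughput are made-up illustration values, not measurements from either project; the only real point is that a throughput speedup of s cuts run time by 1 - 1/s, so "10% faster" saves roughly 9% of the time and "twice as fast" saves half.

```python
def wall_clock_hours(total_tokens: int, tokens_per_second: float) -> float:
    """Hours needed to generate total_tokens at a steady throughput."""
    return total_tokens / tokens_per_second / 3600


def time_saved_fraction(speedup: float) -> float:
    """Fraction of run time saved when throughput is multiplied by speedup."""
    return 1 - 1 / speedup


if __name__ == "__main__":
    # Hypothetical overnight batch job; both numbers are assumptions.
    job_tokens = 1_000_000
    baseline_tps = 20.0

    for speedup in (1.0, 1.1, 2.0):
        hours = wall_clock_hours(job_tokens, baseline_tps * speedup)
        saved = time_saved_fraction(speedup)
        print(f"{speedup:.1f}x: {hours:.1f} h ({saved:.0%} less time)")
```

On a job that runs for hours unattended, even the low end of that range is time back for free.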
Interesting Tool (conceptual toolkit edition)
Plant Honeytokens to Detect Intrusions: https://zeltser.com/plant-honeytokens (h/t @lennyzeltser)
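For anyone new to the idea, a minimal sketch of the general concept (not taken from the linked article): a honeytoken is a decoy value that no legitimate process ever uses, so any sighting of it is worth an alert. The `DECOY-` prefix, the function names, and the log format below are all hypothetical; in practice the decoy would be planted somewhere tempting and the check would run wherever your logs already land.

```python
import secrets


def make_honeytoken(prefix: str = "DECOY-") -> str:
    """Return a random decoy string; the prefix is an arbitrary illustration."""
    return prefix + secrets.token_hex(8).upper()


def find_honeytoken_hits(log_lines, honeytokens):
    """Return (line_number, token) pairs for each log line containing a decoy."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for token in honeytokens:
            if token in line:
                hits.append((number, token))
    return hits


if __name__ == "__main__":
    token = make_honeytoken()
    # Hypothetical log lines; the second one shows someone trying the decoy.
    logs = [
        "GET /health 200",
        f"POST /login key={token} 401",
    ]
    for number, hit in find_honeytoken_hits(logs, [token]):
        print(f"ALERT: decoy {hit} seen on log line {number}")
```

Because nothing legitimate should ever touch the decoy, a plain substring match is enough here, and false positives should be rare.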
Apple Platforms Security Concept of the Week
Memory Integrity Enforcement: https://security.apple.com/blog/memory-integrity-enforcement/