readwrite

January 5, 2026

Edition 6 – Using RegEx to catch state-sponsored hackers

Hi,

Hakan here. This one is going to be straightforward. Both Jan and I use RegEx quite regularly, so this is a how-to: using RegEx to find out more about hackers allegedly operating out of the DPRK.

Quick sidebar: Pivoting

Since I'm covering cybersecurity as a beat, it helps me a lot to understand the industry slang and go-tos used by the people doing this stuff on a daily basis. It's just easier for me to find information that way. Pivot is one such term; it even has its own conference (supposed to be very good, I've never been).

Between Christmas and New Year, I read "The Art of Pivoting" (link to the GitHub repo). I really recommend reading the book if you want to understand how people in the infosec community use pivots to move from A to B (malware to domain, one finding to many findings, etc.). This edition is one example of how I do it.

If you have stories about RegEx and how those have helped you along the way, or if you have any kind of feedback, please do send it to: readwritenewsletter@proton.me

From Defcon 33 to DredSoftLabs

Some months ago, I watched a DEF CON talk on using regex as a hacker, and during that presentation I realized that you can search GitHub with regex. I didn't know that, and I was hyped!

And then, recently, I came across this blogpost by Mees van Wickeren. In it, he described a way to tie a company called DredSoftLabs to DPRK-related actors. More context on the DPRK-actors can be found at 38North, Google, and the State Department.

In the Medium blog, Mees shares this bit:

[Image: patterns.png]

Whenever I read "pattern", I translate it as "they're doing this more than once", which, sure enough, is exactly what the hackers did. Mees shared a search pattern for GitHub that at the time returned 77 code repositories (right now there are 84).

X-Secret-Key

I looked at those and noticed that some of them contained the data blob eC1zZWNyZXQta2V5 while others had eC1zZWNyZXQtaGVhZGVy. Same same, but different. This is exactly where regex is powerful. So I wrote a rule that matches on eC at the start, ZWN or ZWs somewhere in between, and V5 or Vy at the end, which at the time of writing results in 159 repositories.
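To make that rule concrete, here is a minimal Python sketch of a regex with the same shape. The exact expression used for the GitHub search isn't shown in this edition, so the pattern and the name BROAD_PATTERN are assumptions for illustration only.

```python
import re

# Assumed reconstruction of the rule described above:
# "eC" at the start, "ZWN" or "ZWs" somewhere in between, "V5" or "Vy" at the end.
# [A-Za-z0-9+/]* stands for "any run of base64 characters".
BROAD_PATTERN = re.compile(r"eC[A-Za-z0-9+/]*ZW[Ns][A-Za-z0-9+/]*V[5y]")

# The two blobs quoted in this edition.
BLOBS = ["eC1zZWNyZXQta2V5", "eC1zZWNyZXQtaGVhZGVy"]

for blob in BLOBS:
    # fullmatch() only succeeds if the whole string fits the pattern.
    print(blob, bool(BROAD_PATTERN.fullmatch(blob)))
```

GitHub's code search accepts a regex when it is wrapped in forward slashes, so a pattern like this one can be pasted into the search box in that form.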

I should have read the initial post by Mees more closely, because he already had noted that the data blobs were encoded strings. Apparently, sometimes I need things to be said to me twice. A different researcher pointed out to me that my regex was rather broad as the strings it was matching on were base64-encoded, one being x-secret-key and the other one x-secret-header.

About base64: For now, let’s just accept that it is a different way of representing text strings or binary data. Instead of raw 0’s and 1’s or plain readable letters, it uses an alphabet of 64 printable characters (A-Z, a-z, 0-9, + and /), with = as padding at the end. It is an encoding, not encryption, so anyone can reverse it. The Wikipedia article is quite good, yet nerdy.
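If you want to check this yourself, Python's standard library can do the decoding. A small sketch using the two blobs from above (the variable names are just for illustration):

```python
import base64

BLOBS = ["eC1zZWNyZXQta2V5", "eC1zZWNyZXQtaGVhZGVy"]

# Decode each blob back into readable text.
decoded = [base64.b64decode(blob).decode("ascii") for blob in BLOBS]
for blob, text in zip(BLOBS, decoded):
    print(blob, "->", text)

# And the other direction: encoding the plain string gives the blob again.
encoded = base64.b64encode(b"x-secret-key").decode("ascii")
print(encoded)
```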

As a pattern for this campaign, the regex I created doesn't help; it's too generic. There could be various reasons why a string like x-secret-header might be encoded in base64.

Don't click, pls

Short, but important interlude: If I defang a URL (google[.]com instead of google.com), this means that I do not know and haven't checked whether the site is a) still live and b) serving malware, so it's best not to visit it without the necessary precautions.
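Defanging is easy to automate. Here is a minimal sketch of one way to do it in Python; the helper name defang is hypothetical, and it only handles URLs that include a scheme such as https://.

```python
from urllib.parse import urlsplit

def defang(url: str) -> str:
    """Replace the dots in the host part with [.] so the URL is not clickable."""
    host = urlsplit(url).netloc  # empty if the URL has no scheme
    return url.replace(host, host.replace(".", "[.]"), 1)

print(defang("https://google.com/search?q=regex"))
```

Only the host is touched here, so dots in the path or query string stay as they are.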

After going through some of the repositories, the researcher suggested I take a closer look at the API_KEY bits. Usually, API keys are fairly random strings used for authentication, but in this case they were used to mask a URL.

Odd URLs

The supposed API key aHR0cHM6Ly9hcGktc2VydmVyLW1vY2hhLnZlcmNlbC5hcHAvYXBpL2lwY2hlY2stZW5jcnlwdGVkLzMxNA== is the base64-encoded representation of the URL https://api-server-mocha[.]vercel[.]app/api/ipcheck-encrypted/314 (Mees had already posted this, I know now!).

This is rather odd behaviour and, as such, a much more specific pattern for this campaign, which makes it far better suited for a regex.

After some more tweaking, I realized that the repositories had the value stored either as ACESS_KEY or API_KEY. So, I wrote a regex that matches on one of those two combined with HTTPS: 150 findings. Some of them are attacker-related, others are potential victims, and in between there are repositories warning about this scam operation. From a reporter's perspective, this is very nice, as I can reach out to people and hear their stories, maybe even alert them.
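A rough Python sketch of what such a rule could look like follows. The actual search expression isn't shown in this edition, so KEY_PATTERN is an assumption. It relies on the fact that every base64-encoded string starting with https:// begins with aHR0cHM6Ly. The sample lines are made up for illustration; only the two encoded values are taken from this newsletter.

```python
import base64
import re

# Assumed pattern: one of the two variable names, an assignment,
# then a value that starts with the base64 form of "https:/".
KEY_PATTERN = re.compile(
    r"""(ACESS_KEY|API_KEY)\s*[=:]\s*["']?(aHR0cHM6Ly[A-Za-z0-9+/]+={0,2})"""
)

# Hypothetical file content, e.g. from a .env or config file.
SAMPLE = '''
API_KEY="aHR0cHM6Ly9hcGkubnBvaW50LmlvL2U2YTZiZmI5N2EyOTQxMTU2Nzdk"
API_KEY="sk_test_not_a_url"
ACESS_KEY=aHR0cHM6Ly9yZXN0LWljb24tcHJvdmlkZXIubGluay9pY29ucy8=
'''

# Collect (variable name, decoded URL) for every hit.
hits = [
    (m.group(1), base64.b64decode(m.group(2)).decode("utf-8"))
    for m in KEY_PATTERN.finditer(SAMPLE)
]

for name, url in hits:
    # Defang before printing so nobody clicks by accident.
    print(name, url.replace(".", "[.]"))
```

The ordinary-looking key in the middle is skipped, which is the point: a real API key almost never decodes to a URL.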

More URLs

Best of all: You can use regex for further pivoting. Since our regex matches only on the HTTPS part of the URL, we will find more URLs than the one Mees highlighted. Here are three examples, all taken from the first page of the 150 findings:

aHR0cHM6Ly9hcGkubnBvaW50LmlvL2U2YTZiZmI5N2EyOTQxMTU2Nzdk
https://api[.]npoint[.]io/e6a6bfb97a294115677d

aHR0cHM6Ly9jbG91ZGFwaS51czIubW9ja29vbi5hcHAvdjIvdHJhY2tzL2Vycm9ycz9pZD1VamNHZ29oMVBN
https://cloudapi[.]us2[.]mockoon[.]app/v2/tracks/errors?id=UjcGgoh1PM

aHR0cHM6Ly9yZXN0LWljb24tcHJvdmlkZXIubGluay9pY29ucy8=
https://rest-icon-provider[.]link/icons/

(Mees is tracking those already, I've asked him.) With those URLs at hand, you can do some more searching and maybe find associated malware, phishing emails, more code, maybe even hints as to who the hackers are. I haven't done this so far, I just wanted to write this short piece you're reading now (thanks!).
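As a starting point for that kind of follow-up search, here is a small Python sketch that decodes the three blobs above and pulls out the hostnames, which are usually the most useful thing to pivot on. The variable names are illustrative, and the output is defanged on purpose.

```python
import base64
from urllib.parse import urlsplit

# The three encoded values listed above.
BLOBS = [
    "aHR0cHM6Ly9hcGkubnBvaW50LmlvL2U2YTZiZmI5N2EyOTQxMTU2Nzdk",
    "aHR0cHM6Ly9jbG91ZGFwaS51czIubW9ja29vbi5hcHAvdjIvdHJhY2tzL2Vycm9ycz9pZD1VamNHZ29oMVBN",
    "aHR0cHM6Ly9yZXN0LWljb24tcHJvdmlkZXIubGluay9pY29ucy8=",
]

# Decode each blob, parse it as a URL, and keep the unique hostnames.
hosts = sorted(
    {urlsplit(base64.b64decode(blob).decode("ascii")).hostname for blob in BLOBS}
)

for host in hosts:
    print(host.replace(".", "[.]"))
```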

Attribution

For the purpose of this newsletter, I didn't investigate the attribution bit, though I will say I was in touch with two other researchers with extensive knowledge on DPRK and both of them said that this tracks with what they've seen.

That's it for this week's edition! Hope you enjoyed it.
