Some common JWK Set gotchas
Welcome to the latest edition of Illuminated Security news. In this month’s newsletter, we’re going to take a look at JWK Sets, an apparently simple specification for publishing public keys that has some surprisingly subtle gotchas.
Before we get into that, an update on the company. Illuminated Security now has a website with some snazzy branding from the design geniuses at Level. Check out the logotype and brand mark below. I’m busy preparing our first offering, the definitive guide to JSON Web Tokens (and alternatives), which will launch soon. In the meantime, I am also now looking to take on new consulting/advisory gigs, so if you’re looking for an OAuth/OIDC/JWT/AppSec/cryptography expert then give me a shout!
JWK Sets
OK, back to the main business of this newsletter: JWK Sets. A JWK is a JSON Web Key, a standard for describing cryptographic keys in JSON format. For example, an RSA public key encoded as a JWK looks something like this (eliding the modulus for space reasons):
{
  "kty": "RSA",
  "kid": "rsa_key_1",
  "use": "sig",
  "e": "AQAB",
  "n": "p_nWp4WiiphxyM0VBeJSWKh173-NTQ..."
}
This has some fields that describe the type of key (kty, which is RSA in this case) and the parameters of the key (e, the public exponent, and n, the modulus, for an RSA public key). It also has some other fields that give an identifier for the key (kid) and describe constraints on its use: in this case a use value of sig means that the key is intended for verifying digital signatures, and not for encryption or any other use. We'll come back to those later.
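As a small illustration of how these parameters are encoded, the following sketch decodes the public exponent from the example above. RSA parameters in a JWK are base64url-encoded big-endian integers with the padding stripped. The helper name b64url_to_int is made up for this example, and the modulus is left out because it is elided in the example JWK.

```python
import base64

def b64url_to_int(value: str) -> int:
    """Decode an unpadded base64url string into a big-endian unsigned integer."""
    padding = "=" * (-len(value) % 4)  # restore the padding that JWK encoding strips
    return int.from_bytes(base64.urlsafe_b64decode(value + padding), "big")

# The example JWK from above, without the elided modulus.
jwk = {"kty": "RSA", "kid": "rsa_key_1", "use": "sig", "e": "AQAB"}

public_exponent = b64url_to_int(jwk["e"])
print(public_exponent)  # prints 65537
```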
A JWK Set is then simply a collection of JWKs, also represented as a JSON object. This object has a "keys" field that contains an array of JWKs:
{
  "keys": [
    { "kty": "RSA", ... },
    { "kty": "EC", ... },
    ...
  ]
}
A common pattern in OpenID Connect and other protocols is for a server to publish a JWK Set at some well-known URL. When a relying party needs to verify a JSON Web Token (JWT) signed by that server, it retrieves the JWK Set (over HTTPS, but more on this later) from the URL and uses the keys to verify the signature. This seems straightforward, but in practice there are a lot of subtle details that developers are often not aware of.
Gotcha #1: Key IDs are not necessarily unique
The presence of a Key ID (kid) field in the JWK may lull you into the first trap for the unsuspecting developer: assuming that such IDs are unique. Unfortunately, uniqueness is not guaranteed by the spec and JWK Sets with duplicate Key IDs are common in practice. This often surfaces as an assumption in developer APIs that look like the following:
def lookup_jwk_by_id(jwks: JWKSet, kid: str) -> JWK: ...
The intended use here is that when the app receives a JWT, it retrieves the JWK Set and then finds the right key by looking up the kid specified in the JWT header, as described in the JWS spec (RFC 7515):
When used with a JWK, the "kid" [header] value is used to match a JWK "kid" parameter value.
Unfortunately, two problems can occur with this approach:
- There may be more than one JWK in the JWK Set that matches the given kid.
- There may be no JWK that matches the given kid, either because of a bug (resulting in mismatched IDs) or because the key has been revoked or retired for some reason.
The correct approach is to have the lookup_jwk_by_id method return a list of candidate keys and then try each of them in turn. If the JWT signature verifies with any of those keys then it is valid, otherwise it should be rejected. This naturally also handles the case that none of the keys match: simply return an empty list (or ignore the Key ID and return all the keys in the set, which can be a more robust strategy).
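A minimal sketch of that shape might look like the following. The names candidate_keys and verify_with_any and the fallback_to_all flag are invented for illustration. The verify callback stands in for whatever signature check your JOSE library provides, so no real cryptography happens here.

```python
from typing import Callable, Iterable, List, Optional

def candidate_keys(jwks: dict, kid: Optional[str], fallback_to_all: bool = False) -> List[dict]:
    """Return every JWK whose kid matches, which may be zero, one, or several keys."""
    keys = list(jwks.get("keys", []))
    if kid is None:
        return keys  # the JWT header named no kid, so every key is a candidate
    matches = [key for key in keys if key.get("kid") == kid]
    if not matches and fallback_to_all:
        return keys  # the more robust strategy: ignore the kid and try everything
    return matches

def verify_with_any(candidates: Iterable[dict], verify: Callable[[dict], bool]) -> bool:
    """Accept the JWT only if at least one candidate key verifies its signature."""
    return any(verify(key) for key in candidates)

# Hypothetical JWK Set with a duplicated kid; "n" is a placeholder, not a real modulus.
example = {"keys": [{"kid": "a", "n": 1}, {"kid": "a", "n": 2}, {"kid": "b", "n": 3}]}
print(len(candidate_keys(example, "a")))  # prints 2
```

An empty candidate list makes verify_with_any return False, so a JWT with an unknown kid is rejected without any special casing.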
A common worry among developers who aren't experts in cryptography is that trying multiple keys somehow reduces the security of the signature scheme. Don't worry, it doesn't. What does matter is making sure you use the correct algorithm with the correct key, and that you don't use keys that have been revoked or retired, which we'll discuss more in the other gotchas.
Gotcha #2: Not all keys in a JWK Set are for the same functionality
Just because a JWK is present in a JWK Set doesn't mean it's actually intended to be used for the purpose you are using it for. For example, the JWK Set for an OIDC provider will contain keys that you can use for verifying ID Token signatures, but it may also contain keys that are used for encrypting request objects, for signing OAuth Access Tokens, and for a bunch of other use-cases. You should not use keys intended for one function for something else, as this may open your application up to cross-protocol attacks. A cross-protocol attack occurs when a JWT (or other credential) signed for one purpose ends up being valid for a completely different purpose, resulting in a security breach.
The best way to protect against cross-protocol attacks is to publish completely separate JWK Sets for each function of your application—one for ID token signing keys, one for access token signing keys, another for request object encryption keys, and so on. Unfortunately, this has not been common practice and the specifications themselves tend to bundle everything into a single JWK Set URL, resulting in a jumble of keys for different purposes all mixed up together.
It is then crucial that you pay attention to other constraints specified on each JWK to ensure that you only use them for their intended purpose:
- The kty field should be checked to ensure it is consistent with the JWT signature or encryption algorithm to avoid algorithm confusion attacks.
- The use field indicates the general type of usage, either sig (signature) or enc (encryption). This is the least specific constraint, but is widely used.
- A key_ops field, if present, lists the intended uses of the key: sign, verify, encrypt, and so on. In practice, this boils down to either verify or encrypt for a JWK public key (and you had better not be publishing private keys in a public JWK Set!), so it is equivalent to the use field in most cases.
- Ideally each JWK should have an alg field that specifies the exact JOSE encryption or signature algorithm, such as RS256 or ES256. This is a much more specific indication of usage, but it's still likely that you use the same algorithm for different types of objects (access vs ID tokens, for example) so it doesn't completely eliminate cross-protocol attacks.
- If the JWK has a certificate associated with it (via an x5c field), then you can also check the key usage, extended key usage, and other constraints to ensure they are also consistent.
In general, you should view the process of selecting a key from a JWK Set as a series of filters. You start with the full JWK Set and then filter out irrelevant keys as you check each known constraint: eliminating keys with a different kid, then eliminating those intended for a different use, with incompatible key_ops, and so on. At the end of the process you will be left with either a smaller set of keys to try or an empty list (in which case you should reject the JWT as invalid).
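The filtering process can be sketched roughly as follows for the signature verification case. The function name and the prefix-to-kty mapping are assumptions made for this example. The sketch treats a missing optional field as "no constraint", which is a policy choice you may want to tighten, and it does not inspect x5c certificates at all.

```python
from typing import List, Optional

# Assumed mapping from the first two letters of a JOSE signature alg to the required kty.
KTY_FOR_ALG_PREFIX = {"RS": "RSA", "PS": "RSA", "ES": "EC", "Ed": "OKP"}

def select_verification_keys(jwks: dict, alg: str, kid: Optional[str] = None) -> List[dict]:
    """Filter a JWK Set down to the keys that may verify a JWT signed with alg."""
    expected_kty = KTY_FOR_ALG_PREFIX.get(alg[:2])
    if expected_kty is None:
        return []  # unknown or unsupported algorithm: nothing to try

    def acceptable(key: dict) -> bool:
        if key.get("kty") != expected_kty:
            return False  # key type is inconsistent with the algorithm
        if kid is not None and key.get("kid", kid) != kid:
            return False  # key carries a different kid
        if key.get("use", "sig") != "sig":
            return False  # key is intended for some other use
        if "verify" not in key.get("key_ops", ["verify"]):
            return False  # key_ops is present and does not allow verification
        if key.get("alg", alg) != alg:
            return False  # key is pinned to a different algorithm
        return True

    return [key for key in jwks.get("keys", []) if acceptable(key)]
```

An empty result means the JWT should be rejected. Anything left over is a list of candidates to try in turn.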
Unfortunately, none of these methods are completely foolproof and you may still end up trying to verify a JWT with a key intended for some other purpose. This can open up your application to cross-protocol attacks as we've discussed. The recently published JWT Best Current Practice document recommends using explicit typing to mitigate this threat (along with other recommendations, such as checking the aud (audience) claim). Explicit typing makes use of the JWT typ (type) header to distinguish JWTs intended for different purposes. For example, RFC 9068 defines the media type application/at+jwt to identify a JWT intended for use as an OAuth2 access token. A compliant implementation should reject JWTs that have a different type header, preventing other types of tokens from being accidentally accepted.
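A rough sketch of an explicit typing check is shown below. The function names are invented for this example. It assumes a compact-serialised JWT and treats the short form at+jwt and the full form application/at+jwt as equivalent, compared case-insensitively. The header is unauthenticated at this point, so this check is an addition to signature verification and not a replacement for it.

```python
import base64
import json

def jwt_header(token: str) -> dict:
    """Decode the (still unverified) header of a compact-serialised JWT."""
    header_b64 = token.split(".", 1)[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def has_expected_type(token: str, expected: str = "at+jwt") -> bool:
    """Return True only if the typ header names the expected media type."""
    typ = jwt_header(token).get("typ")
    if not isinstance(typ, str):
        return False  # missing or malformed typ: reject
    normalized = typ.lower()
    if normalized.startswith("application/"):
        normalized = normalized[len("application/"):]
    return normalized == expected
```

A malformed token raises an exception from the decoding step in this sketch. Real code should catch that and reject the token.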
Gotcha #3: Be prepared to handle unknown key types and constraints
One non-standard way that some applications have chosen to address the cross-JWT confusion issue is to invent their own values for the use or other fields. For example, the Open Banking specifications added a non-standard "use": "tls" value.
New specifications can also add new values to apparently fixed fields, such as RFC 8037, which added a new OKP key type and new values for the crv (Elliptic Curve) field. This can cause a problem if you have used an enum or other hard-coded datatype to represent these fields internally (ask me how I know).
Use strings to represent extensible field values and be prepared to handle unknown values.
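One way to do this is sketched below. The names and the sets of supported values are assumptions for illustration and should reflect whatever your own code can actually handle. The point is that an unrecognised kty or use causes that one key to be skipped, not the whole JWK Set to fail parsing.

```python
from typing import List, Tuple

SUPPORTED_KEY_TYPES = {"RSA", "EC", "OKP"}  # assumption: what this client implements
KNOWN_USES = {"sig", "enc"}                 # the two standard use values

def partition_keys(jwks: dict) -> Tuple[List[dict], List[dict]]:
    """Split a JWK Set into (usable, skipped) without raising on unknown values."""
    usable, skipped = [], []
    for key in jwks.get("keys", []):
        kty = key.get("kty")
        use = key.get("use")
        kty_ok = isinstance(kty, str) and kty in SUPPORTED_KEY_TYPES
        use_ok = use is None or (isinstance(use, str) and use in KNOWN_USES)
        if kty_ok and use_ok:
            usable.append(key)
        else:
            skipped.append(key)  # unknown value: ignore this key, keep processing the rest
    return usable, skipped
```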
Gotcha #4: TLS does not guarantee security
Perhaps the biggest gotcha on this list is to assume that TLS is infallible. Just because the JWK Set was fetched over TLS (HTTPS) doesn't mean that it is trustworthy. This usually goes along with statements such as "well, if TLS is insecure then I have bigger problems". But actually, in the case of protocols like OpenID Connect or SAML you probably don't have bigger problems. As Ryan Sleevi points out in his analysis of the FastFed standard, TLS connections are not always secure: certificates get mis-issued, private keys do sometimes get compromised, TLS connections are insecurely configured, and so on. Compounding this is the fact that certificate revocation on the Web is still somewhat of a work in progress, especially when we consider non-browser clients.
The upshot of this is that you need to be really careful when fetching public keys from a JWK Set URI, especially if you are going to cache those keys for any length of time. Otherwise, a temporary certificate compromise event can turn into a long-term key compromise event if an attacker is able to inject their own keys into the JWK Set.
Sleevi provides a very comprehensive analysis of the potential problems and mitigations and I strongly recommend anyone responsible for a production federation deployment reads his document in detail. I'll summarise what I consider to be the most pressing points here (with a few suggestions of my own), but this really isn't a substitute for reading the document in full:
- Avoid caching a JWK Set indefinitely or for excessively long periods of time. Respect Cache-Control headers on the response, but don't blindly trust them. Consider setting a maximum cache time of a few minutes or so. How long are you prepared to continue trusting a compromised key? What is the potential business impact of such a compromise? (If you re-fetch the JWK Set whenever a missing kid is found, remember to also rate-limit these connections so you don't overwhelm the server if the kid is misconfigured.)
- On a similar note, ensure that "permanent" HTTP redirect status codes on this endpoint are treated as temporary. Otherwise an attacker can redirect a hijacked connection once and then serve you bogus keys from their own server forever. Consider disabling redirect handling entirely when connecting to these endpoints.
- If this is a private (closed) deployment, then consider using a private CA and configuring clients to trust only that private PKI's root certificate(s). Don't include the entire set of WebPKI public CA certificates in your truststore unless you're actually using WebPKI.
- Ensure your servers and clients are configured to use modern TLS settings and are performing appropriate checks, e.g. make sure that certificate revocation checking is enabled if feasible in your environment. (I go into some detail on secure client HTTPS configuration in chapter 7 of API Security in Action).
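The caching advice in the first point could be sketched along these lines. The class name, the 300-second ceiling, and the 30-second refresh limit are all assumptions for illustration, not recommendations from Sleevi's document. The fetch callable is hypothetical: it is expected to perform the HTTPS request (ideally with redirects disabled) and return the parsed JWK Set together with the Cache-Control header value.

```python
import re
import time
from typing import Callable, Optional, Tuple

MAX_TTL_SECONDS = 300      # assumed ceiling: "a few minutes or so"
MIN_REFETCH_INTERVAL = 30  # assumed rate limit for refreshes forced by a missing kid

def effective_ttl(cache_control: Optional[str], max_ttl: int = MAX_TTL_SECONDS) -> int:
    """Respect max-age from Cache-Control, but never exceed our own ceiling."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    if match is None:
        return max_ttl
    return min(int(match.group(1)), max_ttl)

class JwkSetCache:
    def __init__(self, fetch: Callable[[], Tuple[dict, Optional[str]]],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch  # hypothetical: returns (jwks, cache_control_header)
        self._clock = clock
        self._jwks: Optional[dict] = None
        self._expires_at = 0.0
        self._last_fetch: Optional[float] = None

    def get(self, force_refresh: bool = False) -> dict:
        """Return the cached JWK Set, re-fetching when expired or (rate-limited) on demand."""
        now = self._clock()
        expired = self._jwks is None or now >= self._expires_at
        recently_fetched = (self._last_fetch is not None
                            and now - self._last_fetch < MIN_REFETCH_INTERVAL)
        if expired or (force_refresh and not recently_fetched):
            jwks, cache_control = self._fetch()
            self._jwks = jwks
            self._last_fetch = now
            self._expires_at = now + effective_ttl(cache_control)
        return self._jwks
```

A caller that encounters an unknown kid would call get(force_refresh=True). The rate limit means a misconfigured kid results in at most one extra fetch per interval.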
Summary
In short, validating and processing JWK Sets is (as always with JOSE) a lot more complex than it first appears. These gotchas only really scratch the surface. If you really want to understand JWTs, JOSE, JWK (not to mention alternatives like PASETO, Macaroons, and Biscuit), you'll love our JSON Web Tokens Illuminated course that will be launching in May. Subscribe to this newsletter to be the first to hear when it goes live. In the meantime, I am available for consulting work to help you engineer your deployments for security: see the website for details.