The Problem

Authentication and authorization in a microservice environment can be tricky. There are a lot of moving parts, and care must be taken when it comes to deciding what solution you want to deploy - are there suitable open-source options, or would it be better to roll your own?

Recently, I’ve been presented with a challenge that I had to overcome with a strict deadline, and I wanted to share the solution I came up with.

Here’s a description of the current architecture in use:

  • There are multiple publicly available REST APIs.
  • These APIs are accessed through a proprietary public API gateway (let’s refer to this as gateway-1)
  • Consumers of said APIs authenticate via a special token (arbitrary string) when sending the initial request
  • If authentication within the gateway succeeds, the request is then forwarded to the cluster with a client ID and client secret
  • The Ingress routes the request to an internal Kong Gateway
  • This resource authorizes the request via OAuth (client credentials flow)
  • If authorization succeeds, the request is finally routed to the application.
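
For context, a token request in Kong's OAuth 2.0 client credentials flow typically looks something like this (the endpoint path and parameter names depend on how the plugin is configured, so treat it as a sketch):

curl -X POST https://<host>/oauth2/token \
  -d grant_type=client_credentials \
  -d client_id=<client-id> \
  -d client_secret=<client-secret>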

And here are the new requirements:

  • The authorization will be completely remodelled using a second proprietary public API gateway (we call it gateway-2)
  • All public REST APIs are registered in said gateway with an HMAC 256 secret that the owner of the API has to define
  • Consumers still authenticate via a special token (arbitrary string) when sending the initial request
  • The host of the APIs must not change
  • If authentication succeeds, the request is forwarded to the cluster with a JWT
  • Said JWT is signed with the HMAC 256 secret that was configured by the owner of the API
  • The JWT signature and issuer must be verified within the Kubernetes cluster
  • We must not make any changes to the REST APIs to do this
  • If the JWT is valid, the request is finally routed to the application
  • And to top it all off, both setups must work simultaneously to give consumers enough time to migrate to the new gateway

Fun!
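
In concrete terms, the token handling boils down to something like this - a sketch using PyJWT for illustration; the secret and the claim values are placeholders based on the requirements above:

import jwt  # PyJWT

secret = "<hmac-256-secret>"  # the HMAC 256 secret the API owner configures in gateway-2

# What gateway-2 does after successful authentication: mint an HS256-signed JWT
token = jwt.encode({"iss": "noreply@issuer.com", "sub": "gateway-2"}, secret, algorithm="HS256")

# What the cluster has to do: verify the signature and the issuer,
# without any changes to the REST API itself
claims = jwt.decode(token, secret, algorithms=["HS256"], issuer="noreply@issuer.com")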

Visualising the setup

Okay, let’s start from the beginning. I’m a visual person, so the first thing I did was draw the current solution and then figure out how I could integrate something that solves the new requirements:

[Diagram: consumer → sends request → gateway-1 (authentication) → request w/ client credentials → nginx ingress → Kubernetes cluster: Kong gateway (OAuth flow) → application]

Looks more manageable to me. Pretty standard stuff, nothing too fancy. But how are we going to fit in the new requirements? This was my second draft after I put in all the constant bits that I could not change.

[Diagram: consumer → old gateway-1 / new gateway-2 → same host → nginx ingress → Kubernetes cluster: Kong gateway (still required) → application]

Okay, fair enough. We now have two API gateways, and they both route to the same host on which our Ingress controller sets up the Ingress resources. The Kong components were still required of course, since we still need the old setup. The application will also not be changed. But how are we supposed to validate the JWT coming from the new API gateway?

Istio

Istio is an open-source service mesh that can be layered onto existing distributed applications. It has a ton of features that can help you with traffic behaviour, service-to-service communication, monitoring, authentication and authorisation. The last two bits are interesting to us right now, so let’s check it out.

Istio uses the concept of policies, which define what you can and cannot do when it comes to accessing other services or resources within the cluster. Our requirements for that are quite simple:

We only want to allow requests that carry a JWT on a specific HTTP header; the JWT must be signed using our HMAC 256 secret, and its issuer claim has to match our expected issuer.

Okay, let’s model that into a so-called AuthorizationPolicy:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: foo
spec:
  selector:
    matchLabels:
      app: rest-api
  action: ALLOW
  rules:
  - when:
    - key: request.auth.principal
      values: ["noreply@issuer.com/gateway-2"]

What does this do?

  • The selector ensures that the policy applies to all applications carrying this label
  • We ALLOW the requests if the token was issued by noreply@issuer.com and the sub claim of the JWT is gateway-2

request.auth.principal is an attribute calculated by Istio itself. This value contains a string in the following format: "<Issuer>/<Subject>". A full list of all available attributes can be found in the Istio documentation.
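
For example, a token payload like this (illustrative values matching our setup) would yield the principal noreply@issuer.com/gateway-2:

{
    "iss": "noreply@issuer.com",
    "sub": "gateway-2"
}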

We’ve now told the cluster which requests it should let through, but we haven’t told it how to validate those requests yet. This is where the RequestAuthentication resource comes into play:

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: require-jwt
  namespace: foo
spec:
  selector:
    matchLabels:
      app: rest-api
  jwtRules:
  - issuer: "noreply@issuer.com"
    jwksUri: ???

Damn, we’re using a symmetric key to sign the JWT, and that means that we can’t make the key publicly available via some kind of JWKS endpoint! Luckily for us, the RequestAuthentication resource can be configured to use an embedded JWKS. So let’s build one:

{
    "keys": [
        {
            "kty": "oct",
            "k": "<base64-encoded-secret>",
            "alg": "HS256"
        }
    ]      
}

This took me a while to figure out. I recommend reading through the JSON Web Key RFC (RFC 7517), since that helped me figure out how the JWKS needed to be put together.
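
One detail worth noting: per the JWK spec, the k value is the base64url encoding (without padding) of the raw secret, not plain base64. A quick way to produce it, assuming the secret is an ASCII string:

python3 -c 'import base64; print(base64.urlsafe_b64encode(b"<raw-secret>").rstrip(b"=").decode())'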

So finally we end up with this:

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: require-jwt
  namespace: foo
spec:
  selector:
    matchLabels:
      app: rest-api
  jwtRules:
  - issuer: "noreply@issuer.com"
    jwks: '{"keys":[{"kty":"oct","k":"<base64-encoded-secret>","alg":"HS256"}]}'
    fromHeaders:
    - name: secret-header
      prefix: "Bearer "

To finalize our setup, we add a route to our Ingress so that requests coming from gateway-2 are sent to the Service directly. Since we’re required to run on the same host, I chose to add a prefix to the URI - /istio/*.
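
A minimal sketch of what that route could look like - the host, Service name and port are placeholders, and depending on the Ingress controller you may also need a rewrite rule to strip the /istio prefix before the request reaches the application:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rest-api-istio
  namespace: foo
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com     # must stay the same host as before
    http:
      paths:
      - path: /istio
        pathType: Prefix
        backend:
          service:
            name: rest-api    # hypothetical Service name
            port:
              number: 80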

Awesome, now we should have everything required to fulfil the requirements, right?

Computer says no

After deploying these changes to DEV, I immediately saw that the tests for the old gateway were failing. I was receiving 403: RBAC: access denied on each request. Wait a minute.

When deploying Istio in a Kubernetes cluster, two things happen:

  • The Istio operator is deployed
  • You tell Istio, via labels on a namespace, where you want to use it
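
Enabling sidecar injection for a namespace is typically a single label (assuming the standard injection label):

kubectl label namespace foo istio-injection=enabled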

If you restart the applications in these target namespaces, you’ll see that each pod starts up a sidecar container - that’s the Envoy proxy. This proxy is responsible for enforcing our policies, and all requests that arrive at the Service of the application will be routed through it. And that means that all requests coming in through gateway-1 are denied, because they don’t carry the JWT that we specifically require with our AuthorizationPolicy!

So how can we fix this?

mTLS & policy refinement

After scouring the documentation and reading about how Istio works, I was able to come up with a pretty simple solution. The initial plan was to only use the necessary parts required to get the job done, but this limited my ability to use more granular policies. Istio automatically issues certificates with istiod (the control plane) and does all the heavy lifting, including key and certificate rotation. You can read about how that works in detail in the Istio security documentation.

We can use the identities in those certificates (the documentation calls them principals) to specifically target particular services in our policies! This in turn means that we should be able to allow requests coming in from gateway-2 only if they carry a valid JWT, while requests coming in from gateway-1 are let through without this requirement.

The first thing we need to do to set this up is to inject the sidecars into our nginx Ingress controller, and then give our Kong gateway the same treatment. The injection happens when you set the label sidecar.istio.io/inject: 'true' on the pods and restart them.
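
On a Deployment, that label goes into the pod template; here’s the relevant fragment (the Deployment name is a placeholder, and the rest of the spec is omitted):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress        # hypothetical name of the Ingress controller Deployment
  namespace: foo
spec:
  template:
    metadata:
      labels:
        sidecar.istio.io/inject: 'true'   # Istio injects the Envoy sidecar on the next rollout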

The nginx documentation even has its own page for setting it up with Istio - https://docs.nginx.com/nginx-ingress-controller/tutorials/nginx-ingress-istio/
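
Once the pods have restarted, you can confirm the injection worked by listing a pod’s containers - an istio-proxy container should appear alongside the application (the pod name is a placeholder):

kubectl get pod <pod-name> -n foo -o jsonpath='{.spec.containers[*].name}'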

With the sidecars in place, you’re now able to utilize the principals in your policies. Let’s update our AuthorizationPolicy to make use of them:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: foo
spec:
  selector:
    matchLabels:
      app: rest-api
  action: ALLOW
  rules:
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/foo/sa/nginx"]
    when:
    - key: request.auth.principal
      values: ["noreply@issuer.com/gateway-2"]
  - from:
    - source:
        principals: ["cluster.local/ns/foo/sa/kong"]

What does this mean?

  • We specify two rules: one for our nginx Ingress and one for the Kong gateway
  • Requests coming in from nginx require a JWT just like before
  • Requests coming in from Kong are allowed without any further checks

A note on principals: they are put together using the following format: <TRUST_DOMAIN>/ns/<NAMESPACE>/sa/<SERVICE_ACCOUNT>. The default trust domain is cluster.local, unless you specify a custom one during the installation. Naturally, you also need to reference the correct ServiceAccount in these policies, NOT the Service name!
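
If you’re unsure which ServiceAccount a workload runs under, you can look it up directly on the pod (the pod name is a placeholder):

kubectl get pod <pod-name> -n foo -o jsonpath='{.spec.serviceAccountName}'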

After the updated AuthorizationPolicy was deployed, all tests were green. Mission success!

Final words

This was a fun problem to solve. Thankfully, there are so many awesome open-source projects out there that enable us to build these complicated systems. Is this an optimal setup? No, not at all. But it fulfils the requirements, it works, it’s reasonably fast, and it comes with the added benefit of having service-to-service communication set up using mTLS.

This is the final architecture that we set up:

[Diagram: consumer → old gateway-1 / new gateway-2 → same host → nginx ingress w/ Envoy sidecar → Kubernetes cluster: /istio/* route, Kong gateway w/ Envoy sidecar, JWT check → application]
