Part 2: Kubernetes AWS Resource Access: KIAM

In the previous post, we learned to deploy kube2iam to manage access between Kubernetes workloads and other AWS resources. In this tutorial, we will solve the same problem with a more advanced tool, KIAM.

Kiam came into existence to solve two major problems in kube2iam:

  1. Data races under load: When application load spikes sharply and a large number of pods spin up in the cluster, kube2iam sometimes returns incorrect credentials to those pods.
  2. Prefetching credentials to reduce start latency and improve reliability: Credentials for the IAM role specified in the pod spec should already be available before the container process boots in the pod.

Other things which got added were:

  • Structured logging, which makes it easier to integrate with your ELK setup using fields such as pod names, roles and access key IDs.
  • Metrics that track response times, cache hit rates, etc. These can be scraped by Prometheus and visualized in Grafana.

Overall Architecture


Kiam is based on an agent-server architecture.

Kiam Agent
This process is typically deployed as a DaemonSet to ensure that pods have no direct access to the AWS Metadata API. Instead, the agent runs an HTTP proxy that intercepts credential requests and passes everything else through.

Kiam Server
This process is responsible for connecting to the Kubernetes API Servers to watch Pods and communicating with AWS STS to request credentials. It also maintains a cache of credentials for roles currently in use by running pods, ensuring that credentials are refreshed every few minutes and stored in advance of Pods needing them.

Implementation

Similar to kube2iam, for a pod to get credentials for an IAM role, that role must be specified as an annotation in the deployment manifest. Additionally, you need to specify which IAM roles may be used inside a particular namespace, again using annotations. This enhances security and gives you fine-grained control over IAM roles.

Creating and Attaching IAM roles

  1. Create an IAM role named kiam-server with appropriate access to AWS resources.

  2. Enable a trust relationship between the kiam-server role and the role attached to the Kubernetes master nodes. Also make sure that the role attached to the Kubernetes worker nodes has very limited permissions: all API calls and access requests are made by containers running on the nodes, and those containers receive their credentials through Kiam, so the worker node IAM role does not need access to many AWS resources. To set up the trust relationship:

    a. Go to the newly created role in the AWS console and select the Trust relationships tab
    b. Click on Edit trust relationship
    c. Add the following content to the policy:


{
  "Sid": "",
  "Effect": "Allow",
  "Principal": {
    "AWS": "<ARN_KUBERNETES_MASTER_IAM_ROLE>"
  },
  "Action": "sts:AssumeRole"
}

  3. Add an inline policy to the kiam-server role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "*"
    }
  ]
}
  4. Create the IAM role (let's call it my-role) with appropriate access to AWS resources.
  5. Enable a trust relationship between the newly created role and the kiam-server role. To do so:
    a. Go to the newly created role in the AWS console and select the Trust relationships tab
    b. Click on Edit trust relationship
    c. Add the following content to the policy:
{
  "Sid": "",
  "Effect": "Allow",
  "Principal": {
    "AWS": "<ARN_KIAM-SERVER_IAM_ROLE>"
  },
  "Action": "sts:AssumeRole"
}

  6. Enable Assume Role for the master pool IAM roles. Add the following content as an inline policy to the master IAM roles:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "<ARN_KIAM-SERVER_IAM_ROLE>"
    }
  ]
}
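
The trust-relationship fragments above are single statements, while IAM expects a complete policy document. As a minimal sketch, the two helper functions below (hypothetical names, not part of Kiam or the AWS CLI) render full documents for a given ARN. The commented commands show how the output might be handed to the AWS CLI, assuming it is installed and configured; the ARNs and the inline policy name are placeholders.

```shell
#!/bin/sh
# Render a complete trust policy that lets the given principal assume a role.
# $1: ARN of the role allowed to call sts:AssumeRole (placeholder in the examples).
render_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "$1" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Render an inline policy that allows sts:AssumeRole on the given resource.
# $1: a role ARN, or "*" for any role.
render_assume_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["sts:AssumeRole"], "Resource": "$1" }
  ]
}
EOF
}

# Example usage (not run here; ARNs and the policy name are placeholders):
#   aws iam create-role --role-name kiam-server \
#     --assume-role-policy-document "$(render_trust_policy "<ARN_KUBERNETES_MASTER_IAM_ROLE>")"
#   aws iam put-role-policy --role-name kiam-server --policy-name kiam-assume-any \
#     --policy-document "$(render_assume_policy "*")"
```

Rendering the documents from one place keeps the master, kiam-server and application role policies consistent, which matters because a typo in any ARN silently breaks the assume-role chain.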

All communication between the Kiam agent and server is TLS encrypted, which enhances security. To set this up, we first need to deploy cert-manager in our Kubernetes cluster and generate certificates for the agent-server communication.

Deploying Cert Manager and Generating Certificates


Deploy Cert-Manager

  1. Install the CustomResourceDefinition resources separately
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.8/deploy/manifests/00-crds.yaml
  2. Create the namespace for cert-manager
kubectl create namespace cert-manager
  3. Label the cert-manager namespace to disable resource validation
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
  4. Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
  5. Update your local Helm chart repository cache
helm repo update
  6. Install the cert-manager Helm chart
helm install --name cert-manager --namespace cert-manager --version v0.8.0 jetstack/cert-manager

Generate CA private key and self-signed certificate for Kiam agent-server TLS

  1. Generate the CA key (ca.key) and certificate (ca.crt)
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kiam" -days 3650 -reqexts v3_req -extensions v3_ca -out ca.crt
  2. Save the CA key pair as a secret in Kubernetes
kubectl create secret tls kiam-ca-key-pair \
  --cert=ca.crt \
  --key=ca.key \
  --namespace=cert-manager

  a. Create the Kiam namespace using the manifest below

apiVersion: v1
kind: Namespace
metadata:
  name: kiam
  annotations:
    iam.amazonaws.com/permitted: ".*"
---

  b. Deploy the ClusterIssuer and Certificate resources to issue the certificates

apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  # ClusterIssuer is a cluster-scoped resource, so no namespace is set here
  name: kiam-ca-issuer
spec:
  ca:
    secretName: kiam-ca-key-pair
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: kiam-agent
  namespace: kiam
spec:
  secretName: kiam-agent-tls
  issuerRef:
    name: kiam-ca-issuer
    kind: ClusterIssuer
  commonName: kiam
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: kiam-server
  namespace: kiam
spec:
  secretName: kiam-server-tls
  issuerRef:
    name: kiam-ca-issuer
    kind: ClusterIssuer
  commonName: kiam
  dnsNames:
  - kiam-server
  - kiam-server:443
  - localhost
  - localhost:443
  - localhost:9610
---

c. Test if certificates are issued correctly

kubectl -n kiam get secret kiam-agent-tls -o yaml
kubectl -n kiam get secret kiam-server-tls -o yaml
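
The YAML output only shows base64-encoded data, so it confirms that the secrets exist but not what the certificates contain. A minimal sketch for looking inside them, assuming openssl is installed locally: the hypothetical helper below reads a PEM certificate from stdin and prints its subject, issuer and validity window.

```shell
#!/bin/sh
# Print the subject, issuer and validity dates of a PEM certificate read from stdin.
inspect_cert() {
  openssl x509 -noout -subject -issuer -dates
}

# Example usage against the secrets created by cert-manager (requires cluster
# access; `base64 -d` is the GNU spelling and may differ on other platforms):
#   kubectl -n kiam get secret kiam-server-tls -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | inspect_cert
#   kubectl -n kiam get secret kiam-agent-tls -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | inspect_cert
```

If the issuer does not show the CA created earlier (CN=kiam), the ClusterIssuer is probably not picking up the kiam-ca-key-pair secret.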

Annotating Resources

  1. Add the IAM role's name to the Deployment's pod template as an annotation
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mydeployment
  namespace: default
spec:
...
  minReadySeconds: 5
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: my-role
    spec:
      containers:
...
  2. Add the permitted-roles annotation to the namespace in which the pods will run
apiVersion: v1
kind: Namespace
metadata:
  name: default
  annotations:
    iam.amazonaws.com/permitted: ".*"

By default, no roles are permitted. You can use a regex, as shown above, to allow all roles, or restrict a namespace to a particular role.
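
To get a feel for how a permitted pattern narrows the set of roles, here is a small sketch that checks role names against a pattern with grep -E. The helper name is hypothetical, and wrapping the pattern in ^( )$ is an assumption of this sketch rather than a statement about how Kiam evaluates the annotation; grep's regex dialect also differs slightly from Go's, so treat it as a quick sanity check only.

```shell
#!/bin/sh
# Return 0 if ROLE matches the namespace's permitted PATTERN, 1 otherwise.
# The pattern is wrapped in ^( )$ so the whole role name has to match;
# this anchoring is an assumption of the sketch, not Kiam's documented behavior.
role_permitted() {
  pattern="$1"
  role="$2"
  printf '%s\n' "$role" | grep -Eq "^(${pattern})\$"
}

# A catch-all pattern, a prefix pattern, and a role outside the prefix.
role_permitted '.*' 'my-role' && echo "'.*' permits my-role"
role_permitted 'my-.*' 'my-role' && echo "'my-.*' permits my-role"
role_permitted 'my-.*' 'admin' || echo "'my-.*' rejects admin"
```

In practice, writing the annotation as an explicitly anchored pattern such as "^my-.*$" avoids any doubt about partial matches.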

Deploying KIAM Agent and Server


Kiam Server

The manifest provided below will deploy the following:

  1. Kiam Server DaemonSet, which will run on the Kubernetes master nodes (configured to use the TLS secret created above)
  2. Kiam Server service
  3. ServiceAccount, ClusterRoles and ClusterRoleBindings required by the Kiam server
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: kiam-server
  namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kiam-read
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  verbs:
  - watch
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kiam-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-read
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kiam
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kiam-write
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kiam-write
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-write
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kiam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kiam
  name: kiam-server
spec:
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kiam
        role: server
    spec:
      tolerations:
       - key: node-role.kubernetes.io/master
         effect: NoSchedule
      serviceAccountName: kiam-server
      nodeSelector:
        kubernetes.io/role: master
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: tls
          secret:
            secretName: kiam-server-tls
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - server
            - --level=info
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --role-base-arn-autodetect
            - --assume-role-arn=<KIAM_SERVER_ROLE_ARN>
            - --sync=1m
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: kiam-server
  namespace: kiam
spec:
  clusterIP: None
  selector:
    app: kiam
    role: server
  ports:
  - name: grpclb
    port: 443
    targetPort: 443
    protocol: TCP

Note:

  1. The scheduler toleration and node selector we have in place here make sure that the Kiam server pods get scheduled on Kubernetes master nodes only. This is why we earlier enabled the trust relationship between the kiam-server IAM role and the IAM role attached to the Kubernetes master nodes.
...
       tolerations:
       - key: node-role.kubernetes.io/master
         effect: NoSchedule 
...
...
      nodeSelector:
        kubernetes.io/role: master
...
  2. The ARN of the kiam-server role is passed to the Kiam server container through the --assume-role-arn argument. Make sure you replace the <KIAM_SERVER_ROLE_ARN> placeholder in the manifest above with the ARN of the role you created.
  3. The ClusterRoles and ClusterRoleBindings created for the Kiam server grant it the minimal permissions required to operate effectively. Think carefully before changing them.
  4. Make sure the paths to the TLS certificates match the secret that was created from the cert-manager certificates. This is required for secure communication between the Kiam server and Kiam agent pods.

Kiam Agent

The manifest provided below will deploy the following:

  1. Kiam Agent DaemonSet, which will run on Kubernetes worker nodes only.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kiam
  name: kiam-agent
spec:
  template:
    metadata:
      labels:
        app: kiam
        role: agent
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: tls
          secret:
            secretName: kiam-agent-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          image: quay.io/uswitch/kiam:b07549acf880e3a064e6679f7147d34738a8b789
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=cali+
            - --json-log
            - --port=8181
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --server-address=kiam-server:443
            - --gateway-timeout-creation=30s
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3

Note that Kiam agents also run with host networking enabled, similar to kube2iam. One of the arguments to the Kiam agent container is the address of the Kiam service used to reach the Kiam server, in this case kiam-server:443. Therefore, the Kiam server should be deployed before the Kiam agent.

The container argument --gateway-timeout-creation defines how long the agent waits for the Kiam server pod to be up when it tries to connect. It can be tuned depending on how long the pod takes to come up in your Kubernetes cluster; a 30-second wait is usually enough.
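
The deployment order can be made explicit with a small wrapper. This is only a sketch: the function name and the manifest file names are hypothetical, and the 120-second rollout timeout is an arbitrary choice. It applies the server manifest, waits for the kiam-server DaemonSet to finish rolling out, and only then applies the agent manifest.

```shell
#!/bin/sh
# Apply the server manifest, wait for its DaemonSet to roll out, then apply the agent.
# $1, $2: paths to the server and agent manifests (hypothetical file names below).
deploy_kiam() {
  server_manifest="$1"
  agent_manifest="$2"
  kubectl apply -f "$server_manifest" || return 1
  # Blocks until every kiam-server pod is ready or the timeout expires.
  kubectl -n kiam rollout status daemonset/kiam-server --timeout=120s || return 1
  kubectl apply -f "$agent_manifest"
}

# Example: deploy_kiam kiam-server.yaml kiam-agent.yaml
```

Waiting on the rollout means the agents are less likely to spend their gateway timeout retrying against a server that has not started yet.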

Testing

Testing the KIAM set-up works the same way as for kube2iam. You can fire up a test pod and curl the metadata API to check the assigned role. Please ensure that both the deployment and the namespace are properly annotated.
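
One possible way to run that check is sketched below. The helper name, the pod name kiam-test and the curlimages/curl image are assumptions made for illustration; any image that ships curl should do. The function starts a throwaway pod annotated with the role and queries the metadata path that lists the role visible to the pod.

```shell
#!/bin/sh
# Start a throwaway pod annotated with ROLE and ask the metadata API which role it sees.
# $1: IAM role name, $2: namespace (defaults to "default").
check_pod_role() {
  role="$1"
  namespace="${2:-default}"
  kubectl run kiam-test --rm -i --restart=Never \
    --namespace "$namespace" \
    --image=curlimages/curl \
    --annotations="iam.amazonaws.com/role=${role}" \
    --command -- curl -s \
    http://169.254.169.254/latest/meta-data/iam/security-credentials/
}

# Example: check_pod_role my-role default
```

If the set-up is working, the response should name the annotated role; an error or empty response usually points at a missing namespace annotation or a broken trust relationship.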


Conclusion

Implementation Complexity

It should be clear by now that kube2iam is easier to implement. You basically need to run a privileged DaemonSet and you are all set. However, with the ease of set-up, you compromise on efficiency. As stated previously, kube2iam might not perform reliably under high load conditions.

KIAM’s implementation needs `cert-manager` running and, unlike kube2iam, you also need to annotate the namespace along with the deployment. However, in the end it is all worth it, and if you use the manifests provided above you should be able to get a production-ready Kiam set-up running in no time.

Performance

Kube2iam is better suited to non-production environments or scenarios where large traffic surges are not expected. Kiam, however, can be used in all cases, provided you have the resources to run cert-manager and your master nodes are beefy enough to handle a DaemonSet running on them. It is a little more work to set up KIAM, but it is widely used in production and under active development. KIAM came into existence to overcome some fundamental issues with kube2iam. We highly recommend going with KIAM, and with the manifests provided above your set-up should be seamless and production-ready.