Misconfigurations in Google Kubernetes Engine (GKE)

A recent Unit 42 investigation uncovered a dual privilege escalation chain affecting Google Kubernetes Engine (GKE). Stemming from misconfigurations in GKE’s FluentBit logging agent and Anthos Service Mesh (ASM), this exploit chain could enable attackers with existing Kubernetes cluster access to escalate privileges.

Kubernetes is a widely adopted open-source container orchestration platform, and GKE adds features on top of it for cluster deployment and management. Yet the inherent complexity of Kubernetes environments poses security risks, which frequently originate from misconfigurations and excessive privileges.

If an attacker gains execution within the FluentBit container, for example by exploiting a remotely exploitable vulnerability, and the cluster has Anthos Service Mesh (ASM) enabled, the exploit chain becomes available. It allows the attacker to take full control of the Kubernetes cluster, which can lead to data theft, deployment of malicious pods, and disruption of cluster operations.

Palo Alto researchers characterize this exploit chain as a next-generation second-stage cloud attack, wherein attackers utilize their existing access to the Kubernetes cluster to propagate within the cluster or escalate privileges.

Now, let’s examine this dual privilege escalation chain, focusing on vulnerabilities in the default configuration of GKE’s logging agent, FluentBit, which runs automatically on all clusters, and in the default privileges within Anthos Service Mesh (ASM). ASM is an optional add-on that provides control over service-to-service communication within a GKE environment.

The Attack Chain

To launch this second-stage attack, the attacker must first compromise the FluentBit container. This can be done by leveraging a remote code execution or arbitrary file read vulnerability in the container, or by breaking out of another container to gain access to the Node.

The attack begins by exploiting a misconfiguration in the FluentBit container, which automatically mounts the /var/lib/kubelet/pods volume. Within this directory, the kube-api-access volume holds projected service account tokens for each pod on the Node. By compromising the FluentBit pod, the attacker accesses these tokens, enabling them to impersonate a pod with privileged access to the Kubernetes API. This unauthorized access potentially allows the attacker to map the entire cluster and execute malicious actions based on their acquired privilege.
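From a defender’s point of view, the mount described above can be looked for in pod specifications. The following is a minimal sketch, assuming the pod spec has already been loaded into a Python dictionary (for example from the JSON output of the Kubernetes API). The function names are hypothetical and not part of any Kubernetes tooling, and the check only considers hostPath volumes that cover the whole /var/lib/kubelet/pods directory or one of its parent directories.

```python
from pathlib import PurePosixPath

# Directory on the Node that holds per-pod volumes, including projected tokens.
SENSITIVE_PATH = PurePosixPath("/var/lib/kubelet/pods")


def exposes_pod_tokens(host_path: str) -> bool:
    """Return True if mounting host_path would expose /var/lib/kubelet/pods."""
    mounted = PurePosixPath(host_path)
    # The directory itself, or any ancestor such as /var/lib or /, exposes it.
    return mounted == SENSITIVE_PATH or mounted in SENSITIVE_PATH.parents


def risky_host_mounts(pod_spec: dict) -> list[str]:
    """Return the names of hostPath volumes in a pod spec that expose pod tokens."""
    risky = []
    for volume in pod_spec.get("volumes", []):
        host_path = volume.get("hostPath")
        if host_path and exposes_pod_tokens(host_path.get("path", "")):
            risky.append(volume["name"])
    return risky
```

A check like this could be run against every DaemonSet pod template in a cluster; any hit deserves a review of whether the workload really needs that much of the Node’s filesystem.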

In the next phase, the focus shifts to Anthos Service Mesh’s Container Network Interface (CNI) DaemonSet, which is deployed to set up the Istio CNI plugin on every node. The istio-cni-node DaemonSet, installed with Anthos Service Mesh, retains elevated permissions after installation, allowing an attacker to create a new pod with those privileges.

By chaining these two exploits, the attacker attains full control over the Kubernetes cluster, escalating privileges to the status of a cluster admin.

In more detail, the chain unfolds as follows. After compromising the FluentBit container, the attacker exploits FluentBit’s default configuration, which mounts the /var/lib/kubelet/pods volume. This grants access to the kube-api-access-<random-suffix> directories and therefore to the tokens of all pods on that Node. Because FluentBit runs as a DaemonSet, the attacker can repeat this compromise on each node, mapping the entire cluster and collecting tokens, including the Istio-Installer-container token.

Subsequently, the attacker exploits the excessive permissions that the ASM CNI DaemonSet retains after installation to create a new pod in the kube-system namespace. Looking for powerful service accounts, the attacker targets the clusterrole-aggregation-controller (CRAC) service account, which can add arbitrary permissions to existing cluster roles. By specifying the CRAC service account in the new pod’s YAML definition, the attacker causes the CRAC token to be mounted into that pod. The FluentBit misconfiguration is then exploited once more to read the CRAC token from the Node, completing the chain and giving the attacker cluster admin privileges.
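The second step depends on a service account that is allowed to create pods. As a rough illustration of how such permissions can be audited, the sketch below scans RBAC roles, represented as plain Python dictionaries in the shape of Role or ClusterRole objects, for rules that allow creating pods. The function names are hypothetical, and the matching is deliberately simplified: it ignores resourceNames and does not account for permissions that create pods indirectly, such as creating Deployments or DaemonSets.

```python
def rule_allows(rule: dict, verb: str, resource: str, api_group: str = "") -> bool:
    """Return True if a single RBAC rule grants verb on resource in api_group."""

    def matches(values, wanted):
        # "*" is the RBAC wildcard; otherwise the exact value must be listed.
        return "*" in values or wanted in values

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )


def roles_allowing_pod_creation(roles: list[dict]) -> list[str]:
    """Return the names of roles with at least one rule that can create pods."""
    return [
        role["metadata"]["name"]
        for role in roles
        # Pods live in the core API group, which RBAC writes as "".
        if any(rule_allows(rule, "create", "pods") for rule in role.get("rules") or [])
    ]
```

Any role reported by such a scan that is bound to a DaemonSet’s service account is worth a second look, since that account’s token is present on every node.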

Patches for Google Kubernetes Engine (GKE) and Anthos Service Mesh (ASM)

The required patches are included in the following versions of Google Kubernetes Engine (GKE) and Anthos Service Mesh (ASM):

GKE Versions:

1.25.16-gke.1020000
1.26.10-gke.1235000
1.27.7-gke.1293000
1.28.4-gke.1083000

ASM Versions:

1.17.8-asm.8
1.18.6-asm.2
1.19.5-asm.4
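
To check a running version against the lists above, a comparison along the following lines may help. This is a minimal sketch that assumes, within a given minor release line, versions are ordered first by patch number and then by the numeric suffix after "-gke." or "-asm."; that ordering is an assumption, not something stated in the advisory. The function names are hypothetical. Versions on a minor line that does not appear in the list return None, because the lists above say nothing about them.

```python
import re
from typing import Optional

# Patched versions as listed in this article.
PATCHED_GKE = [
    "1.25.16-gke.1020000",
    "1.26.10-gke.1235000",
    "1.27.7-gke.1293000",
    "1.28.4-gke.1083000",
]
PATCHED_ASM = ["1.17.8-asm.8", "1.18.6-asm.2", "1.19.5-asm.4"]

_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(?:gke|asm)\.(\d+)$")


def parse(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH-gke.BUILD' (or -asm.) into a tuple of four ints."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str, patched: list) -> Optional[bool]:
    """True/False if the minor line is listed, None if the list does not cover it."""
    current = parse(version)
    for candidate in patched:
        fixed = parse(candidate)
        if fixed[:2] == current[:2]:  # same major.minor release line
            return current >= fixed
    return None
```

For example, is_patched("1.27.6-gke.1500000", PATCHED_GKE) evaluates to False under these assumptions, because the patch number 6 is lower than the listed 7.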

Proactive Measures and Recommendations

The following security recommendations help mitigate the risk of misconfigurations leading to vulnerabilities in cloud environments, including Kubernetes clusters:

Regular Audits and Updates: Conduct frequent security audits for critical services, employing automated tools for continuous checks. Keep all components, including Kubernetes clusters, up to date with the latest security patches.
Follow Best Practices: Adhere to best practices provided by your cloud service provider and specific services. Conduct training programs to bolster teams’ awareness of cloud security best practices.
Limit Permissions and Follow the Principle of Least Privilege: Restrict permissions to the minimum necessary, following the principle of least privilege for user accounts, service accounts, and components. Leverage built-in security features like IAM and security groups.
Implement Network Policies: Utilize network policies to control the flow of traffic between pods within your Kubernetes cluster, minimizing the attack surface.
Monitor and Analyze Logs: Implement robust logging and monitoring solutions for prompt detection and response to unusual activities or security incidents.
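
As one concrete illustration of the network policy recommendation, the sketch below builds a default-deny NetworkPolicy manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The policy name and the "my-app" namespace are placeholders chosen for this example. Two caveats apply: the policy only has an effect if network policy enforcement is enabled on the cluster, and denying all egress also blocks DNS lookups, so real deployments typically pair it with explicit allow rules.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The name is a placeholder; any valid object name works.
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Listing both types with no allow rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "my-app" is a hypothetical namespace used only for illustration.
    print(json.dumps(default_deny_policy("my-app"), indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f, then narrowed with additional policies that allow only the traffic each workload needs.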
