Ensure that the Auto-Repair feature is enabled for all your GKE cluster nodes to help maintain node health. Google Kubernetes Engine (GKE) triggers a repair action if a node reports consecutive unhealthy status reports for a given time threshold, such as when a node broadcasts a "NotReady" status, fails to broadcast any status, or runs out of disk space.
PoC代码[已公开]
id: gcloud-gke-auto-repair-disabled
info:
name: GKE Node Pools Without Auto-Repair Enabled
author: princechaddha
severity: medium
description: |
Ensure that the Auto-Repair feature is enabled for all your GKE cluster nodes to help maintain node health. Google Kubernetes Engine (GKE) triggers a repair action if a node reports consecutive unhealthy status reports for a given time threshold, such as when a node broadcasts a "NotReady" status, fails to broadcast any status, or runs out of disk space.
impact: |
Without auto-repair enabled, unhealthy nodes may remain in a degraded state indefinitely, potentially affecting the availability and reliability of your workloads running on those nodes.
remediation: |
Enable auto-repair for your GKE cluster node pools using:
gcloud container node-pools update POOL_NAME --cluster=CLUSTER_NAME --region=REGION --enable-autorepair
reference:
- https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-repair
- https://www.trendmicro.com/cloudoneconformity/knowledge-base/gcp/GKE/enable-auto-repair.html
tags: cloud,devops,gcp,gcloud,gke,kubernetes,reliability,maintenance,gcp-cloud-config
flow: |
code(1)
for(let projectId of iterate(template.projectIds)){
set("projectId", projectId)
code(2)
for(let cluster of iterate(template.clusters)){
cluster = JSON.parse(cluster)
set("clusterName", cluster.name)
set("location", cluster.location)
code(3)
for(let nodePool of iterate(template.nodePools)){
nodePool = JSON.parse(nodePool)
set("nodePoolName", nodePool.name)
code(4)
}
}
}
self-contained: true
code:
- engine:
- sh
- bash
source: |
gcloud projects list --format="json(projectId)"
extractors:
- type: json
name: projectIds
internal: true
json:
- '.[].projectId'
- engine:
- sh
- bash
source: |
gcloud container clusters list --project $projectId --format="json(name,location)"
extractors:
- type: json
name: clusters
internal: true
json:
- '.[]'
- engine:
- sh
- bash
source: |
gcloud container node-pools list --cluster $clusterName --location $location --project $projectId --format="json(name)"
extractors:
- type: json
name: nodePools
internal: true
json:
- '.[]'
- engine:
- sh
- bash
source: |
gcloud container node-pools describe $nodePoolName --cluster $clusterName --location $location --project $projectId --format="yaml(management.autoRepair)"
matchers:
- type: word
words:
- "management: {}"
- "null"
condition: or
extractors:
- type: dsl
dsl:
- '"Node pool " + nodePoolName + " in GKE cluster " + clusterName + " (" + location + ") of project " + projectId + " does not have auto-repair enabled"'
# digest: 4a0a004730450220650dfbe4698de5fd0774f8caaada58ef598cf8c58b89fb2bbbc4ac82e9f6c0c902210096aaf12a7d2fdab29352f5ef9d832c8fb67a20a2b9308c668307b5d13e6130bb:922c64590222798bb761d5b6d8e72950