Kubelet service kill
Kubelet service kill makes the application unreachable because the node becomes unschedulable (enters the NotReady state).
- Kubelet service is stopped (or killed) on a node to make it unschedulable for a specific duration.
- The node returns to its normal state and services resume after the chaos duration.
Use cases
The kubelet service kill fault determines the resilience of an application when a node becomes unschedulable, that is, when it enters the NotReady state.
Permissions required
Below is a sample Kubernetes role that defines the permissions required to execute the fault.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: hce
  name: kubelet-service-kill
spec:
  definition:
    scope: Namespaced # Supports Cluster mode too
    permissions:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "get", "list", "patch", "update"]
      - apiGroups: [""]
        resources: ["chaosEngines", "chaosExperiments", "chaosResults"]
        verbs: ["create", "delete", "get", "list", "patch", "update"]
      - apiGroups: [""]
        resources: ["pods/log"]
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources: ["pods/exec"]
        verbs: ["get", "list", "create"]
      - apiGroups: ["batch"]
        resources: ["jobs"]
        verbs: ["create", "delete", "get", "list", "deletecollection"]
      - apiGroups: [""]
        resources: ["nodes"]
        verbs: ["get", "list"]
Prerequisites
- Kubernetes > 1.16 is required to execute this fault.
- The node specified in the TARGET_NODE environment variable (the node on which the kubelet service is killed) should be cordoned before executing the chaos fault. This ensures that the fault resources are not scheduled on it or subject to eviction. To cordon the node:
  - Get the node names for the application pods using the command kubectl get pods -o wide.
  - Cordon the node using the command kubectl cordon <nodename>.
- The target nodes should be in the Ready state before and after injecting chaos.
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
TARGET_NODE | Name of the target node. | For example, node-1. For more information, go to target node. |
NODE_LABEL | Node label used to filter the target node if the TARGET_NODE environment variable is not set. | It is mutually exclusive with the TARGET_NODE environment variable. If both are provided, the fault uses TARGET_NODE. For more information, go to node label. |
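As a minimal sketch of selecting the target node by label rather than by name, the ChaosEngine below sets NODE_LABEL and leaves TARGET_NODE unset. The key=value form of the label and the label itself (kubernetes.io/hostname=node01) are assumptions made for illustration; replace them with a label that actually exists on the node.

```yaml
# kill the kubelet service of a node selected by label (illustrative sketch)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: kubelet-service-kill
      spec:
        components:
          env:
            # assumed key=value label format; the label shown is hypothetical
            - name: NODE_LABEL
              value: 'kubernetes.io/hostname=node01'
            - name: TOTAL_CHAOS_DURATION
              value: '60'
```

TARGET_NODE is omitted here because, when both variables are provided, the fault uses TARGET_NODE and ignores the label.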
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration for which chaos is injected into the target resource (in seconds). | Default: 60 s. For more information, go to duration of the chaos. |
LIB_IMAGE | Image used to inject chaos. | Default: chaosnative/chaos-go-runner:main-latest. For more information, go to image used by the helper pod. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
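The optional tunables can be combined in a single ChaosEngine. The following is a sketch only: the node name node01 and the values 120 and 30 are illustrative assumptions, and the image is the default listed in the table above.

```yaml
# kubelet service kill with the optional tunables set explicitly (illustrative sketch)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: kubelet-service-kill
      spec:
        components:
          env:
            # hypothetical node name
            - name: TARGET_NODE
              value: 'node01'
            # chaos duration in seconds (default is 60)
            - name: TOTAL_CHAOS_DURATION
              value: '120'
            # wait period in seconds before and after injecting chaos
            - name: RAMP_TIME
              value: '30'
            # image used to inject chaos (the documented default)
            - name: LIB_IMAGE
              value: 'chaosnative/chaos-go-runner:main-latest'
```

All values are quoted strings, matching how the other environment variables are written in this document.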
Kill kubelet service
Name of the target node. Tune it by using the TARGET_NODE environment variable.
The following YAML snippet illustrates the use of this environment variable:
# kill the kubelet service of the target node
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: kubelet-service-kill
      spec:
        components:
          env:
            # name of the target node
            - name: TARGET_NODE
              value: 'node01'
            # duration of the chaos, in seconds
            - name: TOTAL_CHAOS_DURATION
              value: '60'