Pod HTTP reset peer
Pod HTTP reset peer is a Kubernetes pod-level chaos fault that injects chaos into the service whose port is specified by the TARGET_SERVICE_PORT environment variable. This fault:
- Stops outgoing HTTP requests by resetting the TCP connection.
- Achieves this by starting a proxy server and redirecting the traffic through it.
Use cases
Pod HTTP reset peer:
- Tests the application's resilience to lossy or flaky HTTP connections.
- Simulates premature connection loss caused by firewall issues or other issues between microservices, thereby verifying how connection timeouts are handled.
- Simulates connection resets caused by resource limitations on the server side, such as out-of-memory errors, process kills, or overload from high traffic.
Permissions required
Below is a sample Kubernetes role that defines the permissions required to execute the fault.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: hce
  name: pod-http-reset-peer
spec:
  definition:
    scope: Cluster # Supports "Namespaced" mode too
    permissions:
      # manage the helper and target pods
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
      # record chaos events
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "get", "list", "patch", "update"]
      # read pod logs
      - apiGroups: [""]
        resources: ["pods/log"]
        verbs: ["get", "list", "watch"]
      # resolve the parent workloads of the target pods (apps API group)
      - apiGroups: ["apps"]
        resources: ["deployments", "statefulsets"]
        verbs: ["get", "list"]
      - apiGroups: ["apps"]
        resources: ["replicasets", "daemonsets"]
        verbs: ["get", "list"]
      # manage the chaos custom resources (litmuschaos.io API group)
      - apiGroups: ["litmuschaos.io"]
        resources: ["chaosengines", "chaosexperiments", "chaosresults"]
        verbs: ["create", "delete", "get", "list", "patch", "update"]
      # run the fault as a job
      - apiGroups: ["batch"]
        resources: ["jobs"]
        verbs: ["create", "delete", "get", "list", "deletecollection"]
Prerequisites
- Kubernetes > 1.16
- The application pods should be in the running state before and after injecting chaos.
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
TARGET_SERVICE_PORT | Port of the service to target. | Default: port 80. For more information, go to target service port. |
NODE_LABEL | Node label used to filter the target node if the TARGET_NODE environment variable is not set. | It is mutually exclusive with the TARGET_NODE environment variable. If both are provided, the fault uses TARGET_NODE. For more information, go to node label. |
RESET_TIMEOUT | Duration (in milliseconds) after which the connection is reset. | Default: 0. For more information, go to reset timeout. |
Optional tunables
Tunable | Description | Notes |
---|---|---|
PROXY_PORT | Port where the proxy listens for requests. | Default: 20000. For more information, go to proxy port. |
NETWORK_INTERFACE | Network interface used for the proxy. | Default: eth0. For more information, go to network interface. |
TOXICITY | Percentage of HTTP requests to be affected. | Default: 100. For more information, go to toxicity. |
CONTAINER_RUNTIME | Container runtime interface for the cluster. | Default: containerd. Supports docker, containerd, and crio. For more information, go to container runtime. |
SOCKET_PATH | Path of the containerd, crio, or docker socket file. | Default: /run/containerd/containerd.sock. For more information, go to socket path. |
TOTAL_CHAOS_DURATION | Duration for which to insert chaos (in seconds). | Default: 60 s. For more information, go to duration of the chaos. |
TARGET_PODS | Comma-separated list of application pod names subject to pod HTTP reset peer. | If not provided, the fault selects target pods randomly based on the provided appLabels. For more information, go to target specific pods. |
PODS_AFFECTED_PERC | Percentage of total pods to target. Provide numeric values. | Default: 0 (corresponds to 1 replica). For more information, go to pod affected percentage. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
SEQUENCE | Sequence of chaos execution for multiple target pods. | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
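Several of the optional tunables above have no dedicated example on this page. The following is a minimal sketch of how they might be combined. It assumes the same ChaosEngine layout, names (engine-nginx, litmus-admin), and target application used in the other examples here, and the values are illustrative rather than recommended settings.

```yaml
# illustrative sketch: common optional tunables for pod-http-reset-peer
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # inject chaos for 60 seconds (the documented default)
            - name: TOTAL_CHAOS_DURATION
              value: "60"
            # target half of the pods that match the app label (assumed value)
            - name: PODS_AFFECTED_PERC
              value: "50"
            # inject into the selected pods one after another instead of in parallel
            - name: SEQUENCE
              value: "serial"
            # wait 30 seconds before and after injecting chaos
            - name: RAMP_TIME
              value: "30"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
```

To target specific pods instead of a percentage, TARGET_PODS takes a comma-separated list of pod names, as described in the table.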
Target service port
Port of the target service. Tune it by using the TARGET_SERVICE_PORT environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the port of the targeted service
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
Proxy port
Port where the proxy server listens for requests. Tune it by using the PROXY_PORT environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the port for proxy server
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # provide the port for proxy server
            - name: PROXY_PORT
              value: "8080"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
Reset timeout
Duration (in milliseconds) after which the connection is reset. Tune it by using the RESET_TIMEOUT environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the reset timeout value
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # reset timeout specifies the duration (in ms) after which to reset the connection
            - name: RESET_TIMEOUT
              value: "2000"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
Toxicity
Percentage of the total number of HTTP requests to be affected. Tune it by using the TOXICITY environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the toxicity
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # toxicity is the probability that a request is affected
            # provide the percentage value in the range of 0-100
            # 0 means no requests are affected and 100 means all requests are affected
            - name: TOXICITY
              value: "100"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
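To approximate a flaky connection rather than a complete outage, a partial toxicity can be combined with a reset timeout. The sketch below is illustrative only: the values 50 and 2000 are assumptions chosen for the example, and the manifest reuses the names from the other examples on this page. With these settings, roughly half of the HTTP requests to port 80 would be expected to have their connection reset after the 2000 ms timeout.

```yaml
# illustrative sketch: partial toxicity combined with a reset timeout
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # affect about half of the requests (assumed value)
            - name: TOXICITY
              value: "50"
            # reset the connection after 2000 ms (assumed value)
            - name: RESET_TIMEOUT
              value: "2000"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
```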
Network interface
Network interface used for the proxy. Tune it by using the NETWORK_INTERFACE environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the network interface for proxy
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # provide the network interface for proxy
            - name: NETWORK_INTERFACE
              value: "eth0"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
Container runtime and socket path
Use the CONTAINER_RUNTIME and SOCKET_PATH environment variables to set the container runtime and socket file path, respectively.
- CONTAINER_RUNTIME: Supports the docker, containerd, and crio runtimes. The default value is containerd.
- SOCKET_PATH: Contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For docker, specify the path as /var/run/docker.sock. For crio, specify the path as /var/run/crio/crio.sock.
The following YAML snippet illustrates the use of these environment variables:
# provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # runtime for the container
            # supports docker, containerd, crio
            - name: CONTAINER_RUNTIME
              value: "containerd"
            # path of the socket file
            - name: SOCKET_PATH
              value: "/run/containerd/containerd.sock"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
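For a cluster that runs docker instead of containerd, the two variables change together. The following sketch assumes the same target application and names as the example above and uses the docker socket path listed in this section. The actual socket location can differ between node images, so treat the path as something to confirm on your nodes.

```yaml
# illustrative sketch: docker runtime with its socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: pod-http-reset-peer
      spec:
        components:
          env:
            # use the docker runtime instead of the default containerd
            - name: CONTAINER_RUNTIME
              value: "docker"
            # docker socket path as listed in this section
            - name: SOCKET_PATH
              value: "/var/run/docker.sock"
            # provide the port of the targeted service
            - name: TARGET_SERVICE_PORT
              value: "80"
```

For crio, the same pattern applies with CONTAINER_RUNTIME set to crio and SOCKET_PATH set to /var/run/crio/crio.sock.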