Node memory hog
Node memory hog causes memory resource exhaustion on the Kubernetes node.
- It is injected using a helper pod running the Linux stress-ng tool.
- The chaos affects the application for a specific duration.

Use cases
Node memory hog fault:
- Verifies application restarts on OOM kills.
- Verifies the resilience of applications whose replicas may be evicted on account of nodes becoming unschedulable (in the NotReady state) due to a lack of memory resources.
- Simulates memory leaks in microservice deployments.
- Simulates application slowness due to memory starvation.
- Simulates noisy neighbour problems due to memory hogging.
- Verifies pod priority and QoS settings for eviction purposes.
Permissions required
Below is a sample Kubernetes ClusterRole that defines the permissions required to execute the fault. Because the fault targets nodes, which are cluster-scoped resources, a namespaced Role is not sufficient.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-memory-hog
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "get", "list", "patch", "update"]
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines", "chaosexperiments", "chaosresults"]
    verbs: ["create", "delete", "get", "list", "patch", "update"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get", "list", "create"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create", "delete", "get", "list", "deletecollection"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list"]
Prerequisites
- Kubernetes > 1.16
- The target nodes should be in the Ready state before and after injecting chaos.
Mandatory tunables
| Tunable | Description | Notes | 
|---|---|---|
| TARGET_NODES | Comma-separated list of nodes subjected to the node memory hog. | For example, node-1,node-2. For more information, go to target nodes. |
| NODE_LABEL | Node label used to filter the target nodes. | It is mutually exclusive with the TARGET_NODES environment variable. If both are provided, TARGET_NODES takes precedence. For more information, go to target nodes with labels. |
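The following YAML snippet is a minimal sketch of setting the mandatory tunables on a ChaosEngine, following the same pattern as the examples later in this section. The node names are placeholders; set NODE_LABEL instead to select the target nodes by label.
# target specific nodes by name (placeholder values)
# alternatively, set NODE_LABEL to filter the target nodes by label
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: node-memory-hog
    spec:
      components:
        env:
        # comma-separated list of target node names
        - name: TARGET_NODES
          value: 'node-1,node-2'
        - name: TOTAL_CHAOS_DURATION
          value: '60'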
Optional tunables
| Tunable | Description | Notes | 
|---|---|---|
| TOTAL_CHAOS_DURATION | Duration for which chaos is injected into the target resource (in seconds). | Default: 120 s. For more information, go to duration of the chaos. |
| LIB_IMAGE | Image used to run the stress command. | Default: harness/chaos-go-runner:main-latest. For more information, go to image used by the helper pod. | 
| MEMORY_CONSUMPTION_PERCENTAGE | Percentage of the total node memory capacity to consume. | Default: 30. For more information, go to memory consumption percentage. |
| MEMORY_CONSUMPTION_MEBIBYTES | Amount of memory consumed (in mebibytes). It is mutually exclusive with MEMORY_CONSUMPTION_PERCENTAGE. | For example, 256. For more information, go to memory consumption mebibytes. |
| NUMBER_OF_WORKERS | Number of VM workers involved in the stress. | Default: 1. For more information, go to workers for stress. | 
| RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. | 
| NODES_AFFECTED_PERC | Percentage of the total nodes to target. It takes numeric values only. | Default: 0 (corresponds to 1 node). For more information, go to node affected percentage. | 
| SEQUENCE | Sequence of chaos execution for multiple target nodes. | Default: parallel. Supports serial sequence as well. For more information, go to sequence of chaos execution. |
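NODES_AFFECTED_PERC and SEQUENCE have no dedicated sections below, so the following minimal sketch shows how they can be combined, following the same ChaosEngine pattern as the other examples. The node label is a placeholder.
# stress 50 percent of the nodes matching the label, one node at a time
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: node-memory-hog
    spec:
      components:
        env:
        # label used to filter the target nodes (placeholder value)
        - name: NODE_LABEL
          value: 'nodetype=target'
        # percentage of the matching nodes to target
        - name: NODES_AFFECTED_PERC
          value: '50'
        # run the chaos on the selected nodes one after another
        - name: SEQUENCE
          value: 'serial'
        - name: TOTAL_CHAOS_DURATION
          value: '60'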
Memory consumption percentage
Percentage of the total node memory capacity to consume. Tune it by using the MEMORY_CONSUMPTION_PERCENTAGE environment variable.
The following YAML snippet illustrates the use of this environment variable:
# stress the memory of the targeted node with MEMORY_CONSUMPTION_PERCENTAGE of the node capacity
# it is mutually exclusive with MEMORY_CONSUMPTION_MEBIBYTES
# if both are provided, MEMORY_CONSUMPTION_PERCENTAGE is used for the stress
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: node-memory-hog
    spec:
      components:
        env:
        # percentage of total node capacity to be stressed
        - name: MEMORY_CONSUMPTION_PERCENTAGE
          value: '10' # in percentage
        - name: TOTAL_CHAOS_DURATION
          value: '60'
Memory consumption mebibytes
Amount of memory consumed (in mebibytes). Tune it by using the MEMORY_CONSUMPTION_MEBIBYTES environment variable. It is mutually exclusive with the MEMORY_CONSUMPTION_PERCENTAGE environment variable. If both are set, the fault uses MEMORY_CONSUMPTION_PERCENTAGE for the stress.
The following YAML snippet illustrates the use of this environment variable:
# stress the memory of the targeted node with the given MEMORY_CONSUMPTION_MEBIBYTES
# it is mutually exclusive with MEMORY_CONSUMPTION_PERCENTAGE
# if both are provided, MEMORY_CONSUMPTION_PERCENTAGE is used for the stress
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: node-memory-hog
    spec:
      components:
        env:
        # node memory to be stressed
        - name: MEMORY_CONSUMPTION_MEBIBYTES
          value: '500' # in mebibytes
        - name: TOTAL_CHAOS_DURATION
          value: '60'
Workers for stress
Number of VM workers used for the stress. Tune it by using the NUMBER_OF_WORKERS environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the workers count for the stress
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: node-memory-hog
    spec:
      components:
        env:
        # total number of workers involved in stress
        - name: NUMBER_OF_WORKERS
          value: '1'
        - name: TOTAL_CHAOS_DURATION
          value: '60'