Skip to main content

Determine resilience score

The resilience score is a quantitative measure obtained when you run a chaos experiment. This score represents how resilient the target environment is when you run that chaos experiment on it.

The score is calculated based on:

  • The weight you give each fault in the experiment.
  • The success rate of the probes in each fault.

This topic explains these elements, and gives an example resilience calculation.

Fault weight

While creating a chaos experiment, you can assign a weight between 1 - 10 to each fault. This represents the priority/importance of the respective fault. The higher the weight, the more significant the fault is.

For example:

  • Low Priority: 0 - 3
  • Medium Priority: 4 - 6
  • High Priority: 7 - 10

Success rate of probes in each fault

The probe success percentage for a fault is the ratio of successful probes to total probes. For example, if a fault has 4 probes and only 2 of them are successful, then the probe success percentage for this fault is 50%.

Resilience calculation

Based on fault weights and probe success rates, you can calculate two types of resilience score (represented as a percentage):

  • A fault's resilience = fault weight * probe success percentage
  • The experiment's total resilience = sum of all fault resilience / sum of all fault weights of the experiments

Here's an example:

  • Experiment A runs, and includes 3 faults. Fault weights, number of probes, and probe success rates are as follows.

    FaultWeightNumber
    of probes
    Probes
    succeeded
    Fault
    resilience
    Fault1210 (or 0%)0%
    Fault2422 (or 100%)400%
    Fault3843 (or 75%)600%
    Sum: 14Sum: 1000%

  • Experiment A's total resilience score

    Divide the sum of all fault resilience by the sum of all fault weights:

    1000% / 14 = 71%