columns:
  - column: "Failure Type"
  - column: "RPO"
  - column: "RTO (RF1 - single AZ)"
  - column: "RTO (RF2 - multiple AZs)"

rows:
  - Failure Type: "**Machine failure**"
    RPO: 0
    RTO (RF1 - single AZ): |
      Time to spin up new machine + possible rehydration time, depending on the
      objects on the machine:

      - If non-upsert sources, no rehydration time (i.e., does not require
        rehydration).
      - If upsert sources, rehydration time.
      - If sinks, no rehydration time (i.e., does not require rehydration).
      - If compute, rehydration time.
      - If serving, rehydration time.

      Additionally, there may be some time to catch up with changes that may
      have occurred during the downtime.

      To reduce rehydration time, scale up the cluster.
    RTO (RF2 - multiple AZs): |
      Can be:

      - 0 if only compute and serving objects are on the machine.

      - Time to spin up new machine if sources or sinks are on the machine.

      In addition, cluster RTO is affected if [`environmentd` is
      down](#environmentd) (seconds to minutes).

  - Failure Type: "**Single AZ failure**"
    RPO: 0
    RTO (RF1 - single AZ): |

      *For managed clusters*

      Time to spin up new machine + possible rehydration time, depending on the
      objects on the machine:

      - If non-upsert sources, no rehydration time (i.e., does not require
        rehydration).
      - If upsert sources, rehydration time.
      - If sinks, no rehydration time (i.e., does not require rehydration).
      - If compute, rehydration time.
      - If serving, rehydration time.

      Additionally, there may be some time to catch up with changes that may
      have occurred during the downtime.

      To reduce rehydration time, you can scale up the cluster.

      During downtime, single-AZ PrivateLinks are impacted.
    RTO (RF2 - multiple AZs): |

      Can be:

      - 0 if only compute and serving objects are on the machine.

      - Time to spin up new machine if sources or sinks are on the machine.

      In addition, cluster RTO is affected if [`environmentd` is
      down](#environmentd) (seconds to minutes).

  - Failure Type: "**Regional failure (or 2 AZ failures)**"
    RPO: |
      At most 1 hour (time since last backup, based on hourly backups).

    RTO (RF1 - single AZ): |
      ~1 hour (time to check pointers).
    RTO (RF2 - multiple AZs): |
      High/Significant. Consider using a [regional failover strategy](/manage/disaster-recovery/#level-3-a-duplicate-materialize-environment-inter-region-resilience).