What good continuity strategies for cloud services should look like

Get to know continuity strategies for cloud services

Pedro Morales

Reading time: 2 min

Redundancy

With redundancy, we can have data replicated in one or more locations. The general levels of redundancy offered by public clouds are:

  • Local: This usually comes as standard in almost all services. Several copies are stored within the same data centre but in different storage cabinets.
  • Zonal: Several copies are stored in different data centres within a specific zone. The distance between these data centres is usually between 20 and 30 kilometres.
  • Regional: Several copies are stored in different data centres within a region. The distance between data centres here is greater. For example, in Spain, you might have one copy in Madrid and another in Barcelona.
  • Geographic: Several copies are stored in different data centres in different regions. For example, you might have one copy in Madrid (Spain) and another in Amsterdam (Netherlands).

In terms of cost, Local is the cheapest option and Geographic is the most expensive. Some clouds allow these levels to be combined.

Disaster Recovery Strategy

We can choose from a wide range of DR (Disaster Recovery) strategies, from a low-cost one (usually with shortcomings) to a resilient but more complex and expensive one. Each case must be assessed individually, based on the type of service, the SLAs (service level agreements) we have committed to and the importance of the data to be stored.

We must not confuse DR with HA (High Availability). In DR, computing is only carried out in the main data centre, and what is replicated outside that data centre is the storage, either in another zone or in another region.

The most economical option would be zone-level DR: the copy in another zone is only used if the main data centre goes down, at which point service starts to be provided from there. The most expensive option would be geo-redundant DR.

In DR, we must take into account the requirements of our service in terms of RTO and RPO:

  • RTO (Recovery Time Objective): This is the maximum time that the service can be down without causing a significant impact on the organisation. For example, if the RTO is set to 1 hour, the company must be back up and running within that hour to avoid major losses.
  • RPO (Recovery Point Objective): This is the maximum amount of data, measured as a period of time, that a company could lose in a catastrophic event without suffering significant damage. For example, if the RPO is 2 hours, the company could afford to lose the data from the last 2 hours without any significant impact.
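
Both objectives can be checked against a backup plan with simple arithmetic. The following is a minimal sketch in Python, assuming a hypothetical service that takes backups at a fixed interval and has a measured restore duration. The function name `meets_objectives` and its parameters are invented for this illustration and do not belong to any cloud provider's API.

```python
from datetime import timedelta


def meets_objectives(backup_interval, restore_duration, rpo, rto):
    """Return (rpo_ok, rto_ok) for a hypothetical backup plan.

    Worst case, a failure happens just before the next backup runs,
    so the data lost corresponds to one full backup interval.
    """
    rpo_ok = backup_interval <= rpo
    rto_ok = restore_duration <= rto
    return rpo_ok, rto_ok


# Figures from the examples above: RTO of 1 hour, RPO of 2 hours.
# The backup interval and restore duration are assumed values.
result = meets_objectives(
    backup_interval=timedelta(hours=4),
    restore_duration=timedelta(minutes=45),
    rpo=timedelta(hours=2),
    rto=timedelta(hours=1),
)
print(result)  # (False, True)
```

In this assumed plan the restore is fast enough for the RTO, but backups every 4 hours cannot satisfy an RPO of 2 hours, so the replication or backup frequency would need to increase.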

Multicloud

We always mention multicloud because we consider it important both in daily operations and in this type of strategy. Having redundancy between cloud providers greatly increases the resilience of our company’s services. DR strategies can be combined at the multicloud level, although these techniques currently face a major challenge: orchestrating failovers and failbacks between providers.

Conclusion

Currently, no service guarantees 100% availability. However, it is within our power to significantly increase that percentage, weighing the cost and effort involved against the needs of the service we provide.
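
To get a rough feel for how much redundancy can raise that percentage, here is a minimal sketch in Python. It assumes that each copy fails independently of the others and that the service is up as long as at least one copy is up. Real failures are often correlated, so the result should be read as an optimistic upper bound. The function name `combined_availability` and the 99% figure are assumptions chosen for illustration.

```python
def combined_availability(single, copies):
    """Availability of `copies` independent replicas.

    `single` is the availability of one replica (between 0 and 1).
    The service is down only when every replica is down at once.
    """
    if not 0.0 <= single <= 1.0 or copies < 1:
        raise ValueError("single must be in [0, 1] and copies >= 1")
    return 1.0 - (1.0 - single) ** copies


HOURS_PER_YEAR = 365 * 24

# Compare one, two and three copies of an assumed 99% available replica.
for copies in (1, 2, 3):
    availability = combined_availability(0.99, copies)
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    print(f"{copies} copies: {availability:.6f} available, "
          f"about {downtime_hours:.2f} hours down per year")
```

Under these assumptions, each additional copy multiplies the expected downtime by the unavailability of a single copy, which is why the gain has to be weighed against the cost of every extra location.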
