One more step towards autonomous networks: self-healing in Brazil’s 5G Core

Today, most of the world's 5G networks – including Telefónica's – operate on architectures based on Network Functions Virtualization (NFV). This approach separates network functions from the underlying physical infrastructure (compute, storage, and networking), creating a virtualized layer in which these functions run as virtual machines (VMs) or containers.

Virtualization offers great benefits in terms of flexibility, scalability, and efficiency, allowing network functions to be dynamically managed, orchestrated, and scaled on demand and in strategic locations.

However, it also poses new operational challenges. Virtualized network functions are still dependent on the performance of the physical layer, and any degradation directly impacts the quality of service. Having real-time visibility into the health of the physical environment and its relationship to logical functions is key to ensuring network availability and reliability.
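As an illustration of linking the physical layer to logical functions, the sketch below maps network functions to VMs and VMs to physical hosts, then lists the functions exposed to a degraded host. It is a minimal sketch, not VIVO's implementation: the inventory, the instance names, and the `affected_functions` helper are all assumptions made for the example.

```python
# Hypothetical inventory (assumption): which VM each network function runs on,
# and which physical host each VM is placed on.
NF_TO_VM = {"amf-1": "vm-101", "smf-1": "vm-102", "upf-1": "vm-103"}
VM_TO_HOST = {"vm-101": "host-a", "vm-102": "host-a", "vm-103": "host-b"}


def affected_functions(degraded_hosts):
    """Return the network functions whose VM sits on a degraded host."""
    return sorted(
        nf
        for nf, vm in NF_TO_VM.items()
        if VM_TO_HOST[vm] in degraded_hosts
    )


print(affected_functions({"host-a"}))  # ['amf-1', 'smf-1']
```

A real deployment would presumably read this mapping from the virtualization platform's inventory rather than from static dictionaries.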

Achieving this visibility and acting on it manually demands a high level of technical expertise, time, and resources. In dynamic environments such as the 5G Core, manual response is insufficient. Intelligent, autonomous mechanisms are therefore needed to detect and resolve incidents in real time, without human intervention.

In this context, the VIVO team, as part of the initiatives of Telefónica’s Autonomous Network Journey (ANJ) program, identified the opportunity to apply closed-loop and intent-based capabilities to a critical problem: the automatic detection and resolution of incidents originating in the physical layer of the virtualized 5G Core.

Closed-loop and Intent: intelligence for autonomy

A truly autonomous network needs to operate through closed-loop, automatic cycles that range from observation to the execution of actions without human intervention. This approach turns the network into a dynamic environment, driven by data, analytics, and artificial intelligence.

Added to this is operation based on “Intent”, which allows systems to understand the purpose of each action rather than simply follow fixed instructions. Algorithms can thus evaluate context, make complex decisions, and act proactively and intelligently.
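The difference between an intent and a fixed instruction can be sketched in a few lines. In this hypothetical example, the intent states only the goal (keep a metric at or above a target), and the action is chosen from the observed state and its context. The `Intent` class, the metric name, the context key, and the action names are assumptions for illustration, not part of VIVO's solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """A declarative goal: keep `metric` at or above `minimum`."""
    metric: str
    minimum: float


def is_satisfied(intent, observations):
    """Check observed state against the goal; a missing metric counts as unmet."""
    value = observations.get(intent.metric)
    return value is not None and value >= intent.minimum


def plan(intent, observations, context):
    """Pick an action from the goal and the context, not from a fixed script."""
    if is_satisfied(intent, observations):
        return "no_action"
    # Same goal, different action depending on context (hypothetical rules).
    if context.get("host_degraded"):
        return "move_vm_to_healthy_host"
    return "restart_network_function"


intent = Intent(metric="registration_success_rate", minimum=0.99)
print(plan(intent, {"registration_success_rate": 0.95}, {"host_degraded": True}))
# move_vm_to_healthy_host
```

The same unmet goal leads to a different action when the context changes, which is the property the intent-based approach relies on.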

The solution: autonomous intelligence to operate and solve

VIVO’s Core team has implemented a standalone self-healing mechanism on top of the virtualized architecture of the 5G SA Core. This solution, built on AI, closed-loop, and intent-based capabilities, detects anomalies in network function (NF) performance, diagnoses the root cause (often associated with virtual machine issues), and automatically executes the most appropriate action.

Thanks to the Intent-based approach, the system evaluates the context, decides, and acts. In addition, it correlates events between the logical and physical layers and verifies whether the applied action was effective or whether it should be escalated to another level. The whole process is completely autonomous.

The algorithm operates on the physical layer at three levels of action, applying progressive strategies to automatically restore network functions according to the complexity of the failure.
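The pattern of acting, verifying, and escalating across three levels can be sketched as a simple loop. The article does not say what the three levels are, so the action names below (restart, rebuild, migrate) are purely hypothetical, as are the `self_heal` function and the simulated health state.

```python
def self_heal(is_healthy, actions):
    """Apply actions in order, verifying after each, until the check passes.

    is_healthy: callable returning True once the network function has recovered.
    actions: ordered list of (name, callable) pairs, mildest first.
    Returns (names of actions attempted, whether recovery succeeded).
    """
    attempted = []
    if is_healthy():
        return attempted, True
    for name, act in actions:
        act()
        attempted.append(name)
        if is_healthy():  # verification step closes the loop
            return attempted, True
    return attempted, False  # ladder exhausted: hand over to engineers


# Simulated environment (assumption): the first action has no effect,
# the second one restores the service.
state = {"healthy": False}


def restart_vm():
    pass  # no effect in this simulation


def rebuild_vm():
    state["healthy"] = True


def migrate_vm():
    state["healthy"] = True


LEVELS = [
    ("level_1_restart_vm", restart_vm),
    ("level_2_rebuild_vm", rebuild_vm),
    ("level_3_migrate_vm", migrate_vm),
]

print(self_heal(lambda: state["healthy"], LEVELS))
# (['level_1_restart_vm', 'level_2_rebuild_vm'], True)
```

Ordering the actions from least to most disruptive means the more invasive steps run only when verification shows the milder ones did not work.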

The impact: more agility, lower costs, better network

This self-healing algorithm significantly improves the robustness, reliability, and availability of a critical layer: the virtual machines that underpin 5G services. Its implementation has reduced the mean time to resolution (MTTR) for this type of incident by 30 minutes, improving operational agility.

Marcelo Figlarz, Director of Network Operations at Vivo-Brazil, highlights the strategic value of artificial intelligence applied to operational processes:

“The intelligence that we are incorporating into our Operational Processes has the objective of streamlining and optimizing the work of our engineers. Initiatives such as the Self-Healing of the 5G Core, with this level of automation, capable of resolving faults completely autonomously, are essential to achieve our objectives of efficiency and operational quality.”

In addition, the system provides key visibility into network performance, facilitating strategic decisions on infrastructure investments. By optimizing computational resources, it also contributes to reducing operating costs.

Among the highlighted benefits are:

  • Increased reliability and availability, thanks to automation and reduced manual interventions.
  • Faster incident resolution, with real-time detection and correction.
  • Performance optimization, anticipating problems using AI algorithms.
  • Reduction of repetitive tasks, freeing up the operations team for higher-value tasks.

With this solution, VIVO reaches Level 4 autonomy, with a network capable of acting, adapting, and evolving intelligently, reducing errors, response times, and costs while improving the customer experience.
