Driving network incident resolution with AI agents and LLMs


Operating the network is becoming more complex every day because of the huge number of network elements and the massive volume of alarms they generate. Thanks to AI and ML, it is possible to correlate this information to find a common root cause and identify the origin of an issue more quickly. In addition, LLMs and chatbots make it possible to interact with the network and ask questions to get help with the analysis of alarms and incidents.

As part of the ANJ Program, which aims to bring autonomy and automation to the networks, Telefonica is implementing several initiatives based on LLMs and Gen-AI to obtain the information that operators need to analyze and resolve an issue.

AI in Operation Process: Agents for incident analysis

Using LLM-based agents that act as experts in a specific knowledge area makes it possible to build operational chatbots that operators can use to interact with the network in natural language and obtain the information needed to analyze or resolve incidents. This makes the operations process more agile: the agents analyze all the information using different techniques (e.g. RAG, vectorization, etc.) and give a quick and precise answer.

By using chatbots in operations, it is possible to obtain detailed information about an issue, whether it is related to other existing incidents, details about previous resolutions of similar failures, etc. Once orchestrators are integrated with agents, it is expected that intents and closed-loop actions can be executed directly on the network. Additionally, using AI to perform the analysis helps to identify the root cause of an incident efficiently.
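To make the idea of retrieving "previous resolutions of similar failures" more concrete, here is a minimal sketch of the retrieval step behind techniques such as RAG and vectorization. It is an illustration only and not the implementation used by any Telefonica system: the incident records, the function names (`vectorize`, `retrieve`, `build_prompt`) and the simple bag-of-words vectors are all assumptions made for this example. A production system would more likely use learned embeddings and a vector database.

```python
import math
import re
from collections import Counter

# Hypothetical closed tickets; in practice these would come from a ticketing system.
PAST_INCIDENTS = [
    {"id": "INC-001", "summary": "Link down on core router after fiber cut",
     "resolution": "Rerouted traffic and repaired the fiber"},
    {"id": "INC-002", "summary": "High CPU on firewall caused packet loss",
     "resolution": "Restarted the logging process"},
    {"id": "INC-003", "summary": "Power failure at base station site",
     "resolution": "Switched to backup generator"},
]


def vectorize(text):
    """Turn text into a bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, incidents, top_k=1):
    """Return up to top_k past incidents most similar to the query."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(i["summary"])), i) for i in incidents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop incidents that share no words at all with the query.
    return [incident for score, incident in scored[:top_k] if score > 0]


def build_prompt(query, incidents):
    """Assemble the text that would be handed to an LLM (the LLM call is omitted)."""
    context = "\n".join(
        f"- {i['id']}: {i['summary']} -> {i['resolution']}"
        for i in retrieve(query, incidents, top_k=2)
    )
    return f"Question: {query}\nSimilar closed incidents:\n{context}"
```

Calling `build_prompt("fiber cut on router link", PAST_INCIDENTS)` would place the most similar closed tickets and their resolutions next to the operator's question, which is the context an LLM would then use to draft its answer.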

Use Case in Germany: Troubleshooting Agent for incident analysis

Telefonica Germany is developing several initiatives based on LLMs and AI agents for incident analysis that can help NOC operators analyze and resolve incidents.

As part of this work, Telefonica Germany has developed the Network Operation Agent (NOA). With it, operators can query information about an incident and troubleshoot it. NOA can draw on information from a database, from documentation and even from the resolution of previous incidents. It is integrated with the ticketing system, from which it obtains information about the resolution of closed issues.

“NOA is not just a classic RAG‑based chatbot solution. It is an application that combines reasoning mechanisms with direct connections to live operational data and knowledge documents enabling a much deeper understanding of network situations. From the beginning, our success criteria were not purely technology‑driven — they relied on close, horizontal collaboration between Cloud experts, Data Protection officers, Security specialists, and, most importantly, the end users themselves. This cross‑functional approach was essential to building an agent that truly works in real network operations.”

Niklas Schleßmann, Group Lead Automation & Reporting at Telefonica Germany

This environment, designed to generate natural-language answers to user questions using Gen-AI and agents, will evolve into an autopilot that will make it possible to implement automations and closed loops for the operation phases. It improves and transforms the capabilities of NOC engineers when they need to troubleshoot an issue.

Using LLM assistants brings benefits to operational processes

Some of the benefits obtained by implementing these initiatives are:

  • Increase the productivity of technicians and engineers by implementing a Generative AI solution and optimizing resources.
  • Optimize analysis activities by performing inter-domain Root Cause Analysis of an issue, correlating in real time several pieces of information that affect services, users or network elements.
  • Reduce operational process times such as Mean Time to Detect (MTTD), Mean Time to Recover (MTTR) and Mean Time to Next Action (MTNA).
  • Accelerate and improve knowledge transfer between teams and units, and enhance the training process of NOC engineers.
  • Improve the levels of intelligence and autonomy of the network, moving closer to Level 4 of autonomy as defined by TM Forum.
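As an illustration of how time-based indicators such as MTTD and MTTR in the list above could be tracked, the following is a rough sketch that averages durations over a set of incident records. The records, the field names (`started`, `detected`, `recovered`) and the choice to measure both indicators from the start of the incident are assumptions for this example. Definitions vary between organizations, and MTNA is left out because its start and end events are not specified here.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records with ISO 8601 timestamps.
INCIDENTS = [
    {"started": "2024-01-01T10:00", "detected": "2024-01-01T10:10",
     "recovered": "2024-01-01T11:00"},
    {"started": "2024-01-02T08:00", "detected": "2024-01-02T08:20",
     "recovered": "2024-01-02T10:00"},
]


def minutes_between(start, end):
    """Elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def mean_time(incidents, start_field, end_field):
    """Average elapsed minutes between two events across all incidents."""
    return mean(minutes_between(i[start_field], i[end_field]) for i in incidents)


# Assumption: both indicators are measured from the moment the fault started.
mttd = mean_time(INCIDENTS, "started", "detected")
mttr = mean_time(INCIDENTS, "started", "recovered")
print(f"MTTD: {mttd:.1f} min, MTTR: {mttr:.1f} min")
```

Comparing these averages before and after introducing an assistant is one simple way to check whether the expected reduction in operational times is materializing.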

“The adoption of AI Agents and chatbots in initiatives such as NOA, where artificial intelligence is applied to critical processes like incident analysis, represents a significant step forward in advancing network autonomy for Telefonica. These capabilities bring us notably closer to achieving Level 4 Autonomous Network operations, as defined by TM Forum. This progress enables the continued scaling of network transformation efforts and network growth while maximizing business value, enhancing NOC operational efficiency, and strengthening overall service quality.”

Nilmar Seccomandi, Director of Autonomous Network and Infrastructure at Telefónica S.A.

With initiatives like these, Telefonica Germany is preparing for the network operation of the future, transforming ways of working and processes by applying AI and LLMs to enhance and optimize operations. This use case moves the company closer to Level 4 of autonomy, defined as the company target for 2030. It also provides an increase in intelligence and autonomy that makes it possible to meet the network challenges of the telecom industry and the new needs that users demand.
