Partner Features: As mobile operators expand their 5G footprint and gradually upgrade to 3GPP Release 18 and 5G-Advanced, it is essential that they move beyond traditional network operational practices to data-driven, AI-assisted operations (AIOps).
Interest in AIOps extends far beyond the mobile sector. Mordor Intelligence forecasts from October 2024 suggest that the AIOps market could grow from $27.24 billion in 2024 to $799.1 billion.
Today's 5G networks generate huge amounts of data. A network of 10 million subscribers can easily generate up to nine petabytes of data per day, so observability solutions can present operators with a potentially overwhelming volume of information. Historically, much of this data has not been collected in real time, and it is often not provided in standard or open formats that are easily consumed. After being deposited in a large data lake, it typically takes considerable time and effort to assemble the relevant data and carry out meaningful analysis.
In raw form, packet data, and even CDRs, can be too granular, unstructured, and noisy for immediate AI processing. AI can be very good at analyzing large amounts of data, but the value of the analysis is entirely dependent on the quality of the input data selected, and using AI itself to prepare huge amounts of raw data can be extremely expensive.
Curation at the source
To truly revolutionize operations, operators must deploy smart monitoring, data curation, and pipelines that enable cost-effective, real-time, scalable, and intelligent AI automation across the RAN, core, transport, and MEC layers.
Transformation pipelines should be used to normalize, enrich, and label data at its source, reducing noise and improving the quality of input into the AIOps environment. Pipelines can also be configured to ensure compliance by anonymizing sensitive fields.
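As a minimal sketch of such a source-side transformation step (field names, the latency threshold, and the labeling rule are illustrative assumptions, not a real pipeline schema), a record can be normalized, labeled, and have its IMSI anonymized before it ever reaches the AIOps store:

```python
import hashlib

def curate_record(raw: dict, salt: str = "rotate-this-salt") -> dict:
    """Normalize field names and units, anonymize sensitive identifiers,
    and attach a quality label at the source (illustrative sketch)."""
    latency_ms = float(raw["latencyMs"])
    return {
        # Normalize: consistent snake_case keys and numeric units.
        "cell_id": raw["cellId"],
        "latency_ms": latency_ms,
        # Anonymize: salted one-way hash keeps per-subscriber trends
        # analyzable while the raw IMSI never leaves the source.
        "sub_hash": hashlib.sha256((salt + raw["imsi"]).encode()).hexdigest()[:16],
        # Enrich/label: simple labels improve downstream training data.
        "label": "degraded" if latency_ms > 150.0 else "normal",
    }

record = curate_record({"cellId": "C-1042", "latencyMs": "212.5",
                        "imsi": "001010123456789"})
print(record["label"])  # degraded
```

In a production pipeline the threshold and labels would come from operator policy, and the salt would be managed and rotated by a secrets store.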
Telecom domain expertise is essential for contextualizing data from different sources into meaningful, actionable intelligence. Relevant data must be identified, collected, cleaned, validated, and labeled for model training. Human experts are also needed to design and train AI models and to provide high-quality feedback that corrects mistakes.
Tokenization: from raw to curated
A single raw 5G event record can contain up to 180 tokens. Through curation, this can be reduced to approximately 25 tokens (around an 85% reduction), dramatically cutting GPU usage and processing costs, especially in public cloud environments such as AWS Bedrock. Curation not only reduces compute usage but also reduces data lake storage requirements by eliminating redundant data, leading to more accurate results.
Once pipelined, curated data can be combined with data from other sources such as subscriber demographics, cloud infrastructure metrics, geospatial/environmental data, and even subscriber-based social media analytics. With the required domain intelligence applied, this combined data can be fed into AIOps algorithms to build a comprehensive overview and provide valuable insights that drive enhanced network performance and user experience.
Packet-level accuracy
Curated data must be extracted for each particular use case. Deep Packet Inspection (DPI), where sensors inspect the actual data payload in network traffic, can show exactly what was sent, when it occurred, and how systems across the stack responded. By enriching packet data with control-plane metadata and identifiers such as IMSI/SUPI, metrics can be provided at the per-cell, slice, handset, or subscriber level. This allows operators to train their AI systems with an accurate understanding of the network behavior associated with each subscriber.
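This kind of enrichment amounts to joining user-plane packet metrics against control-plane session context. A hypothetical sketch (TEIDs, identities, and latency values are invented for illustration) keyed on the GTP-U TEID:

```python
from collections import defaultdict

# Control-plane session context (learned from signaling): TEID -> identity.
sessions = {
    0x1001: {"supi": "sub-A", "cell_id": "C-7", "slice": "eMBB-premium"},
    0x1002: {"supi": "sub-B", "cell_id": "C-7", "slice": "default"},
}

# User-plane packet metrics from DPI sensors: (teid, latency_ms).
packets = [(0x1001, 38.0), (0x1001, 45.5), (0x1002, 12.0)]

# Enrich each packet metric with subscriber and cell context.
per_subscriber = defaultdict(list)
for teid, latency in packets:
    ctx = sessions.get(teid)
    if ctx:
        per_subscriber[(ctx["supi"], ctx["cell_id"])].append(latency)

# Per-subscriber, per-cell mean latency for the AIOps feed.
for key, vals in per_subscriber.items():
    print(key, round(sum(vals) / len(vals), 2))
```

The output is a compact, subscriber-attributed metric stream rather than raw packets, which is exactly the shape of data an AIOps model can learn from.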
Curated data feeds filter out low-value data and deliver enhanced subscriber insights to AIOps pipelines, which can uncover monetization opportunities by driving NPS improvements or strengthening SLA management. A single curated feed can be one-hundredth the size of the underlying raw data yet retain the maximum analytical value.
Reduce AI costs
NETSCOUT's Omnis AI Streamer was developed to provide curated, high-fidelity metadata derived from packet flows. It can be used to identify and correlate observability trends, streamline and automate data analysis, uncover historical operational patterns, and detect unexpected issues and security risks that could lead to future service outages or data breaches. Data streams can be tailored to a variety of use cases.
In operation, AI Streamer has been shown to achieve significant reductions in data volume, reduced GPU memory and processing time, and increased throughput from fewer GPU instances. It can support a number of use cases, including network optimization, predictive maintenance, real-time slice analytics, and digital twin applications.
The flexible feed configuration provided by tools such as Omnis AI Streamer allows operators to define feed "playbooks" specifying scheduling intervals, critical metrics, dimensions, and filters, so that only the required curated data is sent to the AIOps engine.
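Conceptually, such a playbook is a small declarative description of a feed. The sketch below shows the idea with an invented structure; the keys and values are illustrative assumptions, not Omnis AI Streamer's actual configuration schema:

```python
# Illustrative feed "playbook": interval, metrics, dimensions, filters.
playbook = {
    "name": "premium-slice-qoe",
    "interval_s": 60,                              # scheduling interval
    "metrics": ["quic_latency_ms", "retransmits"], # critical metrics
    "dimensions": ["slice_id", "cell_id"],         # group-by dimensions
    "filters": {"slice_id": "eMBB-premium"},       # only this slice
    "sink": "aiops-engine",                        # curated-feed destination
}

def matches(record: dict, filters: dict) -> bool:
    """True if a record passes every filter in the playbook."""
    return all(record.get(k) == v for k, v in filters.items())

print(matches({"slice_id": "eMBB-premium", "cell_id": "C-7"},
              playbook["filters"]))  # True
```

Records failing the filter never leave the source, which is where the volume and cost reductions described earlier come from.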
For example, the QUIC transport protocol latency metric can be specifically aggregated to monitor and manage the performance of dedicated 5G slices for premium YouTube users. A focused dataset can then be created for accurate troubleshooting, tailored to specific network parameters such as cells and nodes.
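A per-slice aggregation of that latency metric might look like the following sketch (slice names and latency samples are invented for illustration):

```python
import statistics

# Per-slice QUIC latency samples (ms) collected over one interval.
quic_latency = {
    "eMBB-premium": [21.0, 24.5, 19.8, 23.1],
    "default": [55.0, 61.2, 48.9, 70.3],
}

# Aggregate per slice: mean and worst-case latency for the AIOps feed.
for slice_id, samples in quic_latency.items():
    mean_ms = statistics.mean(samples)
    print(f"{slice_id}: mean={mean_ms:.1f}ms max={max(samples)}ms")
```

Comparing the premium slice's aggregate against the default slice makes a degradation on the dedicated slice immediately visible without shipping any raw packets.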
User-plane data is another source of valuable input for AIOps. Key data elements, such as TEID, QoS Flow ID, IP addresses, latency, and application signatures, can be applied to traffic pattern analysis, SLA breach detection, QoE estimation, and app-level performance monitoring.
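For instance, SLA breach detection over such user-plane elements can be as simple as screening enriched flow records against a contracted threshold (the threshold, flow tuples, and application signatures below are illustrative assumptions):

```python
SLA_LATENCY_MS = 50.0  # contracted latency bound (assumption)

# Enriched user-plane flow records: (teid, qfi, latency_ms, app_signature).
flows = [
    (0x1001, 5, 38.0, "video-stream"),
    (0x1001, 5, 72.5, "video-stream"),
    (0x1002, 9, 12.0, "web"),
]

# Flag any flow whose latency exceeds the SLA bound.
breaches = [f for f in flows if f[2] > SLA_LATENCY_MS]
for teid, qfi, latency, app in breaches:
    print(f"SLA breach: TEID={teid:#x} QFI={qfi} app={app} latency={latency}ms")
```

In practice the same screened records would feed QoE estimation and per-app performance dashboards rather than a simple print.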
Conclusion
High-quality, high-value, low-volume curated data is essential to the success of AIOps. High-quality analysis and filtering at the source make it possible to deliver "gold standard" curated data to AIOps systems. The curated-data approach also aligns with the TM Forum's Autonomous Networks initiative and unlocks new use cases and revenue streams for operators. With improved service quality, predictive maintenance, and increased security, curated data opens the door to true monetization.

