
Robust conformance testing framework is a must for AI-native cellular systems

It is fascinating to be actively involved in introducing AI in the wireless communications domain. In the past two years, we have been laying the foundation in 3GPP for introducing standardized AI solutions in the radio interface based on three concrete use cases: beam prediction, channel state information (CSI) feedback enhancements, and AI-based positioning. These use cases are categorized into one-sided and two-sided solutions, as illustrated in Figure 1.


Figure 1: One-sided and two-sided AI solutions in the mobile network air interface

In the data-driven world, devices and networks get smarter and adapt to the radio environment by collecting more data and learning from it. With all the potential and continuous algorithmic evolution unleashed by AI solutions, we also need to ensure predictable performance levels and behavior of the equipment supporting these new features. Since stochastic AI features behave differently from conventional deterministic algorithms, robust and future-proof conformance testing is pivotal to the adoption of AI in commercial networks.

At Nokia, we are committed to responsible AI. Our ambition is to navigate the so-called Control Paradox: applying AI appropriately and wisely to ensure confidence in future wireless networks. This requires the development of a novel, scalable and robust testing framework, while the AI technology itself is evolving rapidly.

Reliability starts with testability

While a mobile app animating old photos may not be expected to perform flawlessly, a sudden connection break in critical infrastructure can have severe consequences. Ensuring specified behavior and predictable performance of thousands of devices in ever-changing radio conditions, while undeniably challenging, is crucial for device and network vendors and mobile network operators.

Testability is one of the pillars of predictable network behavior, especially when it comes to air-interface solutions, inter-vendor interoperability, and the device ecosystem. Importantly, in the 3GPP standardization process, the introduction of any new radio feature is not complete until the requirements and test cases are defined for the products supporting it. Current testing methods are designed with deterministic analytical algorithms in focus and are inadequate for addressing the adaptive and stochastic nature of AI. Even if a feature conforms to the current requirements, this does not guarantee that, in the field, it will make reasonably efficient use of AI.

The enablers for robust AI air-interface solutions are testing and validation, monitoring and management, and automated updates and adaptation. While this cycle is almost non-existent (and not needed) for traditional non-AI features, it will be required for upcoming 5G-Advanced use cases. In 6G, the role of AI is expected to grow with the introduction of new scenarios relying on more frequent adaptation, updates and localization of the underlying AI models. In Figure 2, we highlight the role of a full AI-cycle testing approach for trusted and reliable AI air-interface solutions.


Figure 2: Stages of AI feature lifecycle management

AI testability challenges in 3GPP

Equipment manufacturers must follow many rules and regulations when bringing mobile wireless communications products to market. 3GPP-based conformance tests are a key component of the framework that equipment vendors must satisfy before a device is released into the field.

RAN4 and RAN5 are the working groups within 3GPP that are responsible for the design of requirements and tests. Conformance testing guarantees that the terminals and network equipment implement procedures and protocols according to the standard and that a specific minimum level of performance is achieved in reproducible conditions.

In 5G-Advanced, we cannot expect major changes yet in the existing 3GPP testing framework when introducing new AI-based features. But 3GPP has already identified several new aspects during a Release-18 study that must be addressed in a Release-19 work item and in future releases, as we show in Figure 3:

  1. Generalization and test coverage
  2. Management and monitoring
  3. Post-deployment handling
  4. Upgrade of the testing setup

Nokia will lead the study of new requirements and testability for Release-19 AI mobility, which will start in October 2024. 


Figure 3: Integration of AI features into RAN4 requirements and testing framework

Generalization and test coverage

In machine learning, generalization refers to the ability of a model to adapt to new, previously unseen data. In wireless communication, this translates to the device's ability to maintain a minimum level of performance across various network configurations, changing radio conditions, and new environments not explicitly considered during training.

To tackle generalization, we believe it is crucial to implement new requirements:

  1. Specify what level of degradation or drift can be allowed when the model generalizes to conditions that are atypical or different from those used in training

  2. Verify the model and functionality robustness with respect to dynamically changing radio conditions and network-device configurations during the test

For example, testing could ensure that the quality of beam predictions provided by the terminal does not decrease when different antenna panel configurations are implemented at the network side, or when the number of beams transmitted and measured for prediction changes. It is also important to test that AI-based CSI predictions follow the radio environment accurately when radio conditions vary or the type and parameters of the channel model in the test change.
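To make this concrete, the sketch below shows how such a generalization check could look in principle. It is a toy Python illustration, not a 3GPP-defined procedure: the predictor, the simulated radio conditions, and the allowed degradation margin are all invented for this example.

```python
# Toy sketch (not a 3GPP procedure): sweep a beam-prediction model over
# several hypothetical test configurations and verify that the accuracy
# drop relative to a baseline configuration stays within an allowed margin.
import random

random.seed(0)

def predict_best_beam(measured_rsrp):
    """Hypothetical stand-in for the device's AI beam predictor:
    here it simply picks the strongest measured beam."""
    return max(range(len(measured_rsrp)), key=lambda i: measured_rsrp[i])

def run_trials(num_beams, noise_db, trials=1000):
    """Simulate trials under one test configuration; return top-1 accuracy."""
    hits = 0
    for _ in range(trials):
        true_rsrp = [random.uniform(-110, -70) for _ in range(num_beams)]
        best = max(range(num_beams), key=lambda i: true_rsrp[i])
        noisy = [r + random.gauss(0, noise_db) for r in true_rsrp]
        hits += predict_best_beam(noisy) == best
    return hits / trials

ALLOWED_DROP = 0.10  # assumed maximum accuracy degradation vs. baseline

baseline = run_trials(num_beams=32, noise_db=1.0)
for cfg in [{"num_beams": 64, "noise_db": 1.0},   # different beam grid
            {"num_beams": 32, "noise_db": 4.0}]:  # harsher radio conditions
    acc = run_trials(**cfg)
    verdict = "PASS" if baseline - acc <= ALLOWED_DROP else "FAIL"
    print(f"{cfg}: accuracy={acc:.2f} ({verdict})")
```

The essential point is that the verdict is tied to the degradation relative to a baseline configuration, rather than to a single absolute score measured under one fixed condition.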

Management and monitoring

Life cycle management (LCM) introduces new operations to provision AI air interface features. Most of the LCM procedures are specific to the implementation, but the signaling elements and device behavior in response to LCM commands need to be tested. For example, we expect the latency of model and functionality activation and switching to be specified. Additionally, the latency of inference result reporting in each use case should be within specific limits to avoid negative impacts on performance.
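As a rough illustration of what such a latency requirement could look like in a test harness, consider the Python sketch below. The command names, the simulated device hook and the latency budgets are assumptions made for this example; they are not taken from any specification.

```python
# Illustrative sketch only: time a (simulated) LCM command and check it
# against an assumed latency budget. The budget values are placeholders,
# not numbers from any 3GPP specification.
import time

LATENCY_BUDGET_MS = {"activate": 50, "switch": 100, "deactivate": 50}

def send_lcm_command(command):
    """Hypothetical device-under-test hook; here we just simulate work."""
    time.sleep(0.01)  # stand-in for the device executing the command

def test_lcm_latency(command):
    start = time.perf_counter()
    send_lcm_command(command)
    elapsed_ms = (time.perf_counter() - start) * 1000
    budget = LATENCY_BUDGET_MS[command]
    verdict = "PASS" if elapsed_ms <= budget else "FAIL"
    print(f"{command}: {elapsed_ms:.1f} ms (budget {budget} ms) -> {verdict}")

for cmd in ("activate", "switch", "deactivate"):
    test_lcm_latency(cmd)
```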

In 5G-Advanced, AI features are introduced as enhancements over the existing non-AI ones. Therefore, a performance monitoring mechanism allowing fallback to an alternative or legacy solution is crucial in case an AI feature temporarily malfunctions. Essentially, this mechanism should follow well-defined requirements: it should be specified how long it may take to identify a problem and to apply corrective action, such as disabling or switching between AI functionalities.

Monitoring should be based on clearly defined and verifiable performance metrics. For example, the predicted best beams can be reported together with their associated probabilities. Alternatively, periodic comparisons of the predictions with the ground truth, such as actual beam measurements, could be used for monitoring the AI feature. Thus, in addition to the latency of procedures, the accuracy of performance metrics would be testable.
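The following toy Python loop ties these two ideas together: a sliding window of comparisons between predictions and ground-truth beam measurements drives a fallback decision. The window size, the accuracy threshold and the synthetic degradation are all hypothetical choices for this sketch.

```python
# Toy monitoring loop: compare periodic AI predictions against ground-truth
# beam measurements and fall back to the legacy (non-AI) feature when the
# windowed accuracy degrades. Thresholds here are assumptions, not specs.
from collections import deque
import random

random.seed(1)

WINDOW = 20               # number of recent prediction checks kept
FALLBACK_THRESHOLD = 0.7  # assumed minimum acceptable top-1 accuracy

recent = deque(maxlen=WINDOW)
ai_active = True

for step in range(100):
    predicted_beam = random.randrange(8)
    # Ground truth would come from an actual beam measurement sweep; here it
    # is synthetic, and the predictor is made to degrade after step 60 so
    # that the fallback path is exercised.
    hit_prob = 0.9 if step < 60 else 0.4
    measured_best = predicted_beam if random.random() < hit_prob else random.randrange(8)
    recent.append(predicted_beam == measured_best)
    if ai_active and len(recent) == WINDOW:
        accuracy = sum(recent) / WINDOW
        if accuracy < FALLBACK_THRESHOLD:
            ai_active = False
            print(f"step {step}: windowed accuracy {accuracy:.2f} below "
                  f"threshold, falling back to legacy beam management")

print("AI feature active at end of run:", ai_active)
```

Because both the window length and the threshold are explicit, the time to detect a problem and trigger corrective action becomes a testable quantity.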

Post-deployment handling

Updates to AI-enabled solutions are much more likely than updates to legacy algorithms, not least because of their dependency on training data. From time to time, there will be a need to accommodate changes in the radio environment, for example, to avoid model drift, or to enhance performance and/or power consumption thanks to advances in model architectures. We cannot expect the models to stay the same during the whole lifespan of the devices.

The challenge is that the devices that have passed the conformance tests with one AI model implementation will already be present in the field. Therefore, a mechanism for post-deployment validation should be introduced. It could be based either on thorough testing of the updated models before they are rolled out to the devices — an almost impossible task — or on on-device validation procedures that ensure that only verified AI-enabled algorithms can be activated in the device.
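As a sketch of what such an on-device validation gate might look like, the Python snippet below checks the integrity of a model update and runs a quick accuracy check against a bundled validation set before activation. The gating policy, the hash-based integrity check and the threshold are assumptions for illustration, not a standardized procedure.

```python
# Sketch of an on-device validation gate, assuming updates ship with an
# integrity hash and a small validation dataset. Illustrative only.
import hashlib

def integrity_ok(model_bytes, expected_sha256):
    """Only models whose payload matches the published hash may be loaded."""
    return hashlib.sha256(model_bytes).hexdigest() == expected_sha256

def performance_ok(model, validation_set, min_accuracy=0.8):
    """Quick on-device check against a bundled validation set."""
    correct = sum(model(x) == y for x, y in validation_set)
    return correct / len(validation_set) >= min_accuracy

def try_activate(model_bytes, expected_sha256, build_model, validation_set):
    if not integrity_ok(model_bytes, expected_sha256):
        return "rejected: integrity check failed"
    model = build_model(model_bytes)
    if not performance_ok(model, validation_set):
        return "rejected: validation accuracy too low"
    return "activated"

# Tiny usage example with a dummy 'model' that echoes its input.
payload = b"dummy-model-weights"
digest = hashlib.sha256(payload).hexdigest()
build = lambda _bytes: (lambda x: x)   # stand-in model factory
val_set = [(i, i) for i in range(10)]  # trivially correct pairs
print(try_activate(payload, digest, build, val_set))
```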

Upgrade of the testing setup

Finally, the testing setups themselves may introduce significant complexity. For example, beam prediction at the user equipment needs to be tested in realistic scenarios with many beams generated in the test chamber. In legacy tests, a few beams are sufficient, while an AI model may need up to 64 beams. In the case of AI-based CSI compression, the main challenge is the interoperability between encoders implemented by different device vendors and the test decoder deployed in the test equipment, which might cause even a good encoder design to fail. On the other hand, testing with oversimplified or fully known models might not reflect the performance achieved in real deployments. Nokia is working on approaches for full or partial specification of such test decoders for the CSI compression use case in 5G-Advanced, which will be used as a starting point for testing two-sided AI solutions in 6G.
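To illustrate the two-sided testing problem, here is a deliberately simplified Python sketch in which a vendor "encoder" compresses a CSI vector and a reference "test decoder", fixed by the test, reconstructs it; the test then scores the reconstruction quality. The compression scheme, the latent size and the pass threshold are invented for this example; real CSI compression uses learned neural encoders and decoders.

```python
# Simplified two-sided sketch: a vendor encoder must interoperate with a
# test decoder it did not design. The scheme and threshold are invented.
import math
import random

random.seed(2)

LATENT_DIM = 8         # assumed size of the compressed CSI report
MIN_COSINE_SIM = 0.6   # assumed conformance threshold

def vendor_encoder(csi):
    """Vendor implementation under test: keep the LATENT_DIM
    largest-magnitude coefficients as (index, value) pairs."""
    idx = sorted(range(len(csi)), key=lambda i: abs(csi[i]), reverse=True)
    return [(i, csi[i]) for i in idx[:LATENT_DIM]]

def test_decoder(report, dim):
    """Reference decoder fixed by the test specification: re-insert the
    reported coefficients, zeros elsewhere."""
    out = [0.0] * dim
    for i, v in report:
        out[i] = v
    return out

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

csi = [random.gauss(0, 1) for _ in range(32)]
recon = test_decoder(vendor_encoder(csi), len(csi))
sim = cosine_similarity(csi, recon)
print(f"cosine similarity {sim:.2f} ->",
      "PASS" if sim >= MIN_COSINE_SIM else "FAIL")
```

Even in this toy form, the design tension is visible: if the test decoder is too simple, a good encoder may fail unfairly; if it is too well known, the test may not reflect real deployments.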

Towards reliable AI in 6G

The requirements and testing framework for the AI-powered air interface in 5G-Advanced is the first important step toward a reliable and trusted implementation of AI in the AI-native air interface of the upcoming 6G.

6G will bring new testing and interoperability challenges for AI use cases, such as optimized mobility, random access, adaptive modulation and coding, and power control. It is paramount to have from the start a scalable and consistent testing framework designed for the new sustainable 6G AI solutions, built on the expertise gained during 5G-Advanced. This new testing framework has to be future-proof and support all upcoming releases of 6G. At Nokia, we are diligently working to be ready for this future by driving the enablers for robust testing and validation of AI air-interface solutions in 3GPP.


About Dr. Dimitri Gold

Dr. Dimitri Gold is a Senior Staff Standardization Specialist at Nokia Standards in Espoo, Finland. He is the lead 3GPP delegate working in the RAN4 WG, currently focusing on conformance testing of new AI/ML features. Dimitri has over 15 years of experience in wireless network research and development and is (co-)author of numerous academic publications and inventions.

Connect with Dimitri on LinkedIn


About Fahad Syed Muhammad

Fahad Syed Muhammad is a Senior Staff Standardization Specialist at Nokia Standards in Paris, France. He has several years of experience in wireless communication on aspects related to both product development and standardization. His focus areas include AI/ML, radio resource management and radio performance optimization.

Connect with Fahad on LinkedIn


About István Z. Kovács

István Z. Kovács, Senior Research Engineer, currently works on standardization for machine learning-driven radio-resource management and radio-connectivity enhancements for 5G and 6G systems.

Connect with István on LinkedIn
