Knowledge Transfer in Deep Reinforcement Learning for Slice-Aware Mobility Robustness Optimization

28 October 2021

The legacy mobility robustness optimization (MRO) in self-organizing networks aims to improve handover performance metrics by optimizing cell-specific or cell-pair-specific handover parameters. However, such solutions cannot satisfy the needs of next-generation networks with network slicing, because they only guarantee the continuity of the received signal strength, not the per-slice service quality. To provide truly seamless mobility service, we propose a deep reinforcement learning-based slice-aware MRO (SAMRO) approach, which optimizes slice-aware handover performance by introducing slice-specific handover parameters in a multi-cell scenario. Moreover, to avoid extensive online exploration, we develop a two-step transfer learning scheme: 1) regularized off-policy reinforcement learning, and 2) effective online fine-tuning with mixed experience replay. System-level simulations show that, compared with the legacy MRO algorithms, SAMRO significantly improves service continuity for all slices while optimizing handover performance. Furthermore, the proposed transfer learning scheme enables effective online fine-tuning with faster training and higher sample efficiency.
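As a rough illustration of the second step, the sketch below shows one way a mixed experience replay buffer could combine pre-collected (offline) transitions with transitions gathered during online fine-tuning. The abstract does not specify the buffer structure or the mixing ratio, so the class name `MixedReplayBuffer`, the `online_fraction` parameter, and the tuple layout of a transition are all hypothetical choices made for this example, not the authors' implementation.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical buffer mixing offline and online transitions in each minibatch."""

    def __init__(self, offline_transitions, online_capacity=10_000, seed=0):
        # Offline data is fixed: collected before online fine-tuning starts.
        self.offline = list(offline_transitions)
        # Online data is a bounded FIFO: the oldest transitions are dropped first.
        self.online = deque(maxlen=online_capacity)
        self.rng = random.Random(seed)

    def add_online(self, transition):
        """Store one transition observed during online fine-tuning."""
        self.online.append(transition)

    def sample(self, batch_size, online_fraction):
        """Draw a batch with roughly `online_fraction` of it from online data.

        If too few online transitions exist yet, the shortfall is filled
        from the offline data (assumed behavior, chosen for this sketch).
        """
        n_online = min(round(batch_size * online_fraction), len(self.online))
        n_offline = min(batch_size - n_online, len(self.offline))
        batch = self.rng.sample(list(self.online), n_online)
        batch += self.rng.sample(self.offline, n_offline)
        self.rng.shuffle(batch)
        return batch


# Usage: transitions are tagged by origin so the mix is easy to inspect.
offline_data = [("offline", i) for i in range(100)]
buffer = MixedReplayBuffer(offline_data)
for i in range(10):
    buffer.add_online(("online", i))
batch = buffer.sample(batch_size=8, online_fraction=0.25)
```

In a sketch like this, `online_fraction` could be increased over the course of fine-tuning so that the agent relies progressively less on the transferred data; whether the proposed scheme does so is not stated in the abstract.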