Every month brings headlines about another traffic misrouting. Just this November, we read about the mysterious routing of domestic US internet traffic through China, and about Google services that went down for more than an hour because their traffic was mistakenly rerouted. These and other issues keep service providers busy responding to subscriber complaints, whether it’s gamers complaining about lag or customers suffering video streaming interruptions. Trillions of dollars of emerging new business now depend on the internet. We need to get smarter about how we operate and manage it.
One of the murkier but critical areas of the internet is peering. A normal path, from sender to receiver, might pass over 10 separately owned and operated networks. The points where these networks intersect are called peering points. Traditionally, the traffic flows between networks at peering points have been fairly symmetric and predictable, because messaging, web browsing, social media and e-commerce are all relatively light and steady.
Streaming video, by contrast, is very lumpy and bandwidth-intensive. Netflix, which accounts for 15 percent of global internet traffic today, can by itself disrupt traffic flows with the release of a single, highly anticipated episode or series. Similar things happen when Epic releases an update to Fortnite, or a cat video goes viral on YouTube. The other big disruptor, of course, is security events such as DDoS attacks, which are the most difficult to predict.
Peering points can be overwhelmed by these very large, volatile flows. The result is congestion, dropped packets and poor quality of experience (QoE) for subscribers. End users tend to blame their internet service provider and pick up the phone to call its hotline.
In response, service providers manually move traffic to less congested peering points. Unfortunately, this has proven over and over again to be both too slow and error-prone; some of the biggest internet incidents in recent years have been caused by manual configuration errors.
The first step in addressing these issues is to automate the redirection of traffic between peering points wherever possible. In order to automate, however, service providers need analytics that offer better insight into traffic. Traditionally, they have analyzed traffic with a basic approach based on source and destination IP addresses. To deal with today’s traffic reality and achieve better peering engineering, they need to understand the type of traffic flowing to, from and through their network. Detailed analytics that provide end-to-end context are essential for understanding which applications or services are affected by a disruption or are causing it. Combined with automation, such analytics lead to more effective operation of the peering network than today’s approach.
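To make the idea concrete, here is a minimal sketch in Python of how application-aware analytics could feed an automated steering decision. It is an illustration under stated assumptions, not a description of any vendor's product: the `Flow` record, the `plan_moves` function, the 80 percent utilization threshold and the peering point names are all hypothetical, and a real system would act on live flow telemetry and routing policy rather than a static list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Flow:
    app: str            # application label supplied by analytics (hypothetical)
    peering_point: str  # peering point the flow currently crosses
    mbps: float         # measured rate in Mbit/s


def plan_moves(flows, capacity_mbps, threshold=0.8):
    """Suggest moving the heaviest application off each congested peering point.

    A point counts as congested when its load exceeds threshold * capacity.
    The threshold value is an assumption chosen for illustration.
    """
    # Total load per peering point.
    load = defaultdict(float)
    for f in flows:
        load[f.peering_point] += f.mbps

    moves = []
    for point in sorted(capacity_mbps):
        if load[point] <= threshold * capacity_mbps[point]:
            continue  # not congested, nothing to do

        # Aggregate by application: this is the step that plain
        # source/destination IP analysis cannot provide.
        by_app = defaultdict(float)
        for f in flows:
            if f.peering_point == point:
                by_app[f.app] += f.mbps
        app, app_mbps = max(by_app.items(), key=lambda kv: kv[1])

        # Only consider targets that stay under the threshold after the move.
        candidates = [
            p for p in capacity_mbps
            if p != point
            and load[p] + app_mbps <= threshold * capacity_mbps[p]
        ]
        if not candidates:
            continue  # no safe target, leave the decision to an operator

        # Prefer the target with the most spare capacity.
        target = max(candidates, key=lambda p: capacity_mbps[p] - load[p])
        moves.append((app, point, target))
        load[point] -= app_mbps
        load[target] += app_mbps
    return moves


flows = [
    Flow("video", "A", 70.0),
    Flow("web", "A", 20.0),
    Flow("web", "B", 30.0),
    Flow("gaming", "C", 10.0),
]
capacity = {"A": 100.0, "B": 100.0, "C": 200.0}
print(plan_moves(flows, capacity))
```

In this example, peering point A carries 90 Mbit/s against a capacity of 100, so the sketch proposes moving the video traffic to C, the only point that can absorb it without becoming congested itself. The check on the target is the important design choice: an automated system that shifts load without it would simply relocate the congestion.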
The stakes are only getting higher. Gamers and video streamers today, IoT networks and autonomous cars tomorrow. Increasingly mission-critical processes are making their way onto public and hybrid private-public networks. Turning the internet into a mission-critical, dynamic and globally scalable network will require, among many other changes, better analytics for insight-driven and automated steering of traffic flows, which will enable better QoE through dynamic traffic engineering at peering routers and network interconnections.