Building IP networks for today’s internet
In the early decades of the internet, the scale and pace of traffic growth were almost overwhelming, focusing the entire industry on bandwidth. This led to what I call the seduction of big numbers: vendors and operators convinced themselves that meeting demand with faster processors and greater throughput was all that really mattered.
As long as best-effort networks had bandwidth to spare, there was no need to worry about the lack of traffic engineering and QoS. That changed when traffic grew alongside the need for reliable, predictable quality, which led us to IP/MPLS. Through major inflections such as MPLS and Voice over IP, the internet has continued to invent and integrate many new services.
Today the global network is increasingly business-critical and even mission-critical. With Industry 4.0 and 5G promising to digitally transform our most basic infrastructure, the existential question for all industries is: “Can I get the service that I want, with the quality and security I want, at the time I want it?” And the timeframe for decision-making is not months, but days, hours, minutes, seconds.
COVID-19 as prompt
As we examine the growth of the internet, we realize that the inexorable march of progress has rested on understanding traffic models and controlling the introduction of services: in other words, progress on our terms. COVID offered a wakeup call of a different order, a singularity that prompts a rethink of how we build systems and networks. Indeed, we should respond to COVID not just out of fear of a resurgence but as a foreshadowing of shocks to come, building a sustainable, secure, and scalable internet. One that remains resilient in the face of the unexpected.
Built for black swan events
We must start by accepting that we will never completely get a handle on the bigger internet. We’ll never be able to entirely predict what’s coming next. We must build resilience into our networks using principles like extensibility, agility and security. We need to focus on all three major pillars – silicon, systems, software – and wrap automation and analytics around them. Our goal is to master the unexpected, and sail effortlessly through the next black swan event, the next service that strains the imagination, the next cyber-assault that endeavors to disrupt.
Single line of software
At Nokia, we believe in write once, leverage many. Our flagship Network Operating System (NOS) is a single line of software called SR OS that spans all our routers, starting from low-end CPE routers to the core of the network. This enables us to provide a common look-and-feel, code that is well-tested against different deployment models, and common management tools. Consequently, our customers find they can manage, automate, and operate these routers seamlessly. Furthermore, they find the consistency of features allows for greater flexibility, mix-and-match, and end-to-end service deployment.
Family of silicon
Reusability of design also plays to our strength in hardware. With over a million routers in operation, we can’t afford to render the whole platform obsolete every time we develop a new generation of network processor. The refresh cycle of hardware can be every 3 to 4 years, but the refresh cycle of deployed routers can be up to 10 years. We build our hardware expecting reuse. A line card with the FP5 on it slides into the same chassis that hosts FP4 line cards. Service providers don’t have to rethink the space, the power, the cooling, the transformers, or the power supply units. This requires a tremendous feat of engineering because each new ASIC can deliver up to 4x the capacity of the previous generation. Imagine upgrading your network to 4x the capacity, without rebuilding a single data center!
Agility in architecture
When imagining new services or introducing new functions and features, operators have traditionally handled them in massive waterfall updates. We must harness something webscalers have already mastered – agility. It is the ability of humans to pivot quickly that enables us to develop new ideas and responses to unexpected shifts in the environment – whether it be a pandemic or simply a market trend sensed by our ecosystems of suppliers, partners, developers, and customers. Agility needs both architectural support and tools.
When ideas emerge, how do we actually deploy them in a network quickly? Machines are very good at being told what to do. If you get the code right, then the machine executes it correctly, with a steady, repeatable regularity that is not possible with humans. Our intent-based network management tools enable you to treat the network as a programmable entity. First, you define an intent, i.e., a model for a new service, described in a high-level modeling language. Then, you “compile” that intent into low-level instructions that are configuration statements in one or more network nodes. Finally, you supply a set of parameters that describe the “goodness” of the intent, parameters that bound the behavior of the service so that, by monitoring them, we can measure the service’s performance. The intent engine can even be coded to recognize anomalies and adjust behavior in the face of perturbations such as overload, failure, or maintenance events.
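The intent-to-configuration flow above can be sketched in a few lines of Python. This is a minimal illustration, not Nokia’s actual intent engine: the intent model, the generated CLI stanzas, and the latency bound are all hypothetical stand-ins for a real modeling language and compiler.

```python
from dataclasses import dataclass

# Hypothetical intent model: the class, fields, and generated CLI lines
# are illustrative only, not a Nokia API or real SR OS configuration.
@dataclass
class L3VpnIntent:
    name: str                 # service identifier
    sites: list               # router hostnames participating in the service
    max_latency_ms: float     # a "goodness" bound used for assurance

def compile_intent(intent: L3VpnIntent) -> dict:
    """'Compile' the high-level intent into per-node configuration stanzas."""
    configs = {}
    for site in intent.sites:
        configs[site] = [
            f"service vprn {intent.name} create",
            "    interface to-ce create",
            "    exit",
        ]
    return configs

def assure(intent: L3VpnIntent, measured_latency_ms: float) -> bool:
    """Check a live measurement against the intent's goodness parameter."""
    return measured_latency_ms <= intent.max_latency_ms

vpn = L3VpnIntent(name="acme", sites=["pe1", "pe2"], max_latency_ms=20.0)
node_configs = compile_intent(vpn)
print(node_configs["pe1"][0])   # the first stanza pushed to node pe1
print(assure(vpn, 12.5))        # within bounds
print(assure(vpn, 30.0))        # violation: trigger remediation
```

In a real system, the `assure` step runs continuously against telemetry, and a violation feeds back into the engine so it can reroute or rescale the service rather than just report it.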
Another key tool in the NetOps (DevOps as applied to networking) engineer’s toolbelt is the digital twin, a metaverse that models the live network, in which new services can be tried out. The inhabitants of the metaverse are software entities that mimic the behavior of network elements, running the same code, enjoying the same connectivity, but perhaps not at the same scale of operation. Deployment of new services usually takes some trial and error, and NetOps engineers need an environment to rapidly model services.
Infrastructure as code
This approach, often called “Infrastructure as Code”, takes a page from the webscalers’ playbook to empower the next-generation NetOps engineer: one skilled in writing high-level service definitions, deploying services rapidly and correctly the first time, and running regression tests in a digital twin of the network before unleashing those services on the live network.
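The render–test–deploy loop described above can be sketched as follows. Again, a minimal illustration under stated assumptions: the function names are hypothetical, and the digital twin is mocked as a simple validation check rather than a full network emulation.

```python
# Illustrative Infrastructure-as-Code pipeline; all names are hypothetical
# stand-ins, not a specific vendor toolchain.

def render(service: dict) -> str:
    """Render a high-level service description into device configuration."""
    return f"service vpls {service['name']} customer {service['customer']} create"

def validate_in_twin(config: str) -> bool:
    """Regression-test the candidate config against a digital twin.
    Here the 'twin' is mocked as a trivial syntax check; in practice it
    would be software replicas of the routers running the same code."""
    return config.startswith("service ")

def deploy(service: dict) -> str:
    """Gate deployment on the twin: only validated configs reach the network."""
    config = render(service)
    if not validate_in_twin(config):
        raise ValueError("twin validation failed; aborting deployment")
    return config  # in practice: pushed to the live network via an API

print(deploy({"name": "svc-100", "customer": "1"}))
```

The key design point is the gate: the live network only ever receives configuration that has already passed through the twin, which is what makes rapid iteration safe.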
Master the unexpected
Our world seems to have entered a period of cascading events, and I am convinced that the network can be one of the tools that will help us to respond to these challenges — if it is designed and prepared to master the unexpected.