Towards Machine Learning in Networking: Benefits Begin Now

by asardell on 03-13-2017 07:50 AM - last edited on 03-13-2017 09:13 AM by Community Manager

Last Thursday, Packet Pushers featured a thought-provoking podcast on the applicability of machine learning (ML) to networking. David Meyer, Chief Scientist, VP and Fellow at Brocade, discussed the challenges, opportunities and realities.

 

ML has major applicability in IT automation, turning relevant data into actionable software; in networking, ML can help ensure optimal configuration on the management and control planes. Briefly, ML achieves this by analyzing the structure of collected data to find patterns you didn’t know were there.

 

Can This Benefit DC Network Operators Today?

 

What’s important to IT and data center networking is:

 

  • Are there operational benefits to the small steps we can take from where we are today?
  • What can we do now to start advancing the cause towards vastly improved network automation? 

The answer to the first question is a definite Yes: there is already research (as well as some concrete examples) on using neural networks to anticipate or counter DDoS or data exfiltration attacks.

 

To see where we might take this further, even in the near term, consider the kinds of things that network engineers learn, even heuristically, and then consider whether they might be generalized and learned by an application. For instance, you might be able to identify combinations of conditions that contribute to congestion in the network:

 

  • Time of day or month
  • Business or maintenance cycle
  • User behaviors or current events 

Then you can anticipate and avert these delays before they happen again.
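
As a rough illustration of the idea above, the sketch below counts how often congestion was observed under each combination of conditions and flags the combinations where it was common. It is a plain frequency table rather than a neural network, and the function names, the (hour, maintenance window) features, and the sample history are all hypothetical.

```python
from collections import defaultdict

def congestion_rates(history):
    """Estimate how often congestion occurred under each combination of conditions.

    history: iterable of (hour_of_day, in_maintenance_window, was_congested).
    Returns {(hour_of_day, in_maintenance_window): fraction of congested samples}.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [congested samples, total samples]
    for hour, maintenance, congested in history:
        bucket = counts[(hour, maintenance)]
        bucket[0] += int(congested)
        bucket[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

def risky_conditions(rates, threshold=0.5):
    """Return the condition combinations whose congestion rate meets the threshold."""
    return sorted(key for key, rate in rates.items() if rate >= threshold)

# Hypothetical labeled observations, invented for illustration only.
history = [
    (9, False, True), (9, False, True), (9, False, False),
    (14, False, False), (14, False, False),
    (2, True, True), (2, True, True),
    (2, False, False),
]

rates = congestion_rates(history)
risky = risky_conditions(rates)
print(risky)
```

With real data you would likely add more conditions (business cycle, current events) and need many more samples per combination before trusting the rates.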

 

Ultimately, ML can be pivotal in the continuous correlation of network state, providing end-to-end and up-to-date models of traffic, topology and statistics for all interfaces. In the medium term (next 5 years), this can help to fulfill the proactive, self-configuration requirements of intent-based networking.

 

How Do We Get There from Here?

 

Machine learning, says Meyer, “is about the construction and study of systems that can learn from data.” This is very different from traditional computer programming, as shown in Figure 1.

 

ML Pic.png

Figure 1: Traditional Programming and Machine Learning (Source: David Meyer--See Links Below)

 

At a high level, you map collected data (inputs) to known, observed behavior (outputs) and produce software that models that relationship and predicts what to expect from new data. ML systems “train” themselves: they learn how to make inferences from new knowledge as they accumulate it.
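
To make the contrast concrete, here is a minimal sketch that assumes a single input (link utilization) and a yes/no label (whether packet loss was observed). In the traditional version an engineer hard-codes the threshold; in the learned version the threshold is chosen from labeled samples. The names, the 0.90 rule, and the sample data are hypothetical, and a one-threshold model is only the simplest possible stand-in for "learning from data."

```python
def hand_written_rule(utilization):
    """Traditional programming: an engineer picks the threshold in advance."""
    return utilization > 0.90

def learn_threshold(samples):
    """Learn a threshold from labeled data.

    samples: list of (utilization, loss_observed) pairs with at least two
    distinct utilization values. Tries the midpoint between each pair of
    neighboring values and keeps the one with the fewest training errors.
    """
    values = sorted({u for u, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum((u > threshold) != label for u, label in samples)

    return min(candidates, key=errors)

# Hypothetical labeled samples: (link utilization, packet loss observed?)
samples = [
    (0.20, False), (0.45, False), (0.60, False),
    (0.72, True), (0.80, True), (0.95, True),
]

threshold = learn_threshold(samples)
print(round(threshold, 2))
```

On this made-up data the learned threshold lands between 0.60 and 0.72, while the hard-coded 0.90 rule would miss the loss seen at 0.72 and 0.80.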

 

Clearly, one of the key drivers for successful ML is data collected from an active underlay, and networking is of course replete with it. The key is to find patterns in that data and understand it well enough to classify it in a meaningful way and make inferences for better decisions.

 

But is Networking Data Ready to Support This?

 

Successes in ML have come in pattern recognition, natural language processing, and robotics, and working with standardized data sets has been a crucial factor in these advancements. With a labeled dataset (for instance, a carelessly scripted “4” is labeled “Four” so that related handwriting is recognized accordingly) you can embark on supervised learning.

 

Supervised learning is typically more straightforward than unsupervised learning, because without labels it’s more difficult to evaluate the results. But unsupervised learning holds the greater potential for sophisticated work such as classifying unlabeled flow data.
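
One common unsupervised technique is k-means clustering, which groups records by similarity without any labels. The sketch below is a plain-Python version with a deterministic starting point, applied to hypothetical flow records described by two assumed features (kilobytes transferred, duration in seconds). Real flow data would need feature scaling and a considered choice of k; the ML libraries mentioned in this post provide more robust implementations.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length numeric tuples.

    Returns (centroids, assignment), where assignment[i] is the cluster index
    of points[i].
    """
    # Deterministic start: spread the initial centroids across the sorted points.
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment

# Hypothetical unlabeled flows: (kilobytes transferred, duration in seconds)
flows = [(2, 1), (3, 1), (2, 2), (900, 60), (950, 62), (1000, 58)]

centroids, assignment = kmeans(flows, k=2)
print(assignment)
```

Here the short, small flows and the long, large flows fall into separate clusters. Deciding what those clusters mean is the part that remains hard to evaluate.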

 

And although we have all this data, most of what we collect (from SNMP, syslog, NetFlow, and other mechanisms) is not only proprietary but also very incomplete and often filled with spurious noise. These sources were not built for ML, so we’ll either need ways to standardize them or create new algorithms to deal with them (or both).
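
As a small example of the cleanup this implies, the sketch below turns raw polled interface octet counters into per-second rates. It assumes a 32-bit counter that wraps at 2^32, treats missed polls as None, and drops rates above an assumed plausibility bound (for example, after a counter reset). The function name, the max_rate parameter, and the sample data are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumes a 32-bit counter that wraps to zero

def octet_rates(samples, max_rate=None):
    """Convert polled counter samples into (timestamp, octets per second) pairs.

    samples: list of (timestamp_seconds, counter_value or None for a missed poll).
    max_rate: optional plausibility bound; faster rates are dropped as noise.
    """
    rates = []
    previous = None
    for timestamp, counter in samples:
        if counter is None:
            continue  # missed poll: the next rate simply spans the gap
        if previous is not None:
            prev_time, prev_counter = previous
            elapsed = timestamp - prev_time
            if elapsed > 0:
                # Modulo arithmetic recovers the true delta across one wrap.
                delta = (counter - prev_counter) % COUNTER32_MODULUS
                rate = delta / elapsed
                if max_rate is None or rate <= max_rate:
                    rates.append((timestamp, rate))
        previous = (timestamp, counter)
    return rates

# Hypothetical polls: a counter wrap, a missed poll, then a reset to a low value.
samples = [
    (0, 4294967196),
    (60, 5900),
    (120, None),
    (180, 17900),
    (240, 100),
]

rates = octet_rates(samples, max_rate=1_000_000)
print(rates)
```

The wrap between the first two polls and the gap at 120 seconds both yield sensible rates, while the reset at 240 seconds produces an implausible value and is discarded.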

 

How Quickly Will ML Advance in DC Networking?

 

In networking, we are only in the early stages of exploring ML. But for many (including Meyer) this is a great fit for transformative change in networking.  It’s a major leap beyond the slower pace of early-days protocol innovations (from hop counts to distance vectors and link states), predictable boosts in interface speeds, etc. 

 
There’s a good bit of applied math involved, including linear algebra, probability theory and statistics, multivariate calculus, and specialized algorithms. However, network engineers need conceptual understanding rather than mastery, as much is already codified in software libraries such as TensorFlow, Torch, or scikit-learn. And objects in these libraries can be invoked from event-driven workflows in various programming languages, or in StackStorm.
 

With continued momentum, Dave even says that machine learning may skip the Trough of Disillusionment on the Gartner hype cycle (Figure 2).

 

Gart.jpg

Figure 2: From Peak of Inflated Expectations, ML Could Advance to Slope of Enlightenment (Source: Gartner)

 

Brocade’s DC Infrastructure in Support of ML 

 

As network automation becomes more essential in the data center, and more reliant on the mountains of data that networks have always generated, formalized ways to analyze and predict behavior are increasingly being studied and applied. Inevitably we’ll see more accurate models built from visualizations gleaned from the underlay, figuring out the right level of abstraction to provision, secure, and operate networks.

 

Visibility innovations such as SLX Insight Architecture and SLX Visibility Services are facilitating this ability in the data center, particularly in support of DevOps-style workflow automation such as Brocade Workflow Composer.
  

Machine Learning Links

 

See the following links for more information on ML:

 

 

 

 

 

 

Comments
by scet.amit
on ‎03-13-2017 01:27 PM

According to you, what are the low-hanging fruits in terms of application of ML to networking?

by asardell
‎03-13-2017 01:56 PM - edited ‎03-13-2017 02:29 PM

Hi scet.amit - 

 

Security, particularly DDoS, is an application where this is already happening. David mentioned it in the interview, and there is also code posted in a link from the blog.

 

I think that small (but profitable) steps can be taken by labelling collected flow/config data for better prediction (e.g., seeing if you can pin down congestion times and re-route ahead of time). There are routines in the ML libraries linked above that can help analyze this data already. It is all about the quality of what you collect and how well you can collect and use it. 

 

Thus, the lowest-hanging fruit (until we get to new algorithms for unsupervised learning) is to look at the heuristics that engineers are applying today and see how they can be matched to the data (SNMP, counters, port mirrors, etc.) you are already collecting.

 

Alan