Why WAN Metrics Are Not Enough in SD-WAN Policy Enforcement

When it comes to measuring WAN metrics, most engineers look at the standard statistics of loss, latency, jitter, and reachability to determine path quality. This is good information for a routing protocol that is making decisions about packet flow at layer 3 of the OSI model. However, it is incomplete from the perspective of the overall user experience. For an SD-WAN solution to provide materially better value than a typical packet router, it must look beyond the metrics the router considers.

Aren’t SD-WAN Devices Essentially Routers?

SD-WAN devices shouldn’t be considered routers in the conventional sense. Routers use local tables and algorithms such as Dijkstra’s to determine the shortest path to a destination for a packet. The term packet is important here. It is all that the router cares about. By definition, a router is a device that functions at layer 3 to deliver packets to their destination network. When there is a problem, the router processes the topology change and computes new routing table entries that are a point-in-time decision about the available paths. These topology changes take time to process, which can cause packet loss, latency, and jitter for anything traversing that segment. That is why we use those measurements to determine the health of our legacy networks.

SD-WAN functions on a completely different paradigm. Rather than relying on distributed logic and archaic protocols that propagate topology changes across a network, it uses centralized logic that looks at the network as a whole, paired with a distributed forwarding plane that makes real-time decisions based on quality metrics. SD-WAN creates an overlay network that, while understanding the topology, abstracts the configuration of connectivity between locations and endpoints away from the lower-level WAN routing protocols. In essence, SD-WAN delivers application flows from a source to a destination based on the configured policy and the best available network path. This is the central idea behind how SD-WAN technologies should work.

Let’s say that you have two circuits going from a remote office back to your data center. One of the circuits is a 10 Mbps MPLS link and the other is a business Internet link with 50 Mbps download speed and 20 Mbps upload speed. You have critical traffic being sent via the MPLS link. Your typical WAN metrics say that the best path is the MPLS circuit, with 1% packet loss, 65 ms latency, and 2 ms jitter. The Internet link has 1% packet loss, 80 ms latency, and 2 ms jitter. You would say that the MPLS link is functioning better. Remember, though, that most routers take these measurements on a 1-minute or 5-minute cycle, so anything that happens between polls is averaged away. You’re not getting all the information.
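To see how a polling cycle can hide a problem, here is a minimal Python sketch. The per-second loss samples are invented for illustration (they are not measurements from any real circuit), and the helper name `worst_window` is hypothetical. Both links average 1% loss over the minute, as in the example above, but one of them gets there through a short, severe burst.

```python
from statistics import mean


def worst_window(samples, window):
    """Return the highest average seen over any run of `window` consecutive samples."""
    return max(
        mean(samples[i:i + window])
        for i in range(len(samples) - window + 1)
    )


# Hypothetical per-second packet-loss percentages over one 60-second polling cycle.
internet = [1.0] * 60                           # steady 1% loss every second
mpls = [0.0] * 30 + [20.0] * 3 + [0.0] * 27     # clean, except a 3-second burst at 20%

# The one-minute average is what a coarse poll would report.
print("1-minute average:", mean(internet), mean(mpls))

# A 3-second sliding window exposes what the average hides.
print("worst 3-second window:", worst_window(internet, 3), worst_window(mpls, 3))
```

Under these assumed numbers the one-minute averages are identical, while the worst 3-second window on the MPLS link is twenty times worse than on the Internet link. A burst like that is long enough to be heard on a voice call, yet it never shows up in the polled statistic.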

SD-WAN devices constantly monitor, in real time, the application flows going across your network. They also make adjustments in real time to compensate for issues encountered not only within the WAN links, but also within the applications, based on how they perform while traversing those links. An SD-WAN device can see both the MPLS and the Internet circuit and understands how each application performs over those links. It knows when there is an intermittent problem on the MPLS circuit that will interfere with your VoIP traffic or degrade the performance of your SaaS app by delaying transaction response, because it has measured those effects. It can choose, for example, to re-route application flows based on those metrics, or it can balance flows across both paths simultaneously and reassemble them at the destination.
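The re-routing decision described here can be sketched as a per-application policy check. This is only an illustration, not how any particular SD-WAN product implements it: the class names, the `select_path` function, and the VoIP thresholds are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PathMetrics:
    name: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float


@dataclass
class AppPolicy:
    app: str
    preferred_path: str
    max_loss_pct: float
    max_latency_ms: float
    max_jitter_ms: float


def meets_policy(policy, path):
    """True when the path's current measurements are within the policy limits."""
    return (
        path.loss_pct <= policy.max_loss_pct
        and path.latency_ms <= policy.max_latency_ms
        and path.jitter_ms <= policy.max_jitter_ms
    )


def select_path(policy, paths):
    """Pick the preferred path if it complies, else the lowest-latency compliant path."""
    compliant = [p for p in paths if meets_policy(policy, p)]
    if not compliant:
        # Nothing meets the policy: fall back to the path with the least loss.
        return min(paths, key=lambda p: p.loss_pct)
    for p in compliant:
        if p.name == policy.preferred_path:
            return p
    return min(compliant, key=lambda p: p.latency_ms)


# Illustrative VoIP policy; the thresholds are assumptions, not vendor defaults.
voip = AppPolicy("voip", preferred_path="mpls",
                 max_loss_pct=1.0, max_latency_ms=150.0, max_jitter_ms=30.0)

steady = [PathMetrics("mpls", 1.0, 65.0, 2.0), PathMetrics("internet", 1.0, 80.0, 2.0)]
burst = [PathMetrics("mpls", 5.0, 65.0, 12.0), PathMetrics("internet", 1.0, 80.0, 2.0)]

print(select_path(voip, steady).name)  # preferred MPLS path is within policy
print(select_path(voip, burst).name)   # MPLS burst violates policy, so the flow moves
```

The point of the sketch is that the decision is made per application and against current measurements, so the same pair of circuits can yield different answers from one moment to the next.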

SD-WAN captures metrics that go far beyond the typical WAN measurements, including application response time, network transfer time, and server response time. It sees how the flows are behaving along the entire path. It understands whether a path is suffering from congestion, such as buffering or backed-up queues somewhere on the link. It understands how application transaction response time varies from one path to another. This is the only way that applications can be delivered efficiently and reliably while being policy driven, and it allows administrators to focus on business objectives like performance, security, and compliance rather than on the underlying protocols. Policy-driven application networking (what I think SD-WAN should have been called) requires decisive action based upon factors well outside of those required to simply route packets. To provide this decisive action, the typical “follow the packet” mentality has to be forgotten. A more apt motto is “follow the user experience.”
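As a rough sketch of comparing transaction response time across paths, the Python below summarizes a handful of made-up transactions per path. It assumes, as a simplification, that each transaction's response time is the sum of a network transfer time and a server response time; the sample values and the `summarize` helper are hypothetical.

```python
from statistics import median

# Hypothetical per-transaction timings in milliseconds:
# (network_transfer_ms, server_response_ms)
samples = {
    "mpls": [(40, 120), (45, 118), (240, 121), (42, 119)],
    "internet": [(55, 120), (58, 119), (57, 122), (56, 118)],
}


def summarize(transactions):
    """Summarize total response time and the server-side share for one path."""
    totals = [net + srv for net, srv in transactions]
    server = [srv for _, srv in transactions]
    return {
        "median_total_ms": median(totals),
        "worst_total_ms": max(totals),
        "median_server_ms": median(server),
    }


for path, transactions in samples.items():
    print(path, summarize(transactions))
```

In this invented data the server behaves the same on both paths, and the MPLS link has the better median, but one MPLS transaction took roughly twice as long as any transaction on the Internet link. Separating the network share from the server share is what lets the device attribute that outlier to the path rather than to the application.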
