My first day at Tech Field Days Extra @ Cisco Live 2018

I’ve been thinking about what to write on my first day of attendance at TFDx. I have so much going on in my tiny and inadequate brain that it’s really hard to put things down on (digital) paper.

I’ll start by saying that I’m thankful to the staff at GestaltIT for the opportunity to attend as a TFD delegate and meet “best of breed” network professionals.

The crew and I are having great fun talking about all sorts of weird (networking) stuff, constantly challenging each other with very nice problems to solve.

This is what I was expecting and I’m very pleased about this.

Now, let me get onto the technical stuff. This event is sponsored and held at #CLEUR2018, so don’t expect me to talk about anything but Cisco 🙂

Uh, before I go on, a disclaimer: this is a personal weblog. The opinions expressed here represent my own and may not represent those of my employer.

Multicloud strategy

There was a lot going on about hybrid cloud (rebranded as multi-cloud), with particular attention to how Cisco is trying to magically make the deployment of applications across different public-cloud providers an easy job, and especially a lot of focus on how to create a network overlay that connects the private cloud instance(s) to VPCs in the cloud using CSR1000Vs.

Now, I get that having an IOS-XE router to play with resuscitates our inner networking soul, but my gut feeling is that this is not a great idea. First and foremost, the basic problem it’s trying to solve (creating a networking overlay to connect workloads across multiple different environments, which sounds like SD-WAN to me, by the way) can already be fixed with whatever tunneling technologies the cloud providers themselves offer.
I understand that with the 1000Vs you get a lot of nice stuff and features on top of the IPsec tunneling, but still… would I pay for that? No.

Not to mention how inefficient that would be from a data-plane perspective: traffic would need to be hairpinned in and out of the 1000V VMs multiple times before actually being sent out (see picture below).

Not efficient and not very cloud-native either.

[Figure: VPC HA]

Then we had a good session on Umbrella + AMP for Endpoints, but honestly I’m not a security expert, and my friend and TFDx peer Jasper (https://blog.packet-foo.com) will have more on that, I’m sure!

What caught my interest the most were two presentations at the end of the day, on the Cisco Network Assurance Engine and Tetration.

Network Assurance Engine

This is good. If I had to define it in a naïve way, this is Forward Networks or Veriflow with a different logo (all based on formal-verification mathematical foundations).
The NAE collects state data from the network (any control-plane and forwarding-plane data structure, such as ARP tables, MAC address tables, RIBs, FIBs, LFIBs, etc.) and builds a binary-decision-tree model.

Now, under the assumption that whatever action a device takes when handling a packet is a deterministic transformation, the NAE can predict how packets flow through a complex network (which, as I understand it, is exactly what Forward Networks does).
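
To make that concrete, here is a minimal sketch of the idea, assuming nothing about NAE’s actual internals: model every device as a pure function from packet headers to (next hop, rewritten headers), and “prediction” becomes walking those functions offline. All device names and forwarding rules below are made up.

```python
# Toy verification model: each device is a deterministic function
# packet -> (next device, transformed packet), or None if dropped.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    dst_ip: str
    vlan: int

def leaf1(pkt):
    if pkt.dst_ip.startswith("10.1."):
        return ("spine1", replace(pkt, vlan=100))  # header rewrite
    return None                                    # no route: drop

def spine1(pkt):
    return ("leaf2", pkt) if pkt.vlan == 100 else None

def leaf2(pkt):
    return ("host-b", pkt) if pkt.dst_ip == "10.1.2.5" else None

DEVICES = {"leaf1": leaf1, "spine1": spine1, "leaf2": leaf2}

def predict_path(start, pkt, max_hops=16):
    """Walk the model offline; no real traffic is ever sent."""
    path, node = [start], start
    for _ in range(max_hops):
        if node not in DEVICES:        # reached an end host
            return path
        hop = DEVICES[node](pkt)
        if hop is None:
            return path + ["DROP"]     # packet would be dropped here
        node, pkt = hop
        path.append(node)
    return path + ["LOOP?"]            # hop budget exceeded

print(predict_path("leaf1", Packet(dst_ip="10.1.2.5", vlan=10)))
# ['leaf1', 'spine1', 'leaf2', 'host-b']
```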

This presentation was heavily washed with the word “intent“. The only intent I saw there was the “definition” of conditions that the tool’s verification mechanism needs to check every time the network model is rebuilt (similar to Forward).
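
In that reading, an intent is little more than a named condition that gets re-evaluated against every fresh snapshot of the model. A sketch, reusing the toy model above (the check names are mine, not Cisco’s):

```python
# An "intent" here is just a named predicate over the network model,
# re-checked whenever a new snapshot of device state is collected.
INTENTS = {
    "web reaches db": lambda: predict_path(
        "leaf1", Packet(dst_ip="10.1.2.5", vlan=10))[-1] == "host-b",
    "no tenant crosstalk": lambda: predict_path(
        "leaf1", Packet(dst_ip="10.9.0.1", vlan=10))[-1] == "DROP",
}

def verify_snapshot():
    for name, check in INTENTS.items():
        print(f"{name}: {'PASS' if check() else 'FAIL'}")

verify_snapshot()   # run after each model rebuild
```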

The nice thing is that the tool gives you some suggested steps on how to fix the issue (whether those make real sense I can’t really say, but if it works, that’s cool). Also cool is the capability to roll back in time and understand what happened to a particular flow/device at a certain point, so that you can try to understand what went wrong and prevent it from happening in the future.

Also very useful is their arc-style diagram that visually shows “who can talk to whom” (similar to the one you find in Tetration, see below), which helps identify whether there are unintended paths across tenants that are not supposed to talk to one another.

Tetration

I liked this a lot. Very non-Cisco-style. Presentation led by two very entertaining and captivating geeks (Tim Garner and Remi Philippe).

Tetration is a very powerful analytics tool that does a few things. #netflowreloaded
Tetration relies (sadly) on proprietary Cisco ASICs to perform per-packet analytics in silicon and stream the results every 100 ms to a collector, where the intelligence resides. (They also collect some more basic analytics directly on the hosts/tenants using a binary that does all the magic.)
They stream the data in a clever way: rather than sending metadata for each and every packet that traverses the silicon, they send just the deltas between each packet and the next (reminds me of the old days when I was studying electrical communications 🙂).
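
Here is a tiny sketch of that delta idea, with made-up field names rather than Tetration’s actual telemetry schema: the first record goes out whole, then only the fields that changed.

```python
# Delta encoding of per-packet metadata: send the full record once,
# then only the changed fields. Field names are illustrative only.
def delta_encode(records):
    prev = {}
    for rec in records:
        yield {k: v for k, v in rec.items() if prev.get(k) != v}
        prev = rec

def delta_decode(deltas):
    cur = {}
    for d in deltas:
        cur = {**cur, **d}
        yield dict(cur)

packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "len": 1500, "ts": 100},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "len": 1500, "ts": 104},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "dport": 443, "len": 52,   "ts": 107},
]

encoded = list(delta_encode(packets))
# [<full record>, {'ts': 104}, {'len': 52, 'ts': 107}] -- far fewer bytes
assert list(delta_decode(encoded)) == packets
```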

But why do that per-packet? They claim (and I sort of agree) that flows are sometimes so bursty that the sample rate is often not high enough to capture transient issues.
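
A toy example of the argument: with classic 1-in-N packet sampling, a short microburst can go completely unseen, while per-packet telemetry would catch all of it. All numbers below are invented for illustration.

```python
# 1-in-N sampling can sail right past a short microburst that
# per-packet telemetry would catch.
SAMPLE_RATE = 1000                      # classic 1:N packet sampling

# 100k steady-state packets with a 200-packet microburst buried inside.
timeline = ["steady"] * 50_500 + ["burst"] * 200 + ["steady"] * 49_300

sampled = timeline[::SAMPLE_RATE]       # deterministic 1-in-N sampling
print(f"{sampled.count('burst')} burst packets sampled out of 200")
# -> 0: the burst is invisible to the sampler, while per-packet
#    counters on the ASIC would report all 200.
```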

Despite this cleverness, one potential issue is easy to spot: the volume of data those guys need to process and keep (the tool also keeps a history of the detailed per-flow, per-packet analytics) is humongous. It’s real BIG data applied to network analytics. Fair enough. You can’t have good analysis without enough data. So, off we go.

(video from NFD16)

What I liked most about Tetration is their take on micro-segmentation. Their “zero-trust model” allows security policies to be tailored around the application and applied at both ends of the communication.
The concept of a workspace (the concept is not new, the name is, and this is very Cisco style) adds another level of granularity, where resources and applications are grouped by owner.
Owners can decide whether or not to “expose” a particular L4 endpoint to consumers outside their own workspace.
Prospective consumers then “request” access to that service via the Tetration GUI (or API). If the request is accepted, Tetration goes on and applies the correct ACLs to the ACI fabric to make sure the communication actually happens.
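
As I understood it, the workflow boils down to something like the sketch below. To be clear, the class and method names are mine for illustration, not Tetration’s actual API.

```python
# Hypothetical model of the expose/request flow: owners expose L4
# endpoints, consumers request access, and approvals become allow
# rules pushed to the fabric. Names are illustrative only.
class Workspace:
    def __init__(self, owner):
        self.owner = owner
        self.exposed = set()           # (app, port) pairs visible outside

    def expose(self, app, port):
        self.exposed.add((app, port))

class PolicyEngine:
    def __init__(self):
        self.acl = []                  # rules to be pushed to the fabric

    def request_access(self, consumer, workspace, app, port, approved):
        if (app, port) not in workspace.exposed:
            return "denied: endpoint not exposed"
        if not approved:               # owner rejects via GUI/API
            return "denied: owner rejected"
        # Zero trust: policy applies at both ends of the conversation.
        self.acl.append(f"permit tcp {consumer} -> {app}:{port}")
        return "granted"

billing = Workspace(owner="team-billing")
billing.expose("billing-db", 5432)

engine = PolicyEngine()
print(engine.request_access("web-frontend", billing, "billing-db", 5432,
                            approved=True))    # granted
print(engine.acl)  # ['permit tcp web-frontend -> billing-db:5432']
```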

[Figure: conversations.png]

Now, you immediately realise what in my view is the weak point of Tetration: ACI.
I have the feeling that all the BUs in Cisco are officially asked to cling onto ACI, and desperately try to keep it alive.

My advice would be: try not to tie your future to ACI. This product can go very far, I think.

The second issue, I think, is that if they don’t find a way of doing this collection on commodity ASICs (I asked about Barefoot/Cavium and got a “no comment” response, which was quite encouraging), their market slice will be pretty limited in size. Doing the collection right on the hosts, leveraging the cores on SmartNICs, would not be a bad idea either.

Conclusions

The first day was really dense with content (and buzzwords) but I hope I captured some interesting points.

Overall, I kept reading and hearing this claim by Cisco that “there’s too many tools out there”, and that the landscape of network OSS and assurance tools is too complicated.

Well, what can I say? In the short space of two days I heard about three or four different products (Tetration/NAE/DNA) whose features could perfectly well be part of the same platform, as they do quite similar things (albeit based on very different information sources).

So, bipolar disorder, or is it just me?

Finally. #rantalert
For years they sold us the story that with SDN we would finally defeat vendor lock-in (I wrote about this in Vendor lock-in effect and the SDN hype), but what I keep hearing is: “this works just with ACI”, “this requires the latest generation of N9Ks”, “you need to use HyperFlex to do this”, and so on and so forth…
This sounds like lock-in 2.0 (perhaps even worse than its predecessor).

What do you think?

3 thoughts on “My first day at Tech Field Days Extra @ Cisco Live 2018”

  1. Just a couple of comments (since you asked :))

    With regard to Tetration: You can run Tetration without a single Cisco switch or router. Most of the good info comes from the end points, not from the hardware sensors. The hardware sensors can provide some good info, but I’d rather have host info (which includes things like process metadata, etc.) than just hardware telemetry alone. The software sensors can also configure the hosts’ native firewalls (iptables/Windows Firewall) for centralized enforcement/configuration. That can’t be done with the hardware sensors (though there is ACI integration). The slight drawback there is it only works with Windows and Linux hosts. AIX and a few others get some more basic telemetry and no enforcement.

    With regard to ACI: Pretty much all of the SDN solutions have some proprietary boundary to them. ACI, NSX, etc. Even with the open standards (EVPN/VXLAN), the vendors may not play so well together, so effectively we’ve got single-vendor fabrics, whether that’s hardware fabrics (ACI) or software fabrics (NSX). We’ve had that all the way back to Fibre Channel. It’s not great, but that’s what we have, given the engineering/interoperability challenges of building these types of fabrics. What I think is important is the north- and south-bound connectivity: APIs and other protocols and encapsulations (VLAN, IETF VXLAN) to connect the data plane, control plane, and policy planes to other parts of the DC/network are where it really matters at this point.

    I bring those two up (ACI/Tetration) because I teach them on a regular basis. They both have their benefits and drawbacks, but I thought I’d throw that out there.

  2. Great comments Tony! Thanks!
    Regarding Tetration, if you’re not doing packet-by-packet capture on the ASIC you lose most of the magic sauce (IMHO), which is being able to have hop-by-hop analytics and really understand where problems happen.
    Regarding the fabrics you mentioned, yes, I agree that those are the most common on the market, but certainly not the only ones. As you suggest, most of the SDN controllers play nicer with the hardware boxes of the same vendor 🙂
    Finally, regarding EVPN-based fabrics I partly disagree with you. It’s true that some platforms may not play nicely with it, but that would mean the implementation is poor. EVPN is a standard. I’ve heard about BGP interop issues just a few times in my career, and, despite the fact that EVPN is relatively young, it is a standard nevertheless.

    1. I think in the beginning, Cisco was touting the ASIC thing a bit too much. From an application perspective (most of what Tetration benefits), it’s the software sensors that do a lot of the heavy lifting. One of the big pushes for Tetration is to get to a zero-trust model within an application. Orgs have close to zero idea how apps intra-communicate, so they let everything through. The hardware sensors can do some of the intra-app mapping, but the software sensors build a more complete picture in that regard. The software sensors tell you what end points talk to what, build application profiles, and do enforcement. They also tell you how much of the latency in communication is from the OS network stack, application stack, and physical network. The hardware sensors can tell a bit of that, but not all of it, though they can do some hop-by-hop analysis (combined they give lots of good info). The hardware sensors can do some intra-network troubleshooting and congestion spotting, but certainly not enforcement, mapping apps to processes, etc. From a network perspective, yeah, the ASICs are good, but from an apps standpoint (most of what companies buy Tetration for, I think), they’re more of a nice-to-have.
      The EVPN thing is pretty new, and I don’t know of any mixed fabrics in production, though that could be from my enterprise-centric view. Perhaps in some cases the leafs are one vendor and the spines another, but from a failure-domain standpoint most organizations tend to build a single-vendor fabric. There are a lot of knobs/etc. I think you run into the same problem as OpenStack: open, sure, but all the component pieces need to fit together right, and getting them to do so in practice is a lot more difficult than a PNG diagram would seem to imply. Perhaps that will change in the future as the standards become better implemented. In the enterprise, that’s partly why ACI and NSX are the two most common examples, I think: they’re mostly plug and play, and have standards-based ways to talk outside the fabric. Operationally, I think right now EVPN fabrics are where OpenStack is: you need a lot of operational specialty to make them work, and as such they’re not seen much in the enterprise.
