NSX Firewall Migrations: Analysis

Last month, I wrote about the progressive microsegmentation model that we've been using lately for our distributed NSX firewall.  Now, I want to write about how we can figure out its implementation in a brownfield environment!

To do this, we relied heavily on vRealize Network Insight.  In short, vRNI creates a big index of all of the objects in your vCenter and on your network, including how they communicate with each other (all the way down to the specific route between any two devices!).  You can then execute queries against this giant index to pull out the data that you need.  In our case, we request all network flows for a given application, then analyze those flows to create an appropriate set of policies to apply to that application.

Our first step was to identify the application.  This customer was fairly disciplined during the VM creation process in that they generally created a VM folder for each application that was being deployed and put the appropriate VMs into that folder.  Things weren't perfectly organized with these folders, but this structure gave us a really good starting point.  After we confirmed the server list with the application owner, we created an NSX Security Tag for the application and then applied it to all of the VMs.

Next, in vRNI, we created an Application for the application (I generally like the descriptive terms that are used by vRNI, but in this one case, it makes sentence syntax a little confusing!).  We set up the membership of the Application to be VMs In Security Tag and typed the name of our Security Tag in the field.  It can take vRNI ~30 minutes to realize that VMs have been tagged, so don't worry if the Application shows as empty for a while.  Just save it and go get some coffee!

After the Security Tags have propagated into vRNI, the Application should have all of the appropriate VMs in it.  Our next step is to look at what those VMs are doing on the network, so run this query: flows where application = <application name>.  I like to adjust my timeframe to 30 days, just to ensure that I am getting the full picture of what the application is doing.

That query will generally reveal a lot of data.  Too much data for direct analysis.  In practice, I like to export that data as a CSV and then pass it to some scripts that I've written for analysis, but you can actually do those steps right in vRNI (it's just a little more manual).  The first thing that I want to do is get an idea about what kind of network traffic is coming into the application from the outside; this will help me to define my presentation groups and the policies that I'll apply to them.  To look at that traffic, I can execute this query:

flows where application = <application name> and source vm not in (vm where application = <application name>) group by port

That will show me only the traffic that enters the application from external sources, like clients or other applications.  You'll usually see a bunch of RDP and SSH in this report, so you may need to investigate further to determine if that's just admin access or if it's actually part of the application.  To do so, click on the number in the Count of Flow column, which will put together a query for you that only shows the flows on that port.  I like to add group by destination vm or group by source ip address to the end of that query to make it more readable.  The first one is great for figuring out which servers in the application are presenting traffic on that port.  The second one is great for figuring out who is using that port, which is a good way to determine if all of the SSH traffic is coming from the admin desktop subnet vs. a service that the application makes more generally available.  Similar techniques can be used on the traffic from any port, for example, to determine if HTTPS traffic comes from the entire organization or only from the inside interfaces on the F5.
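If you would rather do that slicing on the exported CSV, here is a minimal Python sketch of the same idea: keep only the flows that enter the application from outside, count them by port, then drill into one port by source IP.  This is not the script I actually use, and the column names (source_vm, destination_vm, source_ip, port) and the sample rows are made up for illustration, so check them against the header row of your own export.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; adjust them to match the header row of your export.
SRC_VM, DST_VM, SRC_IP, PORT = "source_vm", "destination_vm", "source_ip", "port"


def load_flows(csv_text):
    """Parse exported flow data (as CSV text) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def inbound_flows(flows, app_vms):
    """Keep flows whose destination is in the application but whose source is not."""
    return [f for f in flows if f[DST_VM] in app_vms and f[SRC_VM] not in app_vms]


def count_by(flows, column):
    """Count flows per distinct value of the given column, most common first."""
    return Counter(f[column] for f in flows).most_common()


# Made-up sample data standing in for an exported CSV.
sample = """source_vm,destination_vm,source_ip,port
,web01,10.1.1.5,443
,web01,10.1.1.6,443
admin-desktop,web01,10.9.0.4,22
web01,db01,10.2.0.10,1433
"""
app_vms = {"web01", "db01"}

inbound = inbound_flows(load_flows(sample), app_vms)
print(count_by(inbound, PORT))  # equivalent of "group by port"

# Drill into a single port, like clicking the Count of Flow number.
ssh_only = [f for f in inbound if f[PORT] == "22"]
print(count_by(ssh_only, SRC_IP))  # equivalent of "group by source ip address"
```

Swapping SRC_IP for DST_VM in that last call gives you the group by destination vm view instead.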

So, I go through the results of that query, taking note of the ports that the application uses to present data to the outside world and which servers within the application are doing that actual presentation.  If the servers are being accessed by any servers from other applications, I take note of that as well, for those other servers are going to eventually need an Application Access policy to allow them to cross the inter-application boundary.

After I've got a clear picture of what ports my application is listening on, it's time to look at the inverse.  What systems is it reaching out to?

flows where application = <application name> and destination vm not in (vm where application = <application name>) group by destination security group

Note that I grouped this one by destination security group instead of by port.  Our security policy blocks outbound network flows to other applications, so the main purpose of this query is to see what other applications this application reaches out to.  Hopefully you've got some environmental policies in place to allow things like Active Directory, DNS, and NTP, so you can skip right past those groupings and get straight to the good stuff.  If you find that your application needs to talk to another application, you just need to make sure that you have a policy to allow that traffic!  If that other application is well understood, you're in good shape.  Otherwise, click on that Count of Flow column again to see what flows are actually going to that application (and maybe add group by port to those results to make them a little more comprehensible), then define yourself a policy and you're good to go!
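The outbound side can be sketched against an exported CSV in much the same way.  This rough Python example keeps the flows that leave the application, skips the groups that environmental policies already cover, and lists the ports used to reach each remaining group.  The column names, the security group names, and the sample rows are all assumptions for illustration; they are not something vRNI guarantees in its export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names and group names; none of these come from vRNI itself.
SRC_VM, DST_VM, PORT = "source_vm", "destination_vm", "port"
DST_GROUP = "destination_security_group"
ENVIRONMENTAL_GROUPS = {"Active Directory", "DNS", "NTP"}


def outbound_ports_by_group(csv_text, app_vms):
    """Map each non-environmental destination group to the sorted ports used to reach it."""
    ports = defaultdict(set)
    for flow in csv.DictReader(io.StringIO(csv_text)):
        leaves_app = flow[SRC_VM] in app_vms and flow[DST_VM] not in app_vms
        if leaves_app and flow[DST_GROUP] not in ENVIRONMENTAL_GROUPS:
            ports[flow[DST_GROUP]].add(int(flow[PORT]))
    return {group: sorted(p) for group, p in ports.items()}


# Made-up sample data standing in for an exported CSV.
sample = """source_vm,destination_vm,destination_security_group,port
web01,dc01,Active Directory,389
web01,billing-db01,Billing,1433
app01,billing-api01,Billing,443
web01,app01,MyApp,8080
"""

result = outbound_ports_by_group(sample, {"web01", "app01"})
print(result)
```

Each key left in the result is another application that needs a policy, and the port list is a starting point for what that policy should allow.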
