NSX Section Based Distributed Firewall Model

I've written before about creating NSX Distributed Firewall Rules following a model that uses rules that will specifically hit traffic based on whether it's Inbound or Outbound.  That model is also useful for creating NSX Security Policies, as there's no negative logic (NOT applied to object) in the rule set.  While that model works great, it can be a bit difficult to wrap your head around.  In turn, that can make it difficult to hand off to a customer... so we've been working on an alternate model.

Unfortunately, this model does not work with Service Composer Policies, but it's flexible enough that it doesn't really need them.  It's based on a set of generic Security Tags (with corresponding Security Groups), that interact to create a dynamic micro-segmentation solution.  This model is based on defining a set of DFW Sections, each of which serves a very specific purpose in blocking or allowing traffic.  When creating new firewall rules, the administrator only needs to know which section would be appropriate and can otherwise write it as a very simple source/destination/service rule (with no worry about direction).

At its simplest, this model has three Deny sections and three Allow sections, paired so that an Allow section immediately precedes its equivalent Deny section.  The pairs of sections correspond to Environment, Solution, and Tier, although this model is flexible enough to add or remove pairs arbitrarily (or potentially change out sections, such as having Tenants in addition to or instead of Environment).  That means that, at a high level, the firewall looks like this:
  1. Environment Allow Rules
  2. Environment Deny Rules
  3. Solution Allow Rules
  4. Solution Deny Rules
  5. Tier Allow Rules
  6. Tier Deny Rules
In this example, the Environment tags will be Production, Testing, Development, and Shared Infrastructure.  There will be one Solution tag for every solution in an environment (such as Exchange, Sharepoint, SAP, etc.), plus an Infrastructure tag for any infrastructure that is common to all solutions but does not cross between Environments.  The Tier tags will be Database, Application, Web, and DMZ.  By creating a relatively simple set of Firewall rules based on these tags, we can create a very complex and dynamic set of behaviors.  Each VM would receive one or more tags from each pair of sections.  So, a single server might be tagged as Production, Sharepoint, and Database.  Another server might be Development, SAP, Web, and Application.  You get the idea.  Let's look at the rules:

Section: Environment Allow
Rule #  Source           Destination            Action
1       Any              Shared Infrastructure  Allow

Section: Environment Deny
Rule #  Source           Destination            Action
2       NOT Production   Production             Deny
3       NOT Testing      Testing                Deny
4       NOT Development  Development            Deny

Section: Solution Allow
Rule #  Source           Destination            Action
5       Any              Infrastructure         Allow

Section: Solution Deny
Rule #  Source           Destination            Action
6       NOT Sharepoint   Sharepoint             Deny
7       NOT Exchange     Exchange               Deny
8       NOT SAP          SAP                    Deny

Section: Tier Allow
Rule #  Source           Destination            Action
9       DMZ              Web                    Allow
10      Web              Application            Allow
11      Application      Database               Allow

Section: Tier Deny
Rule #  Source           Destination            Action
12      NOT Web          Web                    Deny
13      NOT Application  Application            Deny
14      NOT Database     Database               Deny

Section: Default
Rule #  Source           Destination            Action
END     Any              Any                    Allow

This model works by paring away undesirable traffic until only known-good traffic is allowed in the Tier-Allow section (and then blocking what's left at the end).  There is room for exceptions though (that's why there's the Allow half for each section pair), as every environment has some specific flows that need to cross the gap between solutions or even environments.  So, what's this rule set doing?

First, it checks to see if there is a specific allow for a given flow that's going to basic infrastructure services.  This is where DNS, DHCP, NTP, etc. would live and you could certainly create specific groups for each service and create Allow rules that only allow that specific service instead of all traffic (read: I didn't get that specific here to make this easier to read).  If the traffic is allowed, then it passes through the firewall and rule processing is completed, otherwise it continues processing.

Next, it checks if the traffic is going between two environments.  If it is crossing the border between Production, Testing, or Development, it is denied.  Otherwise, it continues processing rules.  After this point, we know that all remaining traffic originates from the same environment as its destination.
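
To make that last claim concrete, here's a minimal Python sketch.  This is not NSX code: plain sets of tag names stand in for Security Group membership, the function name is made up for illustration, and I'm assuming each VM carries exactly one Environment tag.  It tries every source/destination Environment combination against the three "NOT X to X" Deny rules and keeps the ones that survive.

```python
from itertools import product

# Environment tags that have a "NOT X -> X Deny" rule in the Environment Deny section.
ENVIRONMENTS = ["Production", "Testing", "Development"]


def denied_by_environment_section(src_tags, dst_tags):
    """Return True if any 'NOT X -> X Deny' rule matches the flow.

    A rule for environment X matches when the destination carries tag X
    and the source does not.
    """
    return any(env in dst_tags and env not in src_tags for env in ENVIRONMENTS)


# Try every (source environment, destination environment) pair.
survivors = [
    (src, dst)
    for src, dst in product(ENVIRONMENTS, repeat=2)
    if not denied_by_environment_section({src}, {dst})
]

for src, dst in survivors:
    print(f"{src} -> {dst} continues to the Solution sections")
```

Under those assumptions, only the three same-environment pairs are left, which is exactly the guarantee the later sections rely on.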

Next, it checks for inter-solution Allow rules.  If you have servers that need to communicate between solutions, those allow rules would belong here.  Thus, your Sharepoint and SAP systems could integrate by allowing a system from one to communicate with a system in the other.  Remember that we've already filtered out any inter-environment traffic, so there's no threat of Development systems messing with Production servers.  Traffic that is specifically allowed to cross the barriers between solutions would be allowed here, otherwise it continues to process the rules.

Next it looks for inter-solution Deny rules.  If the traffic is crossing the border between two solutions (and hasn't been allowed above), that traffic will be denied at this point, otherwise it continues processing rules.  After this stage, we know that all remaining traffic must have originated from the same Environment and the same Solution as its destination.

Now it checks for inter-tier Allow rules.  We have a set of default Allow rules defined, enabling DMZ addresses to talk to the Web tier (since who puts Web servers directly in the DMZ any more?), Web to talk to App, and App to talk to DB.  As long as the VMs are tagged correctly, this will allow all of the standard network flows that make up a multi-tier application.  Because a given VM may be tagged as multiple tiers, this model easily accommodates 2-tier solutions by tagging one of the VMs with the two tiers that are collapsed on it.  If the traffic is allowed, it flows, otherwise rule processing continues.
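
Here's a small Python sketch of the collapsed-tier case, looking only at the Tier Allow rules.  Again, this is just sets of tag names standing in for Security Groups; the helper name and the example VMs are hypothetical.

```python
# The three Tier Allow rules from the table, as (source tag, destination tag).
TIER_ALLOW = [
    ("DMZ", "Web"),
    ("Web", "Application"),
    ("Application", "Database"),
]


def tier_allowed(src_tags, dst_tags):
    """Return True if any Tier Allow rule matches the flow."""
    return any(src in src_tags and dst in dst_tags for src, dst in TIER_ALLOW)


# A 2-tier solution: one VM carries both the Web and Application tags.
dmz_proxy = {"DMZ"}
web_app_vm = {"Web", "Application"}
database_vm = {"Database"}

print(tier_allowed(dmz_proxy, web_app_vm))    # matches DMZ -> Web
print(tier_allowed(web_app_vm, database_vm))  # matches Application -> Database
print(tier_allowed(dmz_proxy, database_vm))   # no Tier Allow rule matches
```

The doubly-tagged VM picks up the Web rule as a destination and the Application rule as a source, so no extra rules are needed for the collapsed tier.  The third flow falls through to the Tier Deny section.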

The last section is the inter-tier Deny rules.  There are two variations of this section in the model, depending on how strict you want your microsegmentation to be.  At its strictest, this section would be a single Any-Any Deny rule (not shown above).  That level of microsegmentation would obviously block all traffic that gets to it, including Environment+Solution+Tier adjacent traffic (so one Prod Sharepoint Application server could not talk to another Prod Sharepoint Application server, unless an exception was made).  While this is certainly appropriate for some environments, it can be more locked down than most environments need and it forces administrators to create a lot of exceptions.  Using the version illustrated above would not block that Environment+Solution+Tier adjacent traffic, thus allowing servers of the same tier of the same solution to talk freely.
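
The whole walk-through can be sketched as a first-match evaluator in Python.  To be clear, this is a toy simulation and not anything that talks to NSX: the Rule class, the helper names, and the example VMs are all my own illustration, with sets of tag names standing in for Security Group membership.  The rule list mirrors the table, and the strict variant swaps the Tier Deny section for a single Any-Any Deny.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    section: str
    source: Optional[str]        # tag name, or None for Any
    dest: Optional[str]          # tag name, or None for Any
    action: str                  # "Allow" or "Deny"
    negate_source: bool = False  # True models "NOT <source>"

    def matches(self, src_tags, dst_tags):
        src_ok = self.source is None or (
            (self.source in src_tags) != self.negate_source
        )
        dst_ok = self.dest is None or self.dest in dst_tags
        return src_ok and dst_ok


def deny_outsiders(section, tags):
    """Build one 'NOT X -> X Deny' rule per tag."""
    return [Rule(section, t, t, "Deny", negate_source=True) for t in tags]


RULES = (
    [Rule("Environment Allow", None, "Shared Infrastructure", "Allow")]
    + deny_outsiders("Environment Deny", ["Production", "Testing", "Development"])
    + [Rule("Solution Allow", None, "Infrastructure", "Allow")]
    + deny_outsiders("Solution Deny", ["Sharepoint", "Exchange", "SAP"])
    + [
        Rule("Tier Allow", "DMZ", "Web", "Allow"),
        Rule("Tier Allow", "Web", "Application", "Allow"),
        Rule("Tier Allow", "Application", "Database", "Allow"),
    ]
    + deny_outsiders("Tier Deny", ["Web", "Application", "Database"])
    + [Rule("Default", None, None, "Allow")]  # the END rule, position 15 here
)

# Strict variant: keep rules 1-11, then a single Any-Any Deny.
STRICT_RULES = RULES[:11] + [Rule("Tier Deny", None, None, "Deny")]


def evaluate(src_tags, dst_tags, rules=RULES):
    """Return (position, section, action) of the first matching rule."""
    for position, rule in enumerate(rules, start=1):
        if rule.matches(src_tags, dst_tags):
            return position, rule.section, rule.action
    raise ValueError("no rule matched; the rule list needs a final catch-all")


prod_sp_web = {"Production", "Sharepoint", "Web"}
prod_sp_app = {"Production", "Sharepoint", "Application"}
prod_sp_db = {"Production", "Sharepoint", "Database"}
prod_ex_app = {"Production", "Exchange", "Application"}
dev_sap_web = {"Development", "SAP", "Web"}

print(evaluate(prod_sp_web, prod_sp_app))  # Web -> Application
print(evaluate(prod_sp_web, prod_sp_db))   # Web skipping straight to Database
print(evaluate(prod_ex_app, prod_sp_app))  # crossing solutions
print(evaluate(dev_sap_web, prod_sp_app))  # crossing environments
print(evaluate(prod_sp_app, prod_sp_app))  # same tier, illustrated variant
print(evaluate(prod_sp_app, prod_sp_app, STRICT_RULES))  # same tier, strict
```

The last two calls show the difference between the two variants: same-tier traffic falls through to the default Allow with the illustrated rules, and is dropped by the Any-Any Deny in the strict version.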

That's not super simple... so why do I like this model?  One word: manageability.  When creating additional rules, the administrator doesn't need to consider the firewall rule list at all, they just need to consider the nature of the rule that they're creating.  Is this an exceptional flow that needs to go between solutions?  Put it in the Solution-Allow section.  Is this a flow between Environments?  Put it in the Environment-Allow section.  The order of the rules within each section doesn't matter, so as long as the administrator puts the new rule into the appropriate section, the firewall will behave as expected.
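
That decision is mechanical enough to sketch in a few lines of Python.  The function below is a hypothetical helper, not part of any NSX tooling.  It assumes the tag sets from the example, and it assumes that an exception between tiers of the same Environment and Solution belongs in the Tier-Allow section (the post only spells out the first two cases).

```python
ENVIRONMENTS = {"Production", "Testing", "Development"}
SOLUTIONS = {"Sharepoint", "Exchange", "SAP"}


def section_for_exception(src_tags, dst_tags):
    """Pick the Allow section for a new exception rule.

    A 'NOT X -> X Deny' rule blocks a flow when the destination has tag X
    and the source does not, so the exception has to sit in the Allow
    section that precedes the first Deny section that would block it.
    """
    if (dst_tags & ENVIRONMENTS) - src_tags:
        return "Environment Allow"
    if (dst_tags & SOLUTIONS) - src_tags:
        return "Solution Allow"
    return "Tier Allow"  # assumption: remaining exceptions are tier-level


sap_app = {"Production", "SAP", "Application"}
sharepoint_app = {"Production", "Sharepoint", "Application"}
dev_web = {"Development", "SAP", "Web"}

print(section_for_exception(sap_app, sharepoint_app))  # crosses solutions
print(section_for_exception(dev_web, sharepoint_app))  # crosses environments
```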

In addition, new rules will only need to be created under two circumstances: a new Environment, Solution, or Tier is defined (most often, that will be a new Solution), or a new exception is required.  The process of defining the new Environment/Solution/Tier is simple: create a new Security Group, a new Security Tag, and a new Not-X to X Deny Firewall Rule in the appropriate section.  As discussed above, exceptional rules are also easy to place, due to the clearly defined sections.  That said, very few new firewall rules will need to be created in this model, as most VM communication will be allowed or denied by simply tagging the VMs appropriately.
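
As a rough illustration of how little there is to define, here's a Python sketch that builds the three objects for a new Environment, Solution, or Tier.  These are plain dictionaries of my own design, not NSX API payloads, and "Oracle" is just a made-up example of a new Solution.

```python
# Which Deny section the new "NOT X -> X" rule belongs in.
DENY_SECTION = {
    "Environment": "Environment Deny",
    "Solution": "Solution Deny",
    "Tier": "Tier Deny",
}


def define(kind, name):
    """Describe the Security Tag, Security Group, and Deny rule for a new item."""
    if kind not in DENY_SECTION:
        raise ValueError(f"kind must be one of {sorted(DENY_SECTION)}")
    return {
        "security_tag": name,
        # The group's membership is simply "every VM carrying the tag".
        "security_group": {"name": name, "members_with_tag": name},
        "firewall_rule": {
            "section": DENY_SECTION[kind],
            "source": name,
            "negate_source": True,  # i.e. "NOT <name>"
            "destination": name,
            "action": "Deny",
        },
    }


oracle = define("Solution", "Oracle")
print(oracle["firewall_rule"])
```

Everything else about the new Solution's behavior comes from tagging its VMs with the existing Environment and Tier tags.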
