VMware Integrated Containers Networking with NSX

Recently I had the chance to work on a PoC on VMware Integrated Containers (VIC).
VIC enables you to work with Docker Containers leveraging the full feature set of vSphere (HA, DRS, vMotion etc).
The logic used in VIC is to map every single Container to a micro-VM. Having a single Container in a VM makes it possible to leverage NSX Micro-Segmentation to secure Applications, and to use all the NSX features such as Edge Gateways, Logical Routers and Logical Switches.
The official VIC documentation can be found at the following URL: https://vmware.github.io/vic-product/assets/files/html/1.1/
In the official documentation you can find an excellent starting point to understand the VIC logic and mapping to Docker constructs.

In VIC, Containers are created in a Virtual Container Host (VCH), which maps to a vSphere vApp.
The VCH vApp contains an Endpoint VM that provides the Management and Networking functions.
All the Container micro-VMs are instantiated in the scope of a VCH vApp.

This is the Networking logic in VIC. As of version 1.1, each VCH can have a maximum of 3 Network Interfaces.

Public Network: The network that container VMs use to connect to the internet. Ports that containers expose with docker create -p when connected to the default bridge network are made available on the public interface of the VCH endpoint VM via network address translation (NAT), so that containers can publish network services.

Bridge Network: The network or networks that container VMs use to communicate with each other. Each VCH requires a unique bridge network. The bridge network is a port group on a distributed virtual switch.

Container Network: Container networks allow the vSphere administrator to make vSphere networks directly available to containers. This is done during deployment of a VCH by providing a mapping of the vSphere network name to an alias that is used inside the VCH endpoint VM. You can share one network alias between multiple containers.

For the scope of this PoC, I’ve designed the following Architecture:

The main goals I want to achieve are the following:

  • Leverage NSX Logical Switches for Containers Networking (this opens a scenario of easy integration between Containers and “classic” VMs leveraging the NSX Logical Router, for example to Connect an Application Server to a DB Server);
    • A Logical Switch for the Bridge Network;
    • A Logical Switch for the External (Containers) Network;
  • Leverage NSX DFW for Micro-Segmentation between Containers;
  • Leverage NSX Edge Gateway to protect Containers instantiated to an External Network and public facing. The Edge Gateway provides the following Services:
    • Firewall for North/South traffic;
    • NAT;
    • DHCP.

In my scenario, I have the Public Network accessed internally by Developers and the External Network accessed by Consumers.
The Public Network is not protected by an Edge Gateway and leverages the native Docker networking: Containers attached to the Bridge Network are NAT’d to the Public Network by the Virtual Container Host Endpoint VM.
The External Network, where Containers can be directly attached bypassing the VCH network stack, is protected by an Edge Gateway.

Before starting the installation, I’ve created the required PortGroups on my Distributed Switch, shown in the following screenshot.
You can see two “standard” PortGroups backed by VLANs, Docker-Public and External-Consumer, and two PortGroups corresponding to two NSX Logical Switches backed by VXLANs. As of version 1.1.1 there is no native integration with NSX yet, so to instruct VIC to use Logical Switches you need to use the vSphere PortGroup name instead of the native NSX Logical Switch name.

You start the installation of VIC by deploying a Virtual Appliance, provided in the OVA format.
You can see that I’ve created two Resource Pools in my Cluster, the first used for Management workloads, the second used to host Containers. With this configuration I’m showing that Container VMs and “standard” vSphere VMs can coexist; this is a fully supported configuration. You can leverage different Resource Pools to create different Virtual Container Hosts. Each VCH can then be managed by a different developer team with specific Resources, Projects, Registries and Users.

I’ve deployed the VIC vApp, in my case named VIC 1.1.1, in my Management Resource Pool because the vApp is used to manage the overall VIC installation.
The vApp is based on VMware Photon OS and provides the VIC Engine, the Container Management Portal (aka Admiral), the Registry Management Portal (aka Harbor).
From the VIC Management Appliance command line, I’ve used the vic-machine-linux command with the create parameter to create a Linux VCH.
The command used to create a VCH accepts all the parameters to configure the Bridge, Public, Management, Client and Container Networks to be used for this specific VCH. The Bridge Network must be dedicated to this specific VCH.
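
As a rough sketch of how those network parameters fit together, the following Python snippet assembles a `vic-machine-linux create` argument list. The option names are the ones I know from the VIC 1.1 `vic-machine` reference, so verify them with `vic-machine-linux create --help` on your release. The vCenter address, user, datastore, VCH name, Resource Pool name, Bridge PortGroup name and network alias are hypothetical placeholders, not the values from my lab.

```python
from typing import Dict, List


def build_vch_create_args(
    target: str,
    user: str,
    name: str,
    compute_resource: str,
    image_store: str,
    bridge_network: str,
    public_network: str,
    container_networks: Dict[str, str],
) -> List[str]:
    """Assemble the argument list for `vic-machine-linux create`.

    container_networks maps a vSphere PortGroup name to the alias
    that Docker clients will see inside the VCH.
    """
    args = [
        "vic-machine-linux", "create",
        "--target", target,
        "--user", user,
        "--name", name,
        "--compute-resource", compute_resource,
        "--image-store", image_store,
        "--bridge-network", bridge_network,   # must be dedicated to this VCH
        "--public-network", public_network,
    ]
    for port_group, alias in container_networks.items():
        args += ["--container-network", f"{port_group}:{alias}"]
    return args


# All values below are placeholders for illustration only.
example_args = build_vch_create_args(
    target="vcenter.example.local",
    user="administrator@vsphere.local",
    name="vch1",
    compute_resource="Containers",
    image_store="datastore1",
    bridge_network="VIC-Bridge-Network",
    public_network="Docker-Public",
    container_networks={"VIC-Container-Network": "container-net"},
)
print(" ".join(example_args))
```

The password is deliberately left out of the list so that the tool prompts for it, and the result can be handed to `subprocess.run` once the values match your environment.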

After all the checks, the deployment and configuration of the new VCH is performed automatically, based on the provided command line parameters.
I’ve chosen to attach my VCH endpoint to the Public Network with a static IP address.
The Bridge Interface IP address is assigned automatically from a default range. An internal IPAM manages the IP address assignment to all Containers connected to the Bridge Network, with a DHCP scope on that network.
During the VCH installation process, a check verifies that the ports needed for communication between the VCH and the ESXi Hosts in the Cluster are open. You can use the “vic-machine-<OS> update” command to open the required ports on all ESXi Hosts. See here for detailed instructions: https://vmware.github.io/vic-product/assets/files/html/1.1/vic_vsphere_admin/open_ports_on_hosts.html
The output of the installation process provides all the information you need to interact with the VCH: Admin Portal URL, published ports and the values to be set as environment variables to manage your VCH with the Docker command line.
You must run the export command with the provided information, then the validity of the environment variables can be checked with the “docker info” command.

You can use the command “docker network ls” to list the available networks for Containers.

If you need to delete a specific VCH, you can use the command “vic-machine-<OS> delete” with the appropriate parameters.

After the creation of the first VCH, it can be added to the VIC Management Portal. From the Portal home page, choose “ADD A HOST“.

You need to provide the URL to reach the VCH, the Host type choosing between VCH (VIC based) and Docker (standard Docker), and the credentials to connect to the Host.

After the parameters validation, you can choose “ADD” to add the VCH to the Management Portal.

After you add the VCH to the Admiral Portal, you can see it in the Management/Hosts section. An overview of the VCH status is provided in the dashboard.

With at least one VCH created, you can start to provision Containers. Enter the Containers section in the Portal and choose “CREATE CONTAINER“.

A default Registry is available, pointing to the public Docker Hub, which hosts standard Docker images available to be fetched and deployed. From the available images, my choice is to deploy an Nginx Web Server.

In the Network section of the provisioning configuration, my first scenario uses Bridge as the choice for Network Mode, configuring a binding from Port 80 (http) of the VCH Endpoint to Port 80 of the Nginx Web Server. With this configuration, developers will be able to point to the VCH Endpoint IP address to reach the Nginx Web Server. The VCH will automatically provide the needed NAT to the Container IP on the Bridge Network.

Container provisioning can be started with the “PROVISION” button.
On the right side of the Management Portal you can open the “REQUESTS” tab to look at the progress of the deployment process.

The “FINISHED” message informs you that the Container creation has completed.

The newly created Container is now available in the Container section of the Portal, with connection details provided in the dashboard.

You have the capability to manage the Container with four specific buttons: Details, Stop, Remove, Scale.

Entering the details of the Container, you have all the details about CPU usage, Memory usage and the Properties.
In the Network Address row, you can see that Bridge is the chosen Network Mode, together with the IP address automatically assigned to the Container VM.

Accessing the address of the VCH via http on Port 80 on the Public Network shows that the configuration is correct: the Nginx Home Page is shown as expected.

I now want to deploy a second Nginx Container, this time attached to the Container Network instead of the Bridge Network.

I could do this using the Command Line, but I prefer to follow the UI way accessing the list of available templates, choosing the official Nginx and selecting the arrow on the “PROVISION” button to access the “Enter additional info” section.
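
For reference, the command-line route for both Containers might look like the sketch below, which builds the two `docker run` argument lists in Python. The `-H`, `--tls`, `-d`, `-p` and `--net` options are standard Docker CLI flags; the VCH address, the port 2376 and the network alias `container-net` are assumptions for illustration, and whether `--tls` is needed depends on how the VCH was deployed.

```python
from typing import List, Optional


def docker_run_args(
    docker_host: str,
    image: str,
    publish: Optional[str] = None,
    network: Optional[str] = None,
) -> List[str]:
    """Build a `docker run` argument list aimed at a VCH endpoint."""
    args = ["docker", "-H", docker_host, "--tls", "run", "-d"]
    if publish is not None:
        # host_port:container_port, NAT'd by the VCH Endpoint VM
        args += ["-p", publish]
    if network is not None:
        # attach the Container directly to a named Container Network
        args += ["--net", network]
    args.append(image)
    return args


VCH = "vch1.example.local:2376"  # hypothetical endpoint address

# First scenario: Bridge Network with port 80 published on the VCH endpoint.
bridge_nginx = docker_run_args(VCH, "nginx", publish="80:80")

# Second scenario: direct attachment to the Container Network alias.
direct_nginx = docker_run_args(VCH, "nginx", network="container-net")

print(" ".join(bridge_nginx))
print(" ".join(direct_nginx))
```

The network name passed to `--net` has to be one of the names reported by “docker network ls” on the VCH.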

From here, I choose to save the Nginx template to create a customized version that can subsequently be automatically deployed on the Container Network.

You can edit the new template using the “Edit” button. This brings you to the “Edit Container Definition” page.

Once in the “Edit Container Definition”, you must enter the “Network” section. In this section you have the chance to add specific Networks that can be leveraged to directly attach Containers to them, bypassing the VCH network stack. You can add a new network by choosing “Add Network” in the “Network” parameter.

Here you can choose an existing network, in my case the NSX Logical Switch I named as Container Network.

I change the template name to make it unique and I save it.

Back in the “Edit Template” section you can graphically see that Containers instantiated from this template will be attached to the VIC-Container-Network NSX Logical Switch.

In the Templates view in the default Registry you can now find the customized Nginx and provision a new Container from it using the “PROVISION” button.

At the end of the provisioning process, the new Nginx Container can be found in the Containers section beside the previously deployed Nginx. The difference between the two Containers is that the first has a standard Bridge Network connection, the second is attached to an external Network, as highlighted in the following screenshot.

Looking at the vSphere Web Client, you can see three VMs deployed in the vch1 vApp (the Virtual Container Host): vch1 is the Container Endpoint, the other two VMs are the two deployed Containers with Nginx.
The highlighted Custom_nginx-mcm430… is the Container VM attached to the NSX Logical Switch VIC-Container-Network. The IP address has been assigned by the Edge Gateway providing the DHCP Service for the Container Network.

Based on the expected Architecture shown at the beginning of the article, I’ve already configured the Edge Gateway with the appropriate NAT and Firewall configuration to publish Services delivered by Containers.
The External, consumer-facing interface of the Edge Gateway has an IP address on the External Network and a DNAT rule configured to expose the Nginx Web Server deployed on the Container Network.
Accessing that address correctly exposes the Nginx Home Page.


Some useful commands you may need to use:
“vic-machine-<OS> ls” lists the VCHs deployed in the target environment, giving you the ID you need as input for additional commands.

An example of a command that needs this ID is “vic-machine-<OS> debug”.
This command can be used to enable ssh access to the VCH Endpoint VM and to set the root password on it.


Edge Gateway Configuration.

A simple DNAT configuration. The first rule provides DNAT for the first created Custom_Nginx (the one attached to the Container Network).
The second rule is pre-provisioned for the next Container I’ll deploy: it listens on a different Port (8080) of the same Edge Gateway external IP address and translates to Port 80 of the next IP address to be allocated on the Container Network.

I’ve deployed a second Custom Nginx on the Container Network, reachable pointing to Port 8080 of the Edge Gateway as per DNAT configuration.
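
The two DNAT rules can be pictured as a simple lookup table. The sketch below models that table in Python; it is only an illustration of the mapping logic, not NSX configuration, and all addresses are documentation-range placeholders (RFC 5737) standing in for the real Edge Gateway and Container addresses.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, TCP port)

# Placeholder for the Edge Gateway external, consumer-facing address.
EDGE_EXTERNAL_IP = "192.0.2.10"

# (external IP, external port) -> (Container IP, Container port)
dnat_rules: Dict[Endpoint, Endpoint] = {
    (EDGE_EXTERNAL_IP, 80): ("198.51.100.11", 80),    # first Custom_Nginx
    (EDGE_EXTERNAL_IP, 8080): ("198.51.100.12", 80),  # second Custom_Nginx
}


def translate(destination: Endpoint) -> Optional[Endpoint]:
    """Return the internal endpoint for an inbound destination, or None if no rule matches."""
    return dnat_rules.get(destination)


print(translate((EDGE_EXTERNAL_IP, 8080)))
print(translate((EDGE_EXTERNAL_IP, 443)))
```

Pre-provisioning the second rule works only because the DHCP pool hands out addresses predictably; a different allocation order would require updating the translated address.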

Containers Micro-Segmentation:

The importance of NSX Distributed Firewall “Applied To” setting

By default, Distributed Firewall (DFW) rules configured on NSX Manager are applied to all vNICs of all VMs in the vCenter inventory. The only exclusion is for VMs added to a specific Exclusion List (see my previous post for this topic: NSX Distributed Firewall Exclusion List).

The DFW configuration provides a section called “Applied To” that enables the Administrator to limit the scope of applicability of a specific rule, providing a so-called Point of Enforcement. You can limit the scope of a rule by filtering on any of the objects that NSX Manager recognizes for DFW rules configuration (Clusters, Logical Switches etc.).

Leveraging “Applied To” is fundamental in at least two scenarios:

  1. Avoid processing of unnecessary rules on a vNIC in an environment with hundreds or thousands of configured rules;
  2. Provide the capability to manage Virtual Machines with the same IP Address (Multi Tenancy scenario), avoiding the application of wrong rules to objects.

I want to use a specific example to show “Applied To” in action.

I have a 3-Tier Application composed of the following Virtual Machines:

  • web-01a and web-02a, attached to a Logical Switch called Web-Tier-01;
  • app-01a attached to a Logical Switch called App-Tier-01;
  • db-01a attached to a Logical Switch called DB-Tier-01.

Note: Routing and Load Balancing services have been configured appropriately to publish the application but this configuration is out of scope for the purpose of this article.

To avoid unnecessary traffic flows, I create specific Micro-Segmentation rules to grant communication between Application Tiers (and also intra-tier) only on the needed Ports and Protocols.
I want to obtain the following configuration:

  • No communication between VMs attached to the Web-Tier-01 Logical Switch;
  • Only PING and https traffic allowed from any source to the Load Balancer IP Address and to all VMs attached to the Web-Tier-01 Logical Switch;
  • Only PING and https (port 8443, Tomcat) from all the VMs attached to the Web-Tier-01 Logical Switch to all the VMs attached to the App-Tier-01 Logical Switch;
  • Only PING and TCP Port 3306 (MySQL) from all the VMs attached to the App-Tier-01 Logical Switch to all the VMs attached to the DB-Tier-01 Logical Switch.

The configuration of the NSX DFW is the following:

By default, “Distributed Firewall” is the value configured for the “Applied To” field. This means that all the configured rules will be applied to all the vNICs of all the VMs.
If we focus on the DataBase Virtual Machine, db-01a (attached to the DB-Tier-01 Logical Switch), you can see the unneeded rules (these rules have nothing to do with the DB machine) highlighted in red and the only needed rule highlighted in green. Let’s keep note of the IDs of unneeded rules: 1006, 1007, 1008.
Note: Using Logical Switches instead of the VM name (or the legacy way, the IP Address) enables you to have a dynamic environment: all the VMs you attach to a specific Logical Switch will implicitly inherit the needed rules.

I want to leverage the NSX Command Line to show the different impact of leaving the default “Applied To” value versus changing “Applied To” to a specific scope (the recommended approach).
I will look at the specific vNIC DFW configuration applied at ESXi Host level.

First step, I connect to NSX Manager to leverage the centralized CLI. The first command I run is “show cluster all” to obtain the list of all clusters under the related vCenter management:

The Web, App and DB Virtual Machines are hosted on my Compute Cluster named “Compute Cluster A”. To work on objects using the CLI I need to use their IDs, in this case domain-c33 is the ID of my Compute Cluster.
I run the command “show cluster domain-c33” to obtain the list of all the Hosts that are members of this Cluster:

I decide to look at the configuration of the Host esx-01a.corp.local, running the command “show host host-28“, where host-28 is the ID of my Host:

The output of the “show host” command is the list of all VMs hosted on this specific ESXi Host, with their ID. Our focus is on the DataBase machine, db-01a with ID vm-218.
I run the command “show dfw vm vm-218” to obtain the details of all the vNICs configured in this Virtual Machine. Details are vNIC Name, vNIC ID and Filters.

Filters output shows which filter is applied to the vNIC, in this case the VMware DFW in Slot 2 of the IOChain. All the DFW rules are stored and enforced in this slot.
I use the obtained Filter to get the list of all rules applied to the only vNIC configured on db-01a.
The command I need to run is “show dfw host host-28 filter nic-39230-eth0-vmware-sfw.2“:

From the output of the command you can see that rules with ID 1006, 1007, 1008 are enforced on the db-01a vNIC.
These rules are not needed for the DB Server and only cause overhead in the overall processing of DFW rules.

To maximize the benefits of DFW, I leverage the “Applied To” section to apply only the needed rules to each specific vNIC.
To do this, I change the “Applied To” field to the value of the relevant objects, this is the outcome:

Web Tier Micro-Segmentation rule will only be applied to the vNICs of VMs attached to the Web-Tier-01 Logical Switch.
The Rule that governs traffic flows from any source to the Web Tier will only be applied to the vNICs of VMs attached to the Web-Tier-01 Logical Switch.
Rules that govern traffic flows from the Web Tier to the App Tier will only be applied to Web-Tier-01 and App-Tier-01 Logical Switches.
Rules that govern traffic flows from the App Tier to the DB Tier will only be applied to App-Tier-01 and DB-Tier-01 Logical Switches.

To check the rules enforced at the vNIC level to db-01a, I run the command “show dfw host host-28 filter nic-39230-eth0-vmware-sfw.2“:

Unneeded rules are no longer applied to the DB Virtual Machine vNIC, avoiding unnecessary overhead in the Firewall rules processing.
This shows one of the most important benefits of the “Applied To” section of the NSX Distributed Firewall.
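
To make the effect concrete, here is a small Python model of how “Applied To” decides which rules land on a vNIC. It is a simplified sketch of the scoping logic, not the DFW implementation: the rule ID 1005 for the App to DB rule and the mapping of IDs 1006, 1007 and 1008 to specific rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Stands for the default "Distributed Firewall" scope: every vNIC.
EVERYWHERE: FrozenSet[str] = frozenset({"*"})


@dataclass(frozen=True)
class Rule:
    rule_id: int
    description: str
    applied_to: FrozenSet[str] = EVERYWHERE


def rules_on_vnic(rules: List[Rule], logical_switch: str) -> List[int]:
    """Return the IDs of the rules pushed to a vNIC attached to logical_switch."""
    return [
        rule.rule_id
        for rule in rules
        if rule.applied_to == EVERYWHERE or logical_switch in rule.applied_to
    ]


# Default scope: every rule is applied to every vNIC.
default_scope = [
    Rule(1008, "Web intra-tier: block"),
    Rule(1007, "any -> Web: PING, https"),
    Rule(1006, "Web -> App: PING, 8443"),
    Rule(1005, "App -> DB: PING, 3306"),
]

# Same rules, each scoped to the Logical Switches it actually concerns.
scoped = [
    Rule(1008, "Web intra-tier: block", frozenset({"Web-Tier-01"})),
    Rule(1007, "any -> Web: PING, https", frozenset({"Web-Tier-01"})),
    Rule(1006, "Web -> App: PING, 8443", frozenset({"Web-Tier-01", "App-Tier-01"})),
    Rule(1005, "App -> DB: PING, 3306", frozenset({"App-Tier-01", "DB-Tier-01"})),
]

print(rules_on_vnic(default_scope, "DB-Tier-01"))
print(rules_on_vnic(scoped, "DB-Tier-01"))
```

With the default scope the db-01a vNIC carries all four rules; once each rule is scoped, only the App to DB rule remains on it.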

NSX Distributed Firewall Exclusion List

Quick article on an important topic: don’t lock yourself out when enabling NSX Distributed Firewall.

When you prepare vSphere Clusters for NSX and the DFW kernel module is injected into the Host’s kernel, Distributed Firewall is automatically enabled on any vNIC with a default “allow any-any” rule.

By default, some VMs are excluded from DFW and traffic can flow freely on them:

  • NSX Manager;
  • NSX Controller Cluster;
  • Edge Service Gateways.

It is recommended to manually exclude some other service VMs:

  • vCenter Server;
  • SQL Server Database used by vCenter (if you’re using the Windows version of vCenter);
  • Partners Service Virtual Machines;
  • vCenter Web Server (if installed on a different VM than vCenter).

See the official NSX 6.3 documentation for the Exclusion List.

It may happen that you forget to add vCenter to the exclusion list and then change the default DFW rule to “deny any-any”.
In this case, you will no longer be able to reach your vCenter and manage it using the vSphere Web Client.

To regain access to the vCenter, you can use the following API call against the NSX Manager (remember, NSX Manager is automatically excluded from DFW so you can always call APIs against it!).
You can use your favorite REST Client to perform the operation with the following parameters:
Header: “Content-Type: application/xml”
Header: “Accept: application/xml”
Authentication: “Basic”
DELETE https://nsx_manager_ip/api/4.0/firewall/globalroot-0/config
The API call should return Status Code 204.

This call erases all DFW configuration and resets the default rule to “allow any-any”.
After you regain access to your vCenter, you can load the saved (or auto-saved) firewall configuration.
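
If you prefer a script to a REST Client, the same DELETE call can be prepared with the Python standard library, as in the sketch below. It only builds the request and does not send it; the user name and password are placeholders, and `nsx_manager_ip` is the same stand-in used in the call above. Note that this applies only to NSX releases where the DELETE method is still available.

```python
import base64
import urllib.request


def build_dfw_reset_request(
    nsx_manager: str, user: str, password: str
) -> urllib.request.Request:
    """Prepare (but do not send) the DELETE call that resets the DFW configuration."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{nsx_manager}/api/4.0/firewall/globalroot-0/config",
        method="DELETE",
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
            "Authorization": f"Basic {token}",  # HTTP Basic authentication
        },
    )


# Placeholder credentials for illustration only.
request = build_dfw_reset_request("nsx_manager_ip", "admin", "changeme")
print(request.get_method(), request.full_url)

# Sending it would be urllib.request.urlopen(request); a successful call
# is expected to return Status Code 204.
```

NSX Manager often runs with a self-signed certificate, so an actual call may also need an `ssl` context that trusts it.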

If you don’t have a saved NSX DFW configuration (not a best practice!) and you don’t want to lose your configured rules, my colleague Angel Villar Garea has worked out a way to recover access to vCenter without resetting the overall configuration, by creating a rescue rule. You can check his article for the details.

UPDATE August 11th, 2017:

With NSX 6.3.3, released on August 11th 2017, the previous DELETE API call to erase the entire Firewall configuration has been deprecated.

A new method has been introduced to get the default Firewall configuration.
Use the output of this method to replace the entire configuration or any of the default sections:

  • Get default configuration with GET /api/4.0/firewall/globalroot-0/defaultconfig
  • Update entire configuration with PUT /api/4.0/firewall/globalroot-0/config
  • Update single section with PUT /api/4.0/firewall/globalroot-0/config/layer2sections|layer3sections/{sectionId}
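
The three paths above differ only in their suffix, so a script can build them from a few small helpers. The following Python sketch does that; the helper names are my own, and the section ID 1006 in the usage line is just an example value.

```python
from typing import List, Tuple

FIREWALL_ROOT = "/api/4.0/firewall/globalroot-0"


def default_config_path() -> str:
    """Path that returns the default Firewall configuration (GET)."""
    return f"{FIREWALL_ROOT}/defaultconfig"


def full_config_path() -> str:
    """Path used to replace the entire Firewall configuration (PUT)."""
    return f"{FIREWALL_ROOT}/config"


def section_path(layer: int, section_id: int) -> str:
    """Path used to replace a single Layer 2 or Layer 3 section (PUT)."""
    if layer not in (2, 3):
        raise ValueError("layer must be 2 or 3")
    return f"{FIREWALL_ROOT}/config/layer{layer}sections/{section_id}"


# Reset workflow: read the default configuration, then write it back.
reset_steps: List[Tuple[str, str]] = [
    ("GET", default_config_path()),
    ("PUT", full_config_path()),
]

print(reset_steps)
print(section_path(3, 1006))
```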

NSX Distributed Firewall Sections and Rules via APIs

When you have to create multiple Distributed Firewall Rules, it could be very helpful to leverage NSX APIs.
It could be even more helpful to create different DFW Sections to logically separate different Tenants.

The maximum number of sections you can create under a single NSX Manager is 10,000.
The maximum number of Firewall Rules you can create per NSX Manager is 100,000.

vCloud Director 8.20, thanks to the new NSX integration, enables Micro-Segmentation configuration on a per-Tenant, per-Organization Virtual DataCenter basis.
Distributed Firewall configuration is managed in the scope of every single vDC (note: DFW must be enabled on the vDC by a System Administrator).
In NSX Manager, a different Section is created for every vDC when Distributed Firewall has been enabled.

When you want to leverage the APIs to interact with the Distributed Firewall, you must obtain the Etag for the object you need to modify.
Etag stands for Entity tag and it is an identifier for a specific version of a resource. It’s used like a fingerprint to be sure that the representation of an object has not changed since the last interaction with it.
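
The fingerprint idea is easier to see in a toy model. The Python sketch below is an in-memory stand-in for a server that guards updates with Etags; it is not NSX code, the counter-based Etag values are an assumption (the real ones look like timestamps), and the 412 status is the standard HTTP answer to a failed If-Match precondition rather than something taken from NSX output.

```python
from typing import Dict, Tuple


class SectionStore:
    """Tiny in-memory model of Etag-guarded updates."""

    def __init__(self) -> None:
        self._names: Dict[int, str] = {}
        self._etags: Dict[int, int] = {}
        self._counter = 0

    def _bump(self, section_id: int) -> int:
        # Every successful write produces a new Etag for the object.
        self._counter += 1
        self._etags[section_id] = self._counter
        return self._counter

    def create(self, section_id: int, name: str) -> int:
        self._names[section_id] = name
        return self._bump(section_id)

    def rename(self, section_id: int, new_name: str, if_match: int) -> Tuple[int, int]:
        """Return (status, etag): 200 with a fresh Etag, or 412 with the current one."""
        current = self._etags[section_id]
        if if_match != current:
            return 412, current  # the caller's view of the object is stale
        self._names[section_id] = new_name
        return 200, self._bump(section_id)

    def name(self, section_id: int) -> str:
        return self._names[section_id]


store = SectionStore()
etag_v1 = store.create(1006, "TestSection")
status_ok, etag_v2 = store.rename(1006, "TestSectionRenamed", if_match=etag_v1)
status_stale, _ = store.rename(1006, "AnotherName", if_match=etag_v1)
print(status_ok, status_stale, store.name(1006))
```

The second rename fails because it still carries the first Etag, which is why every change must use the last Etag returned.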

The following are some examples of API interaction with DFW Sections.

  1. Firewall Section Creation
    POST https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections --header 'Content-Type:text/xml'
    Body:
    <section name="TestSection">


    The Response Header contains Etag and Location of the new object (the DFW Section) to be used for future changes.
    Location URL will be used when we’ll modify the created Section (rename it or create/modify Firewall rules under it).

  2. Firewall Section rename (optional, it will modify the Etag)
    PUT https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections/1006 --header 'Content-Type:text/xml' --header 'if-match:"1495902005576"'
    if-match makes the API call conditional: it’s only considered valid if the Etag passed in the call is equal to the one cached on the Server, that is, if the object has not changed in the meantime.
    Body:
    <section name="TestSectionRenamed">



    Result: the previously created Section has been renamed to TestSectionRenamed.
    The new Etag is in the Response Header.

  3. Creation of a new Firewall rule in the created Section
    POST https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections/1006/rules --header 'Content-Type:text/xml' --header 'if-match:"1495902401161"'
    Remember to always use the last generated Etag.

    <rule disabled="false" logged="true">
    <sources excluded="false">

    In the Response Header you can find the Location of the new object (the Firewall Rule) and the new Etag to be used for future changes.

  4. Firewall Rule change
    GET https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections/1006
    The Response Header contains the Etag to be used to modify the rule.
    If no changes happened since the last change, it will be the same one you can see in the screenshot at point 3).
    You can modify the DFW rule with the following call, using the object Location URL obtained before.
    PUT https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections/1006/rules/1007 --header 'Content-Type:text/xml' --header 'if-match:"1495903193596"'

    <rule disabled="false" logged="true">
    <sources excluded="false">

    Here’s the modified rule:

    The new Etag to be used is in the Response Header:

    If you try to modify the FW rule using the wrong Etag, the call is rejected with an error.
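
The calls in the steps above can also be prepared from a script. The Python sketch below builds the Section creation and rule creation requests with the standard library, without sending them. The XML bodies are my reconstruction, not a copy of the bodies used above: the empty section, the rule name `allow-test`, the source address, and the child elements `name`, `action`, `source`, `value` and `type` are assumptions to be checked against the NSX API guide. Authentication is left out for brevity.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Optional

SECTIONS_URL = (
    "https://NSX-MANAGER/api/4.0/firewall/globalroot-0/config/layer3sections"
)


def section_body(name: str) -> bytes:
    """XML body for an empty DFW Section (assumed to be acceptable)."""
    return ET.tostring(ET.Element("section", {"name": name}))


def rule_body(name: str, source_ip: str, action: str = "allow") -> bytes:
    """XML body for a simple rule; element names are assumptions."""
    rule = ET.Element("rule", {"disabled": "false", "logged": "true"})
    ET.SubElement(rule, "name").text = name
    ET.SubElement(rule, "action").text = action
    sources = ET.SubElement(rule, "sources", {"excluded": "false"})
    source = ET.SubElement(sources, "source")
    ET.SubElement(source, "value").text = source_ip
    ET.SubElement(source, "type").text = "Ipv4Address"
    return ET.tostring(rule)


def dfw_request(
    method: str, url: str, body: bytes, etag: Optional[str] = None
) -> urllib.request.Request:
    """Prepare a DFW call; pass etag for any change to an existing object."""
    headers = {"Content-Type": "text/xml"}
    if etag is not None:
        headers["If-Match"] = f'"{etag}"'  # quoted, as in the calls above
    return urllib.request.Request(url, data=body, method=method, headers=headers)


# Step 1: create the Section (no Etag exists yet).
create_section = dfw_request("POST", SECTIONS_URL, section_body("TestSection"))

# Step 3: add a rule to Section 1006 using the last generated Etag.
create_rule = dfw_request(
    "POST",
    f"{SECTIONS_URL}/1006/rules",
    rule_body("allow-test", "192.0.2.25"),  # placeholder rule name and address
    etag="1495902401161",
)

print(create_section.get_method(), create_section.full_url)
print(create_rule.get_header("If-match"))
```

After each call, the Etag and Location values to use next would be read from the response headers.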

You can find the full NSX API documentation at the following link: