SMB/NFS Services for Workloads in VMware Cloud on AWS
In this article I’d like to show you how the native integration between AWS Services and VMware Cloud on AWS provides powerful capabilities. File Services for VMware Cloud on AWS workloads is one of the most common use cases.
We can leverage AWS Storage Gateway – File Gateway to provide our workloads with SMB or NFS shares. This enables us to re-think our approach to the Cloud when it comes to migrating File Servers, storing backups, or leveraging services such as Athena to analyze our data once it is stored in S3.
The most common use cases for the File Gateway are: online content repositories, backup to Cloud, Big Data/Machine Learning/data processing leveraging files stored in S3, and vertical applications that create many files with long-term retention requirements.
Architecture and Service Description
In the following picture you can see the Architecture of the solution we are about to implement.
AWS Storage Gateway is a Virtual Appliance that exposes File Services to VMware workloads. It has been historically deployed on-premises, but now that we have VMware Cloud on AWS, we can take advantage of the high speed and low latency connection provided by the ENI that connects our SDDC with all the native AWS services in the Connected VPC.
The Storage Gateway Appliance will expose SMB and/or NFS shares to our workloads hosted in VMC. Frequently accessed files will be cached locally by the appliance while all other files will be stored in Amazon S3.
Simply put, we can now leverage S3 as our File Server, with the Storage Gateway Appliance exposing S3 objects in the form of files, through SMB or NFS shares, to VMware Cloud on AWS Workloads.
There’s a 1:1 mapping between a file in the share and the related object in S3, and the folder structure is preserved.
With our files in S3, we can also leverage S3 versioning, lifecycle policies and cross-region replication. We can think of a File Gateway as a file system mount on S3.
In addition, as we have the “magic” ENI-based routing connection between our SDDC and the Connected VPC in place, we don’t need to configure a Proxy Server to be able to access S3 from the Storage Gateway deployed in VMC. Routing works out of the box and is automatically managed for us between VMC and the Connected VPC.
This is the High Level Architecture of the solution we are implementing:
From a performance perspective, AWS recommends the following: https://docs.aws.amazon.com/storagegateway/latest/userguide/Performance.html#performance-fgw
From a high availability perspective, we can leverage vSphere HA to provide high availability for our File Gateway. You can read more here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Performance.html#vmware-ha
We’ll test vSphere HA with File Gateway later, during the deployment wizard.
Get the VPC Subnet and Availability Zone where the SDDC is deployed
We need to accomplish some preliminary steps to gather information about our SDDC that we’ll need later. In addition, we need to configure some Firewall Rules to enable communication between our SDDC and the Connected VPC where we’ll configure our Gateway Endpoint.
As a first step, we need to access our VMware Cloud Services console and access VMware Cloud on AWS.
The second step is to access our SDDC by clicking on “View Details”. Alternatively, you can click on the SDDC name.
Once in our SDDC, we need to select the “Networking & Security” tab.
In the “Networking & Security” tab, we must head to the “Connected VPC” section, where we can find the VPC subnet and AZ that we chose when deploying the SDDC. Our SDDC resides there, therefore any AWS service we configure in this same AZ will not incur cross-AZ traffic charges. We need to keep note of the VPC subnet and AZ as we’ll need this information later.
Create SDDC Firewall Rules
The second preliminary step is to enable bi-directional communication between our SDDC and the Connected VPC through the Compute Gateway (CGW). I won’t go through the details of the Firewall Rules creation in this post, but simply highlight the result: for the sake of simplicity, in this example we have a rule allowing any kind of traffic from the Connected VPC Prefixes and S3 Prefixes to any destination, and vice versa. As you can see, both rules are applied to the VPC Interface, which is the cross-Account ENI connecting the SDDC to the Connected VPC.
If we would like to configure more granular security, we can do so by leveraging the port information in the AWS documentation here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Resource_Ports.html
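As a sketch of what tighter rules could look like on the AWS side, the snippet below opens only the ports that a File Gateway uses to talk to its Storage Gateway VPC endpoint (443, 1026–1028, 1031 and 2222, per the AWS documentation linked above). The security group ID and SDDC CIDR are hypothetical placeholders — substitute your own values.

```shell
# Hypothetical IDs -- replace with your own values.
SG_ID="sg-0123456789abcdef0"   # security group attached to the Storage Gateway VPC endpoint
SDDC_CIDR="192.168.1.0/24"     # SDDC compute network hosting the gateway appliance

# Allow the gateway appliance to reach the VPC endpoint only on the
# ports documented for Storage Gateway activation and operation.
for PORT in 443 1026 1027 1028 1031 2222; do
  aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" \
    --protocol tcp \
    --port "$PORT" \
    --cidr "$SDDC_CIDR"
done
```

A matching, equally narrow rule set would then be applied on the SDDC Compute Gateway instead of the any/any rules used in this example.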
Let’s now have a look at the actual implementation of the File Gateway in VMC and how it works.
Start the Storage Gateway Creation Wizard
First, we need to access the AWS Management Console for the AWS Account linked to the VMware Cloud on AWS SDDC and select “Storage Gateway” from the AWS Services (hint: start typing in the “Find Services” field and the relevant services will be filtered for you). Make sure you are connecting to the right Region where your SDDC and Connected VPC are deployed.
If you don’t have any Storage Gateway already deployed, you will be presented with the Get Started page. Click on “Get Started” to create your Storage Gateway. (hint: if you already have one or more Storage Gateways deployed, simply click on “Create Gateway” in the landing page for the service).
You will be presented with the Create Gateway wizard. The first step is to choose the Gateway type. In this scenario, we are focusing on File Services and we will select “File Gateway”. Click “Next”.
The second step is to download the OVA image to be installed on our vSphere Environment in VMC. Click on “Download Image”, then click “Next”.
Deploy the Storage Gateway Virtual Appliance in VMware Cloud on AWS
Now that we have downloaded the OVA image, we’ll momentarily leave the AWS Console and move to our vSphere Client to install the Storage Gateway Virtual Appliance. I’m assuming here that the VMware Cloud on AWS SDDC is already deployed and we have access to our vCenter in the Cloud. SDDC deployment is covered in detail in one of my previous posts here: https://www.esvr.cloud/2018/08/10/vmware-cloud-on-aws-lets-create-our-first-vmware-sddc-on-aws/
Head to the Inventory Object where you want to deploy the Virtual Appliance (e.g. Compute-ResourcePool), right click and select “Deploy OVF Template…”
Select the previously downloaded Virtual Appliance. This is named “aws-storage-gateway-latest.ova” at the time of this writing. Click “Next”.
Provide a name for the new Virtual Machine, then click “Next”.
Confirm the Compute Resource where you want to deploy the Virtual Appliance (e.g. Compute-ResourcePool). Then, click “Next”.
In the “Review details” page, click “Next”.
Select the Storage that will host our Virtual Appliance. In VMware Cloud on AWS this will be “WorkloadDatastore”. Click “Next”.
Select the destination network for the Virtual Appliance and click “Next”.
In the “Ready to Complete” window, click “Finish” to start the creation of the Storage Gateway Virtual Appliance.
We now have our Storage Gateway Appliance in the SDDC’s vCenter inventory. Let’s edit the VM to add some storage to be used for caching. To clarify, in addition to the 80 GB base VMDK, the Storage Gateway Appliance must have at least one additional VMDK of at least 150 GB in size. You can see all the Storage Gateway requirements here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Requirements.html
Select the Storage Gateway VM, select “ACTIONS” then “Edit Settings…”.
In the “Edit Settings…” window, under Virtual Hardware, add a new disk device by clicking on “ADD NEW DEVICE” and selecting “Hard Disk”.
Select a size of at least 150 GB for the new disk. Then click “OK”.
Create VPC Endpoint for Storage Gateway
We can now switch back to the AWS Console, where we should be in the “Service Endpoint” page of the File Gateway deployment wizard. If we’re still in the “Select Platform” window, we can simply click “Next”. As we want a private, direct connection between the Storage Gateway appliance and the Storage Gateway Endpoint, we will select “VPC” as our Endpoint Type. Click on the “Create a VPC endpoint” button to open a new window where we can create our endpoint.
A VPC Endpoint is a direct private connection from a VPC to a native AWS Service. With a VPC Endpoint in place, we don’t need an Internet Gateway, NAT Gateway or VPN to access AWS Services from inside our VPC, and instances in the VPC do not require public IP addresses.
A VPC Endpoint for Storage Gateway is based on the PrivateLink networking feature and it is an Interface-based (ENI) Endpoint.
In the “Create Endpoint” wizard, we have a couple of choices to make for our Storage Gateway Endpoint: the Service category will be “AWS Services”, then we’ll select the same AZ and subnet where our SDDC is deployed (note: we could select more than one AZ and subnet for better resilience of the endpoint, but we would potentially incur cross-AZ charges, and cross-AZ resiliency of the File Gateway makes little sense unless the SDDC itself is deployed in a Stretched Cluster configuration spanning two AZs). Lastly, we can leave the default security group selected and click on “Create endpoint”.
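The same interface endpoint can also be created from the AWS CLI. This is a hedged sketch: the Region, VPC, subnet and security group IDs below are hypothetical placeholders for your own values.

```shell
# Create an Interface-type VPC endpoint for the Storage Gateway service
# in the same subnet/AZ as the SDDC (hypothetical IDs).
aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.eu-central-1.storagegateway \
  --vpc-id vpc-0123456789abcdef0 \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0
```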
Once the deployment is finished, we’ll be able to see our VPC Endpoint available in the AWS Console. You can see here that the Endpoint type is “Interface”.
We can now switch back to the File Gateway creation wizard, but first we must take note of the IP address assigned to our Storage Endpoint. We could use either the DNS name or the IP address to configure our File Gateway; I’m choosing the IP address in this example. The IP address assigned to the ENI (Storage Endpoint) is visible in the “Subnets” tab, where one ENI is created for each Subnet the VPC Endpoint is attached to.
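If you prefer the CLI, the same IP address can be looked up by reading the endpoint’s ENI. The endpoint ID below is a hypothetical placeholder.

```shell
# Hypothetical endpoint ID -- replace with your own.
VPCE_ID="vpce-0123456789abcdef0"

# Get the first ENI attached to the VPC endpoint...
ENI_ID=$(aws ec2 describe-vpc-endpoints \
  --vpc-endpoint-ids "$VPCE_ID" \
  --query 'VpcEndpoints[0].NetworkInterfaceIds[0]' --output text)

# ...then read its private IP, which is what we feed to the wizard.
aws ec2 describe-network-interfaces \
  --network-interface-ids "$ENI_ID" \
  --query 'NetworkInterfaces[0].PrivateIpAddress' --output text
```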
We can now input the IP address of our VPC Endpoint in the Storage Gateway creation wizard. Then, click “Next”.
This brings us to the “Connect to Gateway” window. Here, we can input the IP address assigned to the Storage Gateway VM deployed in VMC. Then, click on “Connect to gateway”.
The next step in the wizard is to activate our Gateway. We can review the pre-compiled fields and optionally assign a Tag to our Gateway. When done, click on “Activate Gateway”.
We’ll get a confirmation message that our Storage (File) Gateway is now active. Additionally, we are presented with the local disk configuration window. In this window we must ensure that one or more disks are allocated as cache, to store the most frequently accessed files locally on the File Gateway itself. When done, click on “Configure logging”.
In this example we are not configuring CloudWatch logging for this File Gateway, so we can leave the default of “Disable Logging”. We can now verify that our File Gateway can be correctly protected by VMware HA. In VMC we have both VM-level and Host-level protection, with all settings pre-configured out-of-the-box based on best practices, so vSphere HA can provide high availability to our File Gateway without any manual tuning. Let’s click on “Verify VMware HA” to see this in action.
We are now getting a message asking us to confirm that we want to test VMware HA and also providing us with a reminder that this step is only needed if the File Gateway is deployed on a VMware HA enabled Cluster. Click on “Verify VMware HA”.
This starts the HA test, simulating a failure inside the File Gateway VM causing it to be restarted by VMware HA. We are immediately notified that the test is in progress.
When the test completes, we are notified that it has completed successfully. We can now click on “Save and continue” to close the wizard.
This brings us back to the AWS Console where we can see that our File Gateway has been successfully created.
We need to take an additional step before we can actually create our first file share. So far we have created a Storage Gateway Endpoint and connected a File Gateway VM to it. To make the Storage Gateway Endpoint capable of routing to S3, we also need to create an S3 Endpoint in the Connected VPC.
Create S3 VPC Endpoint
The first step to create an S3 Endpoint is to move back to the AWS Console, select VPC as the Service, and then choose “Endpoints”.
Once in the VPC Endpoints window, we can click on “Create Endpoint”.
We can now set our Endpoint parameters: select com.amazonaws.<region_code>.s3 as the Service Name, e.g. com.amazonaws.eu-central-1.s3 (hint: filtering on “S3”, it will be the only option you get). Note that the S3 Endpoint is a “Gateway” Endpoint and is therefore based on route table entries.
We can accept the proposed VPC and the main route table (these are the default options). We could optionally manage access to the Endpoint by attaching a VPC Endpoint Policy to it; we’ll leave the default option in this example. We can then click on “Create Endpoint”.
The result of this configuration is a new entry in the default routing table of the VPC, with the S3 prefix list (pl-xxxxxxx) as the destination, targeted to the S3 VPC Endpoint (vpce-id).
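The S3 Gateway endpoint can also be created from the AWS CLI; note the different endpoint type compared to the Storage Gateway endpoint. The VPC and route table IDs are hypothetical placeholders.

```shell
# Create a Gateway-type VPC endpoint for S3 and attach it to the
# VPC's main route table (hypothetical IDs).
aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.eu-central-1.s3 \
  --vpc-id vpc-0123456789abcdef0 \
  --route-table-ids rtb-0123456789abcdef0
```

After this call, the route table gains the pl-xxxxxxx → vpce-id entry described above.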
Once we have created the S3 VPC Endpoint, still in the AWS Console we can move back to the Storage Gateway Service window. Here, we can finally create our first file share. Let’s first add the File Gateway to our Active Directory, this way we can leverage AD authentication and authorization using ACLs.
To continue and add our File Gateway to our Active Directory, select “Edit SMB settings” from the “Actions” menu.
Here, we’ll fill in the required fields for our Active Directory Domain and then click on “Save”. Note how the Active Directory status in this phase is still “Detached”.
A “Join domain request sent” message will be shown and the Active Directory status will change to “Join In Progress”.
Once the File Gateway has been joined to Active Directory, the wizard will show us a “Successfully joined domain” message, and the Active Directory status will change to “Joined”.
We can now click on the “Create file share” button to create our first share.
In the “Configure file share settings” window, we must input the name of the S3 bucket that will host our files (note: the bucket must be already in place. Creation of the S3 bucket is out of scope for this post). Click “Next” when done.
In the next window we can select the S3 storage class we want to use. We can safely leave the default “S3 Standard” selected for our scenario. Additionally, we can choose if we want to create a new IAM role or use an existing one, to access our S3 bucket. When done, click “Next”.
In the following window, we can change SMB share settings such as authentication method, read/write access and access controls. We can accept the defaults and have Active Directory authentication, read and write access and access control managed by ACLs.
By default, all Active Directory authenticated users are granted read and write access to our share. We can edit this setting to set a more granular access control based on single users or group membership, if needed. Click “Create file share” when done.
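For reference, the same share can also be created from the AWS CLI. This is a hedged sketch: the client token, gateway ARN, IAM role ARN and bucket name below are hypothetical placeholders for your own values.

```shell
# Create an SMB file share on the gateway, backed by an existing S3
# bucket and authenticated against Active Directory (hypothetical ARNs).
aws storagegateway create-smb-file-share \
  --client-token my-share-token-1 \
  --gateway-arn arn:aws:storagegateway:eu-central-1:111122223333:gateway/sgw-ABCD1234 \
  --role arn:aws:iam::111122223333:role/StorageGatewayS3AccessRole \
  --location-arn arn:aws:s3:::my-file-gateway-bucket \
  --authentication ActiveDirectory
```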
Here we are. Our SMB share is ready to be used. We can now provide File Services to our workloads hosted in VMware Cloud on AWS. Let’s take note of the command line proposed at the bottom of the window, as we’ll use it momentarily to map the share to a drive letter in a Windows Virtual Machine.
Let’s move to a Windows VM hosted in VMC. In my case, the VM is already joined to my Active Directory domain and I’m connected using my Domain User Account credentials.
I’m copying the previously noted command and pasting it into a command prompt; this maps our share to a drive letter on the Windows machine.
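The proposed command has the following shape — the gateway IP and bucket name here are hypothetical; the share name matches the backing S3 bucket:

```shell
:: Run from a command prompt on the Windows VM (hypothetical values).
:: The gateway exposes the share as \\<gateway-ip>\<bucket-name>.
net use P: \\10.10.10.5\my-file-gateway-bucket
```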
The command completed successfully and we now have our SMB share, delivered by the File Gateway and backed by S3.
We can access our P: drive from the Windows File Explorer and upload a file.
We can double check the File Gateway functionality by accessing the S3 bucket that is backing our share and checking if our uploaded file is there. And there it is, as expected.
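The same check can be done from the AWS CLI; the bucket name is a hypothetical placeholder. Because of the 1:1 file-to-object mapping, the uploaded file should appear as an object whose key matches its path in the share.

```shell
# List every object in the bucket backing the share (hypothetical name);
# the uploaded file should show up with its folder path as the key.
aws s3 ls s3://my-file-gateway-bucket/ --recursive
```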
Configure vSAN Policy for the File Gateway VM
The last step we should take to follow AWS best practices is to reserve all disk space for the File Gateway cache disk(s). AWS recommends creating cache disks with Thick Provisioned format, but as we are leveraging vSAN in VMC, Thick Provisioning is not available in the traditional sense. We must use Storage Policies to reserve all disk space for the cache disk. The first step is to go into our vSphere Client and select “Policies and Profiles” from the main Menu.
In the “Policies and Profiles” page, under “VM Storage Policies”, select “Create VM Storage Policy”.
In the “Create VM Storage Policy”, select a name for the policy and click “Next”.
In the “Policy Structure” window, set the flag on “Enable rules for vSAN storage”, then click “Next”.
In the vSAN window, under “Availability” configuration, we can leave the default settings and switch to the “Advanced Policy Rules” tab.
Once in the “Advanced Policy Rules” tab, we can change the “Object space reservation” field to “Thick provisioning”, leaving all the other fields at their defaults. Then, click “Next”.
Select the “WorkloadDatastore” and click “Next”.
In the next window we can review all the settings we have made and click “Finish”.
We can now move to our File Gateway Virtual Machine and select “Edit Settings…” under the “ACTIONS” Menu.
Under the “Virtual Hardware” tab, we can now select the hard disk we assigned to the File Gateway as the cache volume, and assign the newly created Storage Policy to it. Once done, click “OK”. This will pre-assign all the configured disk space to that disk, replacing the default thin provisioning policy.
This concludes this post.
We have created a File Gateway in VMware Cloud on AWS, delivering an SMB (optionally NFS) share to our workloads, leveraging S3 as the backend storage.
In the next post, we’ll explore the Storage Gateway – Volume Gateway capabilities.
A volume gateway provides cloud-backed storage volumes that you can mount as Internet Small Computer System Interface (iSCSI) devices from your application servers hosted in VMware Cloud on AWS.
Stay tuned! #ESVR