AWS Storage Gateway for Files integration with VMware Cloud on AWS

SMB/NFS Services for Workloads in VMware Cloud on AWS

In this article I’d like to show you how the native integration between AWS services and VMware Cloud on AWS can provide you with a lot of powerful capabilities. File Services for VMware Cloud on AWS workloads is one of the most common use cases.

We can leverage AWS Storage Gateway – File Gateway to provide our workloads with SMB or NFS shares. This enables us to rethink our approach to the Cloud when it comes to migrating file servers, storing backups, or using services such as Athena to analyze our data once it is stored in S3.

The most common use cases for the File Gateway are: online content repositories, backup to the Cloud, Big Data, Machine Learning and data processing on files stored in S3, and vertical applications that create a lot of long-term-retention files.

Architecture and Service Description

In the following picture you can see the Architecture of the solution we are about to implement.
AWS Storage Gateway is a Virtual Appliance that exposes File Services to VMware workloads. It has been historically deployed on-premises, but now that we have VMware Cloud on AWS, we can take advantage of the high speed and low latency connection provided by the ENI that connects our SDDC with all the native AWS services in the Connected VPC.
The Storage Gateway Appliance will expose SMB and/or NFS shares to our workloads hosted in VMC. Frequently accessed files will be cached locally by the appliance while all other files will be stored in Amazon S3.

Simply put, we can now leverage S3 as our File Server, with the Storage Gateway Appliance exposing S3 objects in the form of files, through SMB or NFS shares, to VMware Cloud on AWS Workloads.
There’s a 1:1 mapping between a file in the share and the related object in S3, and the folder structure is preserved.
With our files in S3, we can also leverage S3 versioning, lifecycle policies and cross-region replication. We can think of a File Gateway as a file system mount on S3.
In addition, as we have the “magic” ENI-based routing connection between our SDDC and the Connected VPC in place, we don’t need to configure a Proxy Server to be able to access S3 from the Storage Gateway deployed in VMC. Routing works out of the box and is automatically managed for us between VMC and the Connected VPC.
Powerful, no?
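
To make the 1:1 mapping between files in the share and objects in S3 more concrete, here is a minimal Python sketch of how a path inside the share could translate into an S3 object key. The function name and the sample path are hypothetical, and the sketch assumes the share exposes the bucket root with no key prefix.

```python
from pathlib import PureWindowsPath


def share_path_to_s3_key(relative_path: str) -> str:
    """Map a path relative to the share root to the matching S3 object key.

    Assumption: the share exposes the bucket root (no key prefix).
    """
    # Drop leading separators so "\\docs\\a.txt" and "docs\\a.txt" agree.
    cleaned = relative_path.lstrip("\\/")
    # PureWindowsPath splits on both "\" and "/", on any operating system.
    return "/".join(PureWindowsPath(cleaned).parts)


# Hypothetical file saved by a Windows VM under the mapped drive.
print(share_path_to_s3_key(r"reports\2019\q1.xlsx"))  # reports/2019/q1.xlsx
```

The folder hierarchy simply becomes the key prefix, which is why lifecycle policies or Athena queries can target a "folder" by prefix.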

This is the High Level Architecture of the solution we are implementing:

File Gateway in VMware Cloud on AWS - Architecture

From a performance perspective, AWS recommends the following: https://docs.aws.amazon.com/storagegateway/latest/userguide/Performance.html#performance-fgw

From a high availability perspective, we can leverage vSphere HA to protect our File Gateway. You can read more here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Performance.html#vmware-ha
We’ll test vSphere HA with File Gateway later, during the deployment wizard.

Preliminary Steps

Get the VPC Subnet and Availability Zone where the SDDC is deployed

We need to accomplish some preliminary steps to gather some information about our SDDC, that we’ll need later. In addition, we need to configure some Firewall Rules to enable communication between our SDDC and the Connected VPC where we’ll configure our Gateway Endpoint.

As a first step, we need to access our VMware Cloud Services console and access VMware Cloud on AWS.

VMware Cloud Services

The second step is to access our SDDC by clicking on “View Details”. Alternatively, you can click on the SDDC name.

VMware Cloud on AWS SDDC

Once in our SDDC, we need to select the “Networking & Security” tab.

SDDC details

In the “Networking & Security” tab, we must head to the “Connected VPC” section, where we can find the VPC subnet and AZ that we chose when we deployed the SDDC. Our SDDC resides there, so any AWS service we configure in this same AZ will not incur traffic charges. We need to take note of the VPC subnet and AZ, as we’ll need this information later.

SDDC Networking & Security

Create SDDC Firewall Rules

The second preliminary step is to enable bi-directional communication between our SDDC and the Connected VPC through the Compute Gateway (CGW). I won’t go through the details of creating the Firewall Rules in this post, but simply highlight the result: for the sake of simplicity, in this example we have one rule allowing any kind of traffic from the Connected VPC Prefixes and S3 Prefixes to any destination, and a second rule allowing the reverse. As you can see, both rules are applied to the VPC Interface, which is the cross-account ENI connecting the SDDC to the Connected VPC.
If we wanted more granular security, we could build our rules from the port requirements listed in the AWS documentation here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Resource_Ports.html

Compute Gateway Firewall Rules
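
Once the rules are in place, a quick TCP probe from a VM in the SDDC can confirm that traffic actually flows through the CGW. The sketch below uses only the Python standard library. The port list is my reading of the AWS port requirements page linked above for a gateway that uses a VPC endpoint, so treat it as an assumption and verify it against that page; the function names are hypothetical.

```python
import socket

# Assumption: TCP ports required towards a Storage Gateway VPC endpoint,
# as listed in the AWS port requirements page. Verify before relying on it.
ENDPOINT_PORTS = (443, 1026, 1027, 1028, 1031, 2222)


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_ports(host: str, ports=ENDPOINT_PORTS) -> dict:
    """Probe each port in turn and return a {port: reachable} mapping."""
    return {port: is_port_open(host, port) for port in ports}
```

Calling `check_ports("10.0.1.25")` (a placeholder for the IP address of the endpoint ENI) returns a dictionary that shows at a glance which ports are blocked.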

Let’s now have a look at the actual implementation of the File Gateway in VMC and how it works.

Start the File Gateway Creation Wizard

First, we need to access the AWS Management Console for the AWS Account linked to the VMware Cloud on AWS SDDC and select “Storage Gateway” from the AWS Services (hint: start typing in the “Find Services” field and the relevant services will be filtered for you). Make sure you are connecting to the right Region where your SDDC and Connected VPC are deployed.

AWS Management Console

If you don’t have any Storage Gateway already deployed, you will be presented with the Get Started page. Click on “Get Started” to create your Storage Gateway (hint: if you already have one or more Storage Gateways deployed, simply click on “Create Gateway” in the landing page for the service).

AWS Storage Gateway - Getting Started Page

You will be presented with the Create Gateway wizard. The first step is to choose the Gateway type. In this scenario, we are focusing on File Services and we will select “File Gateway”. Click “Next”.

Select Gateway Type

The second step is to download the OVA image to be installed on our vSphere Environment in VMC. Click on “Download Image”, then click “Next”.

Download Storage Gateway Image for ESXi

Deploy the Storage Gateway Virtual Appliance in VMware Cloud on AWS

Now that we have downloaded the ESXi image, we’ll momentarily leave the AWS Console and move to our vSphere Client to install the Storage Gateway Virtual Appliance. I’m assuming here that we have the VMware Cloud on AWS SDDC already deployed and that we have access to our vCenter in the Cloud. SDDC deployment is covered in detail in one of my previous posts here: https://www.esvr.cloud/2018/08/10/vmware-cloud-on-aws-lets-create-our-first-vmware-sddc-on-aws/
Head to the Inventory Object where you want to deploy the Virtual Appliance (e.g. Compute-ResourcePool), right click and select “Deploy OVF Template…”

Deploy OVF Template

Select the previously downloaded Virtual Appliance. This is named “aws-storage-gateway-latest.ova” at the time of this writing. Click “Next”.

Choose Storage Gateway OVA

Provide a name for the new Virtual Machine, then click “Next”.

Provide Virtual Machine Name

Confirm the Compute Resource where you want to deploy the Virtual Appliance (e.g. Compute-ResourcePool). Then, click “Next”.

Select Compute Resource

In the “Review details” page, click “Next”.

Deploy OVF Template - Review details

Select the Storage that will host our Virtual Appliance. In VMware Cloud on AWS this will be “WorkloadDatastore”. Click “Next”.

Workload Datastore

Select the destination network for the Virtual Appliance and click “Next”.

Destination Network

In the “Ready to Complete” window, click “Finish” to start the creation of the Storage Gateway Virtual Appliance.

Ready to complete

We now have our Storage Gateway Appliance in the SDDC’s vCenter inventory. Let’s edit the VM to add some storage to be used for caching. To clarify, in addition to the 80 GB base VMDK, the Storage Gateway Appliance must have at least one additional VMDK of at least 150 GB in size. You can see all the Storage Gateway requirements here: https://docs.aws.amazon.com/storagegateway/latest/userguide/Requirements.html
Select the Storage Gateway VM, select “ACTIONS” then “Edit Settings…”.

Storage Gateway Virtual Appliance - Edit Settings

In the “Edit Settings…” window, under Virtual Hardware, add a new disk device by clicking on “ADD NEW DEVICE” and selecting “Hard Disk”.

Add new device - Hard Disk

Select a size of at least 150 GB for the new disk. Then click “OK”.

Set new Hard Disk size

Create VPC Endpoint for Storage Gateway

We can now switch back to the AWS Console, where we should be in the “Service Endpoint” page of the File Gateway deployment wizard. If we’re still in the “Select Platform” window, we can simply click “Next”. As we want to have a private, direct connection between the Storage Gateway appliance and the Storage Gateway Endpoint, we will select “VPC” as our Endpoint Type. Click on the “Create a VPC endpoint” button to open a new window where we can create our endpoint.
A VPC Endpoint is a direct private connection from a VPC to a native AWS Service. With a VPC Endpoint in place, we don’t need an Internet Gateway, NAT Gateway or VPN to access AWS Services from inside our VPC, and instances in the VPC do not require public IP addresses.
A VPC Endpoint for Storage Gateway is based on the PrivateLink networking feature and it is an Interface-based (ENI) Endpoint.

Service Endpoint

In the “Create Endpoint” wizard, we have a couple of choices to make for our Storage Gateway Endpoint: the Service category will be “AWS Services”, then we’ll select the same AZ and subnet where our SDDC is deployed (note: we could select more than one AZ and subnet for better resilience of the endpoint, but we would potentially incur cross-AZ charges, and cross-AZ resiliency for the File Gateway makes little sense unless we also deploy our SDDC in a Stretched Cluster configuration across two AZs). Lastly, we can leave the default security group selected and click on “Create endpoint”.

Create Storage Gateway Endpoint

Once the deployment is finished, we’ll be able to see our VPC Endpoint available in the AWS Console. You can see here that the Endpoint type is “Interface”.

VPC Endpoint in the AWS Console
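
For readers who prefer scripting over the console, the same choices can be expressed as parameters for the EC2 `CreateVpcEndpoint` API. The following Python sketch is not part of the walkthrough: the helper names and the resource IDs are hypothetical placeholders, and the actual call assumes `boto3` is installed and AWS credentials are configured.

```python
def storage_gateway_endpoint_params(region, vpc_id, subnet_id, security_group_id):
    """Build keyword arguments for the EC2 CreateVpcEndpoint call."""
    return {
        # Storage Gateway uses an Interface (PrivateLink) endpoint.
        "VpcEndpointType": "Interface",
        "ServiceName": f"com.amazonaws.{region}.storagegateway",
        "VpcId": vpc_id,
        # A single subnet, in the same AZ as the SDDC.
        "SubnetIds": [subnet_id],
        "SecurityGroupIds": [security_group_id],
    }


def create_endpoint(region, params):
    """Send the request. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_vpc_endpoint(**params)


# Placeholder IDs: replace them with the values noted from the Connected VPC.
params = storage_gateway_endpoint_params(
    "eu-central-1", "vpc-0example", "subnet-0example", "sg-0example"
)
print(params["ServiceName"])
```

Keeping the parameter builder separate from the API call makes it easy to review what would be created before anything is sent to AWS.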

We can now switch back to the File Gateway creation wizard, but before that we must take note of the IP address assigned to our Storage Gateway Endpoint. We could use either the DNS name or the IP address to configure our File Gateway; I’m choosing the IP address in this example. The IP address assigned to the endpoint ENI is visible in the “Subnets” tab, where one ENI is created for each subnet the VPC Endpoint is attached to.

VPC Endpoint subnet attachment

We can now input the IP address of our VPC Endpoint in the Storage Gateway creation wizard. Then, click “Next”.

Service Endpoint

This brings us to the “Connect to Gateway” window. Here, we can input the IP address assigned to the Storage Gateway VM deployed in VMC. Then, click on “Connect to gateway”.

Connect to Gateway

The next step in the wizard is to activate our Gateway. We can review the pre-populated fields and optionally assign a Tag to our Gateway. When done, click on “Activate Gateway”.

Activate Gateway

We’ll get a confirmation message that our Storage (File) Gateway is now active. Additionally, we are presented with the local disk configuration window. In this window we must ensure that one or more disks are allocated to cache, so that the most frequently accessed files are kept locally on the File Gateway itself. When done, click on “Configure logging”.

Configure Cache Disk

In this example we are not configuring CloudWatch logging for this File Gateway, so we can leave the default of “Disable Logging”. The next thing to verify is that our File Gateway can be correctly protected by VMware HA. In VMC we have both VM-level and host-level protection, and vSphere HA comes pre-configured according to best practices, so it can provide high availability to our File Gateway out of the box. Let’s click on “Verify VMware HA” to see this in action.

Gateway Logging

We now get a message asking us to confirm that we want to test VMware HA, along with a reminder that this step is only needed if the File Gateway is deployed on a VMware HA enabled Cluster. Click on “Verify VMware HA”.

Verify VMware HA

This starts the HA test, simulating a failure inside the File Gateway VM causing it to be restarted by VMware HA. We are immediately notified that the test is in progress.

HA test in progress

When the test completes, we are notified that it has completed successfully. We can now click on “Save and continue” to close the wizard.

HA test completed successfully

This brings us back to the AWS Console where we can see that our File Gateway has been successfully created.

File Gateway created successfully

We need to take an additional step before we can actually create our first file share. Until now, we have created a Storage Gateway Endpoint and connected a File Gateway VM to it. To enable routing towards S3, we also have to create an S3 Endpoint in the Connected VPC. The first step in creating an S3 Endpoint is to move back to the AWS Console, select VPC as the service, and then choose “Endpoints”.

VPC Endpoints

Once in the VPC Endpoints window, we can click on “Create Endpoint”.

Create Endpoint

We can now set our Endpoint parameters: select com.amazonaws.<region_code>.s3 as the Service Name, e.g. com.amazonaws.eu-central-1.s3 (hint: if you filter on “S3”, it will be the only option you get). Note that the S3 Endpoint is a “Gateway” Endpoint, which means it works through route table entries rather than through an ENI.
We can accept the proposed VPC and the main route table (these are the default options). We can then click on “Create Endpoint”.
The result of this configuration is a new entry in the default route table of the VPC, whose destination is the S3 prefix list (pl-xxxxxxx) and whose target is the S3 VPC Endpoint (vpce-id).

Create S3 VPC Endpoint

S3 VPC Endpoint
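
As a rough illustration of the settings chosen above, this Python sketch derives the S3 service name from a region code and assembles the parameters a Gateway endpoint needs. The function names and the resource IDs are hypothetical; the resulting dictionary matches the shape expected by the EC2 `CreateVpcEndpoint` API (for instance boto3’s `create_vpc_endpoint`), which is not called here.

```python
def s3_endpoint_service_name(region_code: str) -> str:
    """Return the S3 endpoint service name for a region code."""
    return f"com.amazonaws.{region_code}.s3"


def s3_gateway_endpoint_params(region_code, vpc_id, route_table_id):
    """Build keyword arguments for a Gateway-type S3 VPC endpoint."""
    return {
        # Gateway endpoints are attached to route tables, not to subnets.
        "VpcEndpointType": "Gateway",
        "ServiceName": s3_endpoint_service_name(region_code),
        "VpcId": vpc_id,
        "RouteTableIds": [route_table_id],
    }


# Placeholder IDs for the Connected VPC and its main route table.
print(s3_gateway_endpoint_params("eu-central-1", "vpc-0example", "rtb-0example"))
```

Unlike the Interface endpoint used for Storage Gateway, there are no subnets or security groups to supply here: the route table association is what steers S3-bound traffic to the endpoint.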

We could optionally manage access to the Endpoint by attaching a VPC Endpoint Policy to it. We’ll leave the default option in this example.

Once we have created the S3 VPC Endpoint, still in the AWS Console, we can move back to the Storage Gateway Service window. Here, we can finally create our first file share. Let’s first join the File Gateway to our Active Directory domain, so that we can leverage AD authentication and authorization using ACLs.
To continue and add our File Gateway to our Active Directory, select “Edit SMB settings” from the “Actions” menu.

Edit SMB settings

Here, we’ll add the required fields about our own Active Directory Domain and then click on “Save”. Note how the Active Directory status in this phase is still “Detached”.

Active Directory parameters

A “Join domain request sent” message will be shown and the Active Directory status will change to “Join In Progress”.

Join AD in progress

Once the File Gateway has been joined to Active Directory, the wizard will show us a “Successfully joined domain” message, and the Active Directory status will change to “Joined”.

AD successfully joined

We can now click on the “Create file share” button to create our first share.

Create file share

In the “Configure file share settings” window, we must input the name of the S3 bucket that will host our files (note: the bucket must be already in place. Creation of the S3 bucket is out of scope for this post). Click “Next” when done.

Configure File Share settings

In the next window we can select the S3 storage class we want to use. We can safely leave the default “S3 Standard” selected for our scenario. Additionally, we can choose if we want to create a new IAM role or use an existing one, to access our S3 bucket. When done, click “Next”.

Select S3 Storage Class

In the following window, we can change SMB share settings such as authentication method, read/write access and access controls. We can accept the defaults and have Active Directory authentication, read and write access and access control managed by ACLs.
By default, all Active Directory authenticated users are granted read and write access to our share. We can edit this setting to set a more granular access control based on single users or group membership, if needed. Click “Create file share” when done.

SMB share setting

Here we are. Our SMB share is ready to be used. We can now provide File Services to our workloads hosted in VMware Cloud on AWS. Let’s take note of the command line proposed in the lower part of the window, as we’ll use it momentarily to map the share to a drive letter in a Windows Virtual Machine.

File Services for VMware Cloud on AWS - Share Created

Let’s move to a Windows VM hosted in VMC. In my case, the VM is already joined to my Active Directory domain and I’m connected using my Domain User Account credentials.
I’m pasting the previously copied command into a command prompt. This will map our share to a drive letter in the Windows machine.

Share mapped to Windows drive
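
If you need to map the share on many VMs, the command can be generated rather than copied by hand. The sketch below assumes the console proposes a command of the form `net use <letter>: \\<gateway IP>\<share name>`, which is my understanding of what AWS displays; always compare the output with the command shown for your own share. The function name, IP address and share name are hypothetical.

```python
import string


def build_net_use_command(drive_letter: str, gateway_ip: str, share_name: str) -> str:
    """Build a Windows 'net use' command that maps an SMB share to a drive."""
    # Accept "p", "P" or "P:" and normalise to a single uppercase letter.
    letter = drive_letter.strip().rstrip(":").upper()
    if len(letter) != 1 or letter not in string.ascii_uppercase:
        raise ValueError(f"invalid drive letter: {drive_letter!r}")
    # UNC path: two backslashes, the gateway address, one backslash, the share.
    return f"net use {letter}: \\\\{gateway_ip}\\{share_name}"


# Placeholder values: use the File Gateway VM address and your share name.
print(build_net_use_command("P", "10.0.0.50", "my-bucket"))
```

The generated string can then be fed to a login script or a configuration management tool of your choice.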

The command completed successfully and we now have our SMB share, delivered by the File Gateway and backed by S3.
We can access our P: drive from the Windows File Explorer and upload a file.

File uploaded to the SMB share

We can double check the File Gateway functionality by accessing the S3 bucket that is backing our share and checking whether our uploaded file is there. And there it is, as expected.

S3 bucket
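
The same check can be scripted. This is a minimal sketch, assuming a boto3 S3 client is passed in (for example `boto3.client("s3")`) and that the bucket and key names below are placeholders. It relies on the `list_objects_v2` operation, which returns matching objects under a `Contents` list.

```python
def object_exists(s3_client, bucket: str, key: str) -> bool:
    """Return True if an object with exactly this key exists in the bucket."""
    # Keys are listed in lexicographic order, so an exact match for the
    # prefix, if present, is always the first result: one key is enough.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    # "Contents" is absent from the response when nothing matches.
    return any(obj["Key"] == key for obj in response.get("Contents", []))
```

For example, `object_exists(boto3.client("s3"), "my-bucket", "hello.txt")` would confirm that a file named hello.txt saved at the root of the share has reached the bucket. Keep in mind that the File Gateway uploads to S3 asynchronously, so a freshly written file may take a moment to appear.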

Configure vSAN Policy for the File Gateway VM

The last step we should take to follow AWS best practices is to reserve all disk space for the File Gateway cache disk(s). AWS recommends creating cache disks in Thick Provisioned format, but as we are leveraging vSAN in VMC, we don’t have Thick Provisioning available in the traditional sense. We must use Storage Policies to reserve all disk space for the cache disk. The first step is to go into our vSphere Client and select “Policies and Profiles” from the main Menu.

Policies and Profiles

In the “Policies and Profiles” page, under “VM Storage Policies”, select “Create VM Storage Policy”.

Create VM Storage Policy

In the “Create VM Storage Policy” wizard, enter a name for the policy and click “Next”.

Storage Policy Name and description

In the “Policy Structure” window, select the “Enable rules for vSAN storage” checkbox, then click “Next”.

Storage Policy Structure

In the vSAN window, under “Availability” configuration, we can leave the default settings and switch to the “Advanced Policy Rules” tab.

vSAN - Availability

Once in the “Advanced Policy Rules” tab, we can change the “Object space reservation” field to “Thick provisioning”, leaving all the other fields at their defaults. Then, click “Next”.

vSAN - Advanced Policy Rules

Select the “WorkloadDatastore” and click “Next”.

Storage Compatibility

In the next window we can review all the settings we have made and click “Finish”.

New Storage Policy - Review and Finish

We can now move to our File Gateway Virtual Machine and select “Edit Settings…” under the “ACTIONS” Menu.

Edit File Gateway VM settings

Under the “Virtual Hardware” tab, we can now select the hard disk we assigned to the File Gateway as the cache volume, and assign the newly created Storage Policy to it. Once done, click “OK”. This will pre-assign all the configured disk space to that disk, replacing the default thin provisioning policy.

Assign new Storage Policy to cache disk

This concludes this post.
We have created a File Gateway in VMware Cloud on AWS, delivering an SMB (optionally NFS) share to our workloads, leveraging S3 as the backend storage.
In the next post, we’ll explore the Storage Gateway – Volume Gateway capabilities.
A volume gateway provides cloud-backed storage volumes that you can mount as Internet Small Computer System Interface (iSCSI) devices from your application servers hosted in VMware Cloud on AWS.
Stay tuned! #ESVR
