For users of AWS Storage Gateway, ensuring that file shares reflect the latest changes in Amazon S3 buckets is critical to prevent access to outdated data. Previously, customers had to initiate cache refreshes for their file shares manually through an API call or run a periodic process that did so for them. That approach was cumbersome for those who wanted a seamless way to keep file shares synchronized with S3 buckets.
Recently, AWS Storage Gateway introduced enhanced cache management features for File Gateway, enabling an automatic cache refresh process. This new functionality allows customers to keep the metadata cache updated with changes in their S3 buckets without the need for manual intervention or a separate management process.
In this article, I will outline the challenges faced by existing users and highlight the benefits of the new cache refresh process. Additionally, I will explain how you can implement this new feature with your File Gateway, whether you are a new or existing AWS Storage Gateway user.
Understanding File Gateway and the RefreshCache API
AWS customers use File Gateway to access virtually limitless cloud storage through standard file protocols such as Server Message Block (SMB) and Network File System (NFS). In many workflows, other applications or users write to the same S3 bucket without going through the gateway’s file shares. Such writers may include another File Gateway, the AWS Command Line Interface (CLI), or the Amazon S3 console. The RefreshCache API was developed to synchronize file shares with changes made to S3 buckets by these writers. While it provided flexibility and control over which directories of the file share to refresh, it brought several challenges.
Many customers previously relied on cron jobs or scripts running with AWS credentials, or on AWS Lambda functions, to invoke the RefreshCache API and sync their file shares with S3 changes. Even though the API allowed refreshing selected directories, identifying which directories needed updating was often a struggle. Consequently, many opted to refresh the entire file share, including directories that were already current, resulting in unnecessary work on the gateway and excessive Amazon S3 API calls. This inefficiency both degraded gateway performance and added avoidable S3 API costs.
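For reference, the kind of scheduled job described above might look roughly like the following sketch, which uses the boto3 `refresh_cache` call. The helper names are my own and the ARN in the usage comment is a placeholder, not a real resource. The client is passed in so that the functions can be exercised without AWS access.

```python
def build_refresh_request(file_share_arn, folders=None, recursive=True):
    """Build keyword arguments for the Storage Gateway RefreshCache call.

    Hypothetical helper: if no folders are given, refresh from the share root.
    """
    if not folders:
        folders = ["/"]
    return {
        "FileShareARN": file_share_arn,
        "FolderList": list(folders),
        "Recursive": recursive,
    }


def refresh_share(client, file_share_arn, folders=None, recursive=True):
    """Ask the gateway to refresh the given folders and return the NotificationId."""
    request = build_refresh_request(file_share_arn, folders, recursive)
    response = client.refresh_cache(**request)
    return response.get("NotificationId")


# Example usage (requires boto3 and valid AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("storagegateway")
#   refresh_share(client, "arn:aws:storagegateway:REGION:ACCOUNT:share/share-EXAMPLE",
#                 folders=["/reports"], recursive=False)
```

Refreshing from the root with `recursive=True` is the "refresh everything" pattern that caused the extra gateway load and S3 requests mentioned above.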
Introducing Automatic Cache Refresh
To tackle these issues and simplify cache management, customers can now delegate the cache refresh process to the gateway using the automated cache refresh feature. The feature uses a per-directory Time To Live (TTL) timer that measures the time since the directory was last refreshed. While the timer is running, access requests treat the directory’s contents as current. Once the timer expires, the next access to the directory triggers a refresh from S3. For instance, if the timer is set to 30 minutes, a directory listing served by the gateway reflects the state of S3 as of no more than 30 minutes earlier. This automated approach eliminates the need for cron jobs, scripts, or AWS Lambda functions, freeing customers to concentrate on other work. Directories refresh only when they are accessed after the timer has expired, which reduces unnecessary work on the gateway and minimizes S3 API calls.
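To make the timer behavior concrete, here is a toy model of it in Python. This is only an illustration of the logic described above, not the gateway's actual implementation. The class name is invented, a directory that has never been seen is treated as needing a refresh, and the timeout is assumed to be positive.

```python
class DirectoryCacheModel:
    """Toy model of a per-directory refresh timer (illustrative only)."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._last_refresh = {}  # directory path -> time of last refresh (seconds)

    def access(self, directory, now):
        """Return True if this access would trigger a refresh from S3."""
        last = self._last_refresh.get(directory)
        if last is not None and now - last < self.ttl_seconds:
            # Timer still running: cached contents are treated as current.
            return False
        # Timer expired (or directory never refreshed): refresh and restart the timer.
        self._last_refresh[directory] = now
        return True


model = DirectoryCacheModel(ttl_seconds=30 * 60)
first = model.access("/reports", now=0)       # triggers a refresh
within = model.access("/reports", now=600)    # served from cache
expired = model.access("/reports", now=1800)  # timer expired, refresh again
```

Note that a directory nobody accesses is never refreshed in this model, which is what keeps the gateway from doing work on directories that are not in use.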
Setting Up Automated Cache Refresh
For both new and existing file shares within AWS Storage Gateway, customers can configure the new cache refresh settings via the Storage Gateway console or the Storage Gateway API. The timer’s duration can range from 5 minutes to 30 days and is specified in seconds (300 to 2,592,000). The setting takes effect once the file share transitions to the “AVAILABLE” state.
Using the Console
For existing file shares, visit the AWS Storage Gateway console and navigate to the File shares tab. Select your file share, click on Actions, and then choose Edit share settings. Here, you can enter the automated cache refresh value and click Save to apply your changes.
If you are creating a new file share, you can specify the automated cache refresh value during configuration.
Through the API
Customers can set the automated cache refresh value by including the CacheStaleTimeoutInSeconds field, which is part of the CacheAttributes structure, in the API request. Setting the value to 0 disables the feature, meaning directories will no longer auto-refresh. The following AWS Storage Gateway APIs accept this field:
- CreateNfsFileShare
- UpdateNfsFileShare
- CreateSmbFileShare
- UpdateSmbFileShare
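As an illustration, the sketch below sets the timeout on an existing NFS file share with boto3. In boto3 the field is nested inside the `CacheAttributes` parameter. The helper names and the range check are my own additions, with the range based on the 5-minute and 30-day limits stated earlier. The ARN in the usage comment is a placeholder.

```python
MIN_TIMEOUT_SECONDS = 5 * 60             # 5 minutes
MAX_TIMEOUT_SECONDS = 30 * 24 * 60 * 60  # 30 days


def cache_attributes(timeout_seconds):
    """Build the CacheAttributes structure; 0 disables automated refresh."""
    in_range = MIN_TIMEOUT_SECONDS <= timeout_seconds <= MAX_TIMEOUT_SECONDS
    if timeout_seconds != 0 and not in_range:
        raise ValueError(
            "timeout must be 0 or between "
            f"{MIN_TIMEOUT_SECONDS} and {MAX_TIMEOUT_SECONDS} seconds"
        )
    return {"CacheStaleTimeoutInSeconds": timeout_seconds}


def set_nfs_cache_timeout(client, file_share_arn, timeout_seconds):
    """Apply the timeout to an existing NFS file share via UpdateNfsFileShare."""
    return client.update_nfs_file_share(
        FileShareARN=file_share_arn,
        CacheAttributes=cache_attributes(timeout_seconds),
    )


# Example usage (requires boto3 and valid AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("storagegateway")
#   set_nfs_cache_timeout(client,
#                         "arn:aws:storagegateway:REGION:ACCOUNT:share/share-EXAMPLE",
#                         30 * 60)
```

The same `CacheAttributes` structure can be passed to the create calls and to the SMB equivalents.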
Final Steps
If you have created resources, including S3 buckets, for testing this new capability, be sure to delete them afterward to avoid unwanted charges. For pricing details, please refer to AWS Storage Gateway pricing.
Conclusion
Before the automated cache refresh feature was introduced for File Gateway, AWS Storage Gateway customers had to manually invoke the RefreshCache API or manage external processes to ensure users had access to current data. While the RefreshCache API provided flexibility and control, it required additional management work.
The new automated cache refresh capability offers a straightforward, hassle-free solution for customers wishing to keep their file shares’ metadata cache synchronized with Amazon S3 buckets. This enhancement significantly reduces the overhead associated with maintaining up-to-date file shares, allowing customers to allocate their time and resources to more valuable tasks.