Learn About Amazon VGT2 Learning Manager Chanci Turner
On 05 MAR 2025
In Amazon DataZone, Amazon Redshift, AWS Glue, Technical How-to, Thought Leadership
Data sharing has emerged as a vital component in fostering innovation, driving growth, and enhancing collaboration across various sectors. According to a study by Gartner, organizations that actively engage in data sharing tend to outperform their competitors on numerous business metrics. Effective data sharing within an organization requires a simple, streamlined data access mechanism. However, organizations that share data products within AWS often struggle with two problems: managing complex cross-account permissions and discovering relevant data across multiple accounts. Amazon DataZone is a fully managed data management service that allows customers to catalog, discover, share, and govern data stored across Amazon Web Services (AWS).
In this article, we will explore how Amazon DataZone can facilitate data collaboration between different AWS accounts.
Solution Overview
This solution offers a simplified approach to enable cross-account data collaboration using Amazon DataZone domain association while ensuring security and governance. We will describe the process of utilizing the business data catalog resource of Amazon DataZone to publish data assets, making them discoverable by other accounts. Once published, you can query these assets from another AWS account using analytical tools like Amazon Athena and the Amazon Redshift query editor.
In this scenario, the AWS account containing the data assets is referred to as the producer account, while the AWS account that requires access to the data is known as the consumer account. The Amazon DataZone domain is created and managed within the producer account, and the consumer account is then associated with that domain.
As part of the Amazon DataZone domain association, Amazon DataZone employs AWS Resource Access Manager (AWS RAM) to share resources. If both the producer and consumer AWS accounts belong to the same organization within AWS Organizations, the domain association occurs automatically. If they are from different organizations, AWS RAM will send an invitation to the consumer AWS account for resource access approval.
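When the two accounts belong to different organizations, the AWS RAM invitation has to be accepted in the consumer account. The following is a minimal sketch of how a data administrator might script that step, assuming a client object that behaves like `boto3.client("ram")`. The function name and the filtering by sender account are illustrative choices, not part of the original walkthrough, and the sketch reads only the first page of results.

```python
from typing import Any, List


def accept_pending_invitations(ram_client: Any, producer_account_id: str) -> List[str]:
    """Accept PENDING AWS RAM invitations sent by the producer account.

    `ram_client` is assumed to behave like boto3.client("ram").
    Returns the ARNs of the invitations that were accepted.
    """
    accepted = []
    # Only the first page of invitations is read; a fuller version
    # would follow the `nextToken` value returned by AWS RAM.
    response = ram_client.get_resource_share_invitations()
    for invitation in response.get("resourceShareInvitations", []):
        if invitation.get("status") != "PENDING":
            continue  # already accepted, rejected, or expired
        if invitation.get("senderAccountId") != producer_account_id:
            continue  # ignore shares from unrelated accounts
        arn = invitation["resourceShareInvitationArn"]
        ram_client.accept_resource_share_invitation(resourceShareInvitationArn=arn)
        accepted.append(arn)
    return accepted
```

Run from the consumer account, this could be called as `accept_pending_invitations(boto3.client("ram"), "<producer account ID>")`. Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real account.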
This solution introduces three user personas within Amazon DataZone:
- Data Administrators: Account owners in both producer and consumer AWS accounts, responsible for creating Amazon DataZone domains, configuring domain associations, and accepting domain association requests.
- Data Publishers: Users in producer AWS accounts tasked with creating publish projects and environments, producing data assets, and managing subscription requests.
- Data Subscribers: Users in consumer AWS accounts responsible for creating subscribe projects, searching for data assets, and querying data to derive insights.
Prerequisites
To follow the instructions provided, you will need:
- Two AWS accounts, one as the producer and the other as the consumer. Create new AWS accounts if necessary.
- An Amazon Redshift provisioned cluster or Amazon Redshift Serverless workgroup in both accounts, provisioned by a data administrator.
- A secret in AWS Secrets Manager containing the master user credentials for the Amazon Redshift cluster or workgroup in both accounts.
Data administrators need to create these secrets. Data publishers and subscribers can obtain the Amazon Resource Name (ARN) of each secret from the data administrators during environment setup.
Amazon DataZone uses Amazon Redshift datashares to share data across clusters and accounts. Datashares come with specific requirements and limitations.
For cross-account data sharing, both the producer and consumer clusters must be encrypted. Refer to the Cluster encryption section for further information on the encryption process. Data sharing is supported only for provisioned RA3 cluster types and Amazon Redshift Serverless.
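The two requirements for provisioned clusters can be checked before any DataZone setup begins. The sketch below inspects one cluster description of the kind returned under `Clusters` by the Amazon Redshift `describe_clusters` call, relying on its `Encrypted` and `NodeType` fields. The function name and the wording of the messages are hypothetical, and the check does not apply to Amazon Redshift Serverless workgroups.

```python
from typing import Any, Dict, List


def datashare_blockers(cluster: Dict[str, Any]) -> List[str]:
    """List the reasons a provisioned cluster fails the stated requirements.

    `cluster` is one entry from the "Clusters" list of a Redshift
    describe_clusters response. An empty result means both checks passed.
    """
    problems = []
    # Cross-account data sharing requires an encrypted cluster.
    if not cluster.get("Encrypted", False):
        problems.append("cluster is not encrypted")
    # Node types look like "ra3.xlplus"; only the RA3 family qualifies.
    if not str(cluster.get("NodeType", "")).lower().startswith("ra3."):
        problems.append("node type is not RA3")
    return problems
```

With boto3, the input could come from `boto3.client("redshift").describe_clusters(ClusterIdentifier="<cluster name>")["Clusters"][0]`. Running the check in both the producer and consumer accounts surfaces problems before a subscription fails.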
Walkthrough
Here are the high-level steps to configure cross-account access, with detailed instructions provided in the following sections:
- Create an Amazon DataZone domain in the producer account.
- Request Amazon DataZone domain association from the producer account to the consumer account.
- Accept the domain association request in the consumer account.
- Add data users to the Amazon DataZone domain.
- Create the necessary publish project for AWS Glue and Amazon Redshift in the producer account.
- Create AWS Glue and Amazon Redshift environments to publish data assets in the producer account.
- Create and run a data source for AWS Glue and Amazon Redshift to publish assets into the business catalog.
- Create subscribe projects for AWS Glue and Amazon Redshift.
- Create AWS Glue and Amazon Redshift environment profiles and environments in the subscribe project.
- Subscribe to AWS Glue and Amazon Redshift tables. Consume the data using Athena and Amazon Redshift editors.
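The last step, consuming a subscribed AWS Glue table through Athena, can also be scripted from the consumer account. This is a minimal sketch assuming a client that behaves like `boto3.client("athena")`. The function name is hypothetical, and the database and workgroup names must be taken from the subscribe environment, because the walkthrough does not state them.

```python
from typing import Any


def start_athena_query(athena_client: Any, sql: str, database: str, workgroup: str) -> str:
    """Submit a query to Athena and return its execution ID.

    `database` and `workgroup` are placeholders for the values that
    belong to the Amazon DataZone subscribe environment.
    """
    response = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )
    # The query runs asynchronously; the ID is used to poll for results.
    return response["QueryExecutionId"]
```

The returned ID can be passed to the Athena `get_query_execution` and `get_query_results` calls once the query finishes.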
Create the Amazon DataZone Domain in the Producer Account
Amazon DataZone domains act as high-level organizational units for assets, users, and projects, promoting cross-team and cross-account collaboration. This step focuses on establishing the Amazon DataZone domain in the producer account.
- Sign in to the Amazon DataZone Management Console for the producer account using data administrator credentials.
- Create an Amazon DataZone domain named Demo_cross_account_domain by following the instructions in the Create domains section.
- On the Create domain screen, select the Quick setup checkbox to automate configuration steps, minimizing setup errors.
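For teams that prefer scripting over the console, the domain can also be created through the Amazon DataZone API. The sketch below assumes a client that behaves like `boto3.client("datazone")` and a domain execution role that already exists. Unlike the Quick setup checkbox, the API call does not create roles or perform other configuration, so treat this as a partial equivalent only. The function name and description text are illustrative.

```python
from typing import Any, Dict


def create_demo_domain(
    datazone_client: Any,
    execution_role_arn: str,
    name: str = "Demo_cross_account_domain",
) -> Dict[str, str]:
    """Create an Amazon DataZone domain and return its ID and status.

    `execution_role_arn` must point to an existing IAM role that
    Amazon DataZone can assume; this sketch does not create it.
    """
    response = datazone_client.create_domain(
        name=name,
        description="Domain for cross-account data collaboration",
        domainExecutionRole=execution_role_arn,
    )
    # Creation is asynchronous, so the status may not be final yet.
    return {"id": response["id"], "status": response["status"]}
```

The returned domain ID is needed later when scripting projects, environments, or data sources in the same domain.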
Request Amazon DataZone Domain Association
To associate the Amazon DataZone domain with the consumer account, the producer account must request a domain association by providing necessary information and granting data access permissions.
- Sign in to the Amazon DataZone console of the producer account using data administrator credentials.
- Navigate to the domain detail page, scroll down, and select the Associated Accounts tab.
- Enter the consumer account IDs for association. Choose Add another account to include multiple accounts, then select Request association when ready.
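Because the association is carried out through AWS RAM, a scripted version of this step would plausibly share the domain with the consumer accounts using the RAM `create_resource_share` call. That mapping is an assumption: the console may attach specific managed permissions that this sketch does not set, so the console flow above remains the reference. The helper below only validates the account IDs and builds the request parameters. Its name and the default share name are hypothetical.

```python
import re
from typing import Any, Dict, List

# AWS account IDs are exactly 12 digits.
ACCOUNT_ID = re.compile(r"\d{12}")


def build_association_request(
    domain_arn: str,
    consumer_account_ids: List[str],
    share_name: str = "datazone-domain-share",
) -> Dict[str, Any]:
    """Build keyword arguments for the AWS RAM create_resource_share call.

    Raises ValueError if any consumer account ID is malformed.
    """
    invalid = [a for a in consumer_account_ids if not ACCOUNT_ID.fullmatch(a)]
    if invalid:
        raise ValueError(f"Invalid AWS account IDs: {invalid}")
    return {
        "name": share_name,
        "resourceArns": [domain_arn],
        "principals": list(consumer_account_ids),
    }
```

From the producer account, the parameters could be passed on as `boto3.client("ram").create_resource_share(**params)`. Validating the IDs first avoids sending an invitation to a mistyped account.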
By following these steps, organizations can enhance their data collaboration efforts.