Migrating InfluxDB OSS 2.x Data to Amazon Timestream for InfluxDB

This article discusses the AWS InfluxDB migration script, which moves your InfluxDB OSS 2.x data to Amazon Timestream for InfluxDB, a managed time series database service developed in collaboration between AWS and InfluxData for users seeking a seamless migration path.

Many InfluxDB users have expressed the need for a quick and efficient method to transfer their databases to this managed environment. To meet this demand, the AWS InfluxDB migration script was developed, enabling the transfer of various data components, including buckets, dashboards, tasks, and other key-value pairs.

You can access the migration script and detailed documentation within the Amazon Timestream tools repository. In this guide, we’ll illustrate how to utilize the AWS InfluxDB migration script to transition your data from existing InfluxDB OSS 2.x instances to Timestream for InfluxDB. Additionally, we will cover a method for conducting a live migration using various AWS resources.

Solution Overview

The following architecture diagram summarizes the solution.

The AWS InfluxDB migration script supports the migration of specific buckets along with their metadata, allowing for multiple buckets from different organizations or a comprehensive migration. The script functions by creating a local backup of the source data, with options to mount an Amazon Simple Storage Service (Amazon S3) bucket, among other choices. By default, data is stored in directories labeled influxdb-backup-, one for each migration. You are responsible for managing the system executing the migration script.

The script includes various options and configurations, such as mounting S3 buckets to minimize local storage usage and selecting specific organizations for migration. Note that the resources utilized in the examples presented may incur costs; refer to AWS pricing for detailed information.

Prerequisites

To run the InfluxDB migration script, your environment must meet the following requirements:

  • A machine operating on Windows, Linux, or macOS.
  • The operator token from your source InfluxDB OSS 2.x instance saved in an environment variable named INFLUX_SRC_TOKEN.
  • The operator token for your destination Timestream for InfluxDB instance saved in an environment variable named INFLUX_DEST_TOKEN. You can find connection information, including tokens, in the documentation on connecting to an Amazon Timestream for InfluxDB instance.
  • Python version 3.7 or higher.
  • The AWS SDK for Python (Boto3) and influxdb_client Python libraries.
  • The Influx CLI installed and included in your PATH.
  • Sufficient disk space for local data storage, unless using an S3 bucket.
  • A stable network connection to both source and destination instances.
  • Optionally, Mountpoint for Amazon S3 (on Linux) or rclone (on Windows and macOS) to mount an S3 bucket locally during migration and conserve local storage.
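The prerequisites above can be verified with a short Python preflight script. This is a sketch for convenience, not part of the migration tooling; it checks the environment variables, Python version, CLI, and libraries listed above:

```python
# Preflight check for the migration prerequisites listed above.
import os
import shutil
import sys

def preflight():
    """Return a list of unmet prerequisites (an empty list means ready)."""
    problems = []
    # Source and destination operator tokens must be in the environment.
    for var in ("INFLUX_SRC_TOKEN", "INFLUX_DEST_TOKEN"):
        if not os.environ.get(var):
            problems.append(f"environment variable {var} is not set")
    # Python 3.7 or higher is required.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or higher is required")
    # The Influx CLI must be on PATH.
    if shutil.which("influx") is None:
        problems.append("the influx CLI is not on PATH")
    # Boto3 and influxdb_client must be importable.
    for module in ("boto3", "influxdb_client"):
        try:
            __import__(module)
        except ImportError:
            problems.append(f"Python library {module} is not installed")
    return problems

if __name__ == "__main__":
    for problem in preflight():
        print("missing prerequisite:", problem)
```

Running the script before a migration surfaces any missing pieces at once instead of one failure at a time.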

Once your environment is prepared, you can migrate a single bucket by executing the command below:

python3 influx_migration.py \
  --src-host http(s)://<source address>:8086 \
  --src-bucket <source bucket> \
  --dest-host http(s)://<destination address>:8086

To explore the script's available configuration options, use the help flag:

python3 influx_migration.py -h

The following provides a high-level summary of the configuration options available:

usage: influx_migration.py [-h] [--src-bucket SRC_BUCKET] [--dest-bucket DEST_BUCKET] [--src-host SRC_HOST] --dest-host DEST_HOST [--full] [--confirm-full] [--src-org SRC_ORG] [--dest-org DEST_ORG] [--csv] [--retry-restore-dir RETRY_RESTORE_DIR] [--dir-name DIR_NAME] [--log-level LOG_LEVEL] [--skip-verify] [--s3-bucket S3_BUCKET] [--allow-unowned-s3-bucket]

For a comprehensive breakdown of each available option, refer to the README file of the AWS InfluxDB migration script.

Running the AWS InfluxDB Migration Script

After ensuring all prerequisites are satisfied, follow these steps to execute the migration script:

Open your preferred terminal application and run the Python script to transfer data from your source InfluxDB instance to the destination. Ensure that the host addresses and ports are specified as CLI options, and that the INFLUX_SRC_TOKEN and INFLUX_DEST_TOKEN environment variables are set. The default port for InfluxDB is 8086. For example:

python3 influx_migration.py \
  --src-host http(s)://<source address>:8086 \
  --src-bucket <source bucket> \
  --dest-host http(s)://<destination address>:8086

To validate the successful migration of your data, perform the following checks:

  1. Access the InfluxDB UI of your Timestream for InfluxDB instance and review the buckets.
  2. List buckets with the following Influx CLI command:

    influx bucket list \
      -t <destination token> \
      --host http(s)://<destination address>:8086 \
      --org <organization name>

  3. Use the Influx CLI to run two InfluxQL queries that examine a bucket's contents and confirm the number of migrated records:

    influx v1 shell \
      -t <destination token> \
      --host http(s)://<destination address>:8086 \
      --org <organization name>

    SELECT * FROM <migrated bucket>.<retention policy>.<measurement name> LIMIT 100
    SELECT COUNT(*) FROM <migrated bucket>.<retention policy>.<measurement name>

  4. Run a Flux query with the following command:

    influx query \
      -t <destination token> \
      --host http(s)://<destination address>:8086 \
      --org <organization name> \
      'from(bucket: "<migrated bucket>")
          |> range(start: <desired start>, stop: <desired stop>)'

    You can also append |> count() to the query to verify the total number of migrated records.
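The same count check can be scripted with the influxdb_client Python library already required as a prerequisite. The sketch below is an illustration rather than part of the migration script; the URL, token, org, and bucket values are placeholders you would replace with your own:

```python
# Count all records in a bucket via a Flux query.

def count_query(bucket, start="1970-01-01T00:00:00Z"):
    """Build a Flux query that counts every record in a bucket."""
    return (
        f'from(bucket: "{bucket}") '
        f"|> range(start: {start}) "
        "|> group() "   # merge all series into a single table
        "|> count()"
    )

def total_records(url, token, org, bucket):
    """Run the count query and sum the per-table results."""
    from influxdb_client import InfluxDBClient  # pip install influxdb-client
    with InfluxDBClient(url=url, token=token, org=org) as client:
        tables = client.query_api().query(count_query(bucket))
        return sum(r.get_value() for t in tables for r in t.records)

# Example usage (placeholder values):
# total_records("http://<destination address>:8086", "<destination token>",
#               "<organization name>", "<migrated bucket>")
```

Running total_records against both the source and destination instances and comparing the two sums gives a quick end-to-end consistency check.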

Example Migration Execution

Here’s a quick rundown of how to set up the InfluxDB migration script and use it to migrate a single bucket:

  1. Open your terminal application and verify that all prerequisites are installed.
  2. Navigate to the directory containing the migration script.
  3. Gather the following information:
    • The name of the source bucket to be migrated.
    • Optionally, a new name for the migrated bucket on the destination server.
    • The root token for both the source and destination InfluxDB instances.
    • The host addresses of both the source and destination instances.
    • Optionally, the name and credentials for an S3 bucket.
  4. If you are using an S3 bucket, set the AWS Command Line Interface (AWS CLI) credentials as environment variables:

    # AWS credentials
    export AWS_ACCESS_KEY_ID="xxx"
    export AWS_SECRET_ACCESS_KEY="xxx"

  5. Construct and run the command:

    python3 influx_migration.py \
      --src-bucket <existing source bucket name> \
      --dest-bucket <new destination bucket name> \
      --src-host http(s)://<source address>:8086 \
      --dest-host http(s)://<destination address>:8086
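If you script repeated migrations, the command line above can be assembled programmatically. The helper below is a hypothetical convenience wrapper; the flag names come from the script's usage summary shown earlier:

```python
# Build the influx_migration.py argument list from the gathered values.

def build_command(src_host, dest_host, src_bucket,
                  dest_bucket=None, s3_bucket=None):
    """Return the argument list for a single-bucket migration."""
    cmd = [
        "python3", "influx_migration.py",
        "--src-host", src_host,
        "--dest-host", dest_host,
        "--src-bucket", src_bucket,
    ]
    if dest_bucket:
        # Rename the bucket on the destination server.
        cmd += ["--dest-bucket", dest_bucket]
    if s3_bucket:
        # Back up through S3 instead of local disk.
        cmd += ["--s3-bucket", s3_bucket]
    return cmd

# Example usage (placeholder values), e.g. with subprocess.run(..., check=True):
# build_command("http://<source address>:8086",
#               "http://<destination address>:8086", "<source bucket>")
```

Keeping the flag logic in one place makes it easy to loop over several buckets with the same host and credential settings.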


This migration process not only simplifies the transition to Timestream for InfluxDB but also enhances the management of your time series data.

