Resetting Your Graph Data in Amazon Neptune: A Quick Guide

If you build graph applications on Amazon Neptune, you often need to delete and reload graph data, for example to pick up changed relationships between nodes or to swap test data for a production dataset. Previously, developers had to either identify the differences between large datasets and insert them incrementally, or delete the entire graph database and create a new Neptune cluster. Both approaches were time-consuming and added complexity, such as adjusting client configurations and managing policies for the new cluster. Starting with Neptune engine release 1.0.4.0, this process can be automated, which makes it significantly more efficient.

Historically, deleting a graph in Amazon Neptune could be slow because deletion is transactional. For Resource Description Framework (RDF) graphs, you had to run a cascading delete through SPARQL. For property graphs, you might start the deletion with g.V().drop() in the Gremlin console. Depending on the size of the graph, this command could time out or trigger out-of-memory errors, leading to a frustrating cycle of extending query timeouts and retrying. Often you had to rewrite the queries to delete the graph in smaller pieces, either by dropping edges before vertices or by writing a multi-threaded Gremlin query.

The newly implemented database reset functionality in Neptune allows for the complete removal of data from your graph database through REST APIs or built-in commands available via Neptune Workbench. This reset capability is compatible with both property graphs and RDF graphs. In this article, we will delve into the specifics of the database reset process and the implications of its execution.

Solution Overview

The database reset process in Neptune consists of two steps. The initial step involves issuing a time-sensitive token (valid for 60 minutes), while the second step utilizes this token to perform the actual reset. This two-step method provides a safeguard against inadvertent deletions.

REST APIs

Amazon Neptune introduces a new /system endpoint for executing the database reset in two stages: initiateDatabaseReset and performDatabaseReset.

initiateDatabaseReset

This step is an HTTP POST request to the /system endpoint, shown here with curl:

curl -X POST \
-H 'Content-Type: application/json' \
https://neptune-writer-endpoint:8182/system \
-d '{ "action" : "initiateDatabaseReset" }'

Alternatively, you can use:

curl -X POST \
-H 'Content-Type: application/x-www-form-urlencoded' \
https://neptune-writer-endpoint:8182/system \
-d 'action=initiateDatabaseReset'

The response will be in JSON format and will provide a reset token:

{
"status" : "200 OK",
"payload" : {
"token" : "ef478d76-d9da-4d94-8ff1-08d9d4863aa5"
}
}
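
In a script, you usually want the token in a variable rather than copying it by hand. The following is a minimal sketch that pulls the token out of the response with sed. The function name extract_token and the NEPTUNE_ENDPOINT variable are illustrative assumptions, and the parsing relies on the response having the shape shown above; a JSON-aware tool such as jq would be more robust if it is available.

```shell
# Extract the "token" value from an initiateDatabaseReset response body.
# Assumes the body contains a single "token" field holding a quoted string.
# extract_token is a hypothetical helper name, not part of Neptune.
extract_token() {
  printf '%s' "$1" | tr -d '\n' |
    sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical usage; NEPTUNE_ENDPOINT is a placeholder for your writer endpoint.
# response=$(curl -s -X POST -H 'Content-Type: application/json' \
#   "https://${NEPTUNE_ENDPOINT}:8182/system" \
#   -d '{ "action" : "initiateDatabaseReset" }')
# token=$(extract_token "$response")
```

If the response contains no token field, the function prints nothing, so the caller can test for an empty string before continuing.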

performDatabaseReset

The second step uses the token obtained from the initiateDatabaseReset command. This action also takes place at the /system endpoint via curl. Here’s how it looks:

curl -X POST \
-H 'Content-Type: application/x-www-form-urlencoded' \
https://neptune-writer-endpoint:8182/system \
-d 'action=performDatabaseReset&token=ef478d76-d9da-4d94-8ff1-08d9d4863aa5'

Or:

curl -X POST \
-H 'Content-Type: application/json' \
https://neptune-writer-endpoint:8182/system \
-d '{
"action" : "performDatabaseReset",
"token" : "ef478d76-d9da-4d94-8ff1-08d9d4863aa5"
}'

The JSON response will confirm the reset:

{
"status" : "200 OK"
}
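
The examples above embed a literal token inside single quotes. When the token is held in a shell variable, single quotes prevent it from being expanded, so the request body has to be built differently. One possible approach, sketched here with a hypothetical helper named build_reset_payload, is to format the JSON with printf:

```shell
# Build the JSON body for performDatabaseReset from a token argument.
# build_reset_payload is a hypothetical helper name; the token is assumed
# to contain no characters that need JSON escaping (true for the
# UUID-style tokens shown above).
build_reset_payload() {
  printf '{ "action" : "performDatabaseReset", "token" : "%s" }' "$1"
}

# Hypothetical usage; NEPTUNE_ENDPOINT and token are placeholders.
# curl -s -X POST -H 'Content-Type: application/json' \
#   "https://${NEPTUNE_ENDPOINT}:8182/system" \
#   -d "$(build_reset_payload "$token")"
```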

Keep in mind that the token is valid for 60 minutes. If the token has expired or does not match the one Neptune issued, the request returns an error response such as:

{"code":"InvalidParameterException","requestId":"4cb9c101-07bc-4317-d897-187978fbc270",
"detailedMessage":"System command parameter 'token': '4cb9c101-07bc-4317-d897-187978fbc270' does not match database reset token"}
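
A script can tell the two outcomes apart by inspecting the response body. The sketch below assumes responses look like the two examples above; the function name classify_reset_response and its return strings are illustrative, not part of Neptune.

```shell
# Classify a performDatabaseReset response body.
# Prints "ok", "token_rejected", or "unknown".
# classify_reset_response is a hypothetical helper name.
classify_reset_response() {
  case "$1" in
    *'"code"'*InvalidParameterException*) echo "token_rejected" ;;
    *'"status"'*'200 OK'*)                echo "ok" ;;
    *)                                    echo "unknown" ;;
  esac
}
```

On "token_rejected", a reasonable next step is to request a fresh token with initiateDatabaseReset and try again.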

Workbench Magic Commands

You can also initiate a database reset through the Neptune Workbench using the two-step process:

%db_reset --generate-token

This command generates a time-sensitive token:

%db_reset --generate-token

The response will be:

{
"status" : "200 OK",
"payload" : {
"token" : "ef478d76-d9da-4d94-8ff1-08d9d4863aa5"
}
}

%db_reset --token

To trigger the reset, use the token from the previous command as input:

%db_reset --token ef478d76-d9da-4d94-8ff1-08d9d4863aa5

The response will confirm the action:

{
"status" : "200 OK"
}

How Neptune Performs a Database Reset

When executing a database reset, Amazon Neptune follows a series of steps to ensure a successful outcome:

  1. The cluster marks the reset status in the database.
  2. A JSON response with a 200 status is sent to the client.
  3. The cluster halts new incoming requests.
  4. The cluster attempts to cancel any queued queries.
  5. The cluster restarts.
  6. The cluster drops the existing database and recreates a blank one.
  7. The cluster is ready to accept incoming read/write requests.

This operation typically takes about 60 to 90 seconds before the server is prepared to handle new requests. It’s advisable for client applications to gracefully manage connection interruptions and reconnect as needed before proceeding with read or write operations.
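
One way to handle the restart in a script is to poll until the instance responds again before sending any queries. The sketch below is a generic retry loop; the function name retry_until_ready, the attempt count, and the delay are illustrative assumptions, and the usage example assumes the instance status endpoint at /status is reachable from where the script runs.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until_ready MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# retry_until_ready is a hypothetical helper name.
retry_until_ready() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Wait only if another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Hypothetical usage: poll every 5 seconds, up to 30 times.
# NEPTUNE_ENDPOINT is a placeholder for your writer endpoint.
# retry_until_ready 30 5 curl -sf "https://${NEPTUNE_ENDPOINT}:8182/status"
```

With a 5-second delay and 30 attempts, the loop covers roughly two and a half minutes, which leaves headroom over the 60 to 90 seconds mentioned above.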

In conclusion, the newly introduced database reset feature allows for swift deletion of all data from your graph. This streamlines the development of graph applications, enabling developers to utilize REST APIs or built-in commands via Neptune Workbench for quick data removal. You can avoid the hassle of adjusting query timeouts or creating complex, multi-threaded queries for deleting edges and vertices.
