Amazon Neptune is a fully managed graph database service designed to simplify the development and operation of applications that handle highly interconnected datasets. With Neptune, users can leverage popular graph query languages to execute powerful queries that are straightforward to write and efficient for connected data. It is ideal for various graph use cases, including recommendation systems, fraud detection, knowledge graphs, drug discovery, and network security.
Neptune has always been a managed solution, taking care of labor-intensive tasks such as provisioning, patching, backup, recovery, failure detection, and repair. However, optimizing database capacity for both cost and performance requires continuous monitoring and reconfiguration as workload characteristics change. Many applications have variable or unpredictable workloads, where the volume and complexity of database queries can shift dramatically. For instance, a knowledge graph application for social media might see a sudden surge in queries when a topic unexpectedly becomes popular.
Introducing Amazon Neptune Serverless
Today, we’re simplifying that process with the introduction of Amazon Neptune Serverless. This new offering automatically scales to meet the demands of your queries and workloads, adjusting capacity in fine-grained increments to provide the precise database resources your application requires. This way, you only pay for the capacity you actually use. Neptune Serverless is suitable for development, testing, and production workloads, allowing you to optimize database costs compared to provisioning for peak capacity.
With Neptune Serverless, you can swiftly and economically deploy graph databases for your modern applications. You might start with a small graph, and as your workload expands, Neptune Serverless will automatically and seamlessly scale your databases to deliver the performance you need. There’s no need to manage database capacity manually, and you can operate your graph applications without worrying about incurring higher costs from over-provisioning or dealing with insufficient capacity from under-provisioning.
Neptune Serverless allows you to continue using the same query languages (Apache TinkerPop Gremlin, openCypher, and RDF/SPARQL) and features (such as snapshots, streams, high availability, and database cloning) already offered in Neptune.
Getting Started with Amazon Neptune Serverless
To create a new Amazon Neptune Serverless database, I navigate to the Neptune console, select Databases in the navigation pane, and click on Create database. For Engine type, I choose Serverless and designate my-database as the DB cluster identifier.
Next, I configure the capacity range in Neptune capacity units (NCUs) that Neptune Serverless can utilize based on my workload. The capacity settings can range from 1 to 128 NCUs in increments of 0.5. I can also select a template to streamline some of the options. Choosing the Production template creates a read replica in a different Availability Zone by default, while the Development and Testing template focuses on cost optimization by not including a read replica and offering access to burstable DB instances.
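The same capacity configuration can be expressed programmatically. The sketch below builds the parameters for the Neptune `CreateDBCluster` API call (the `ServerlessV2ScalingConfiguration` parameter carries the min/max NCUs); the actual boto3 call is left commented out because it requires AWS credentials, and the identifier simply mirrors the console walkthrough above.

```python
# Parameters for creating the same serverless cluster via the Neptune API.
# The boto3 call itself is commented out: running it requires AWS credentials
# and will create (billable) resources.
kwargs = {
    "DBClusterIdentifier": "my-database",
    "Engine": "neptune",
    "ServerlessV2ScalingConfiguration": {
        "MinCapacity": 1.0,    # NCUs; configurable from 1 to 128 in 0.5 steps
        "MaxCapacity": 128.0,
    },
}

# import boto3
# neptune = boto3.client("neptune")
# neptune.create_db_cluster(**kwargs)

print(kwargs["ServerlessV2ScalingConfiguration"])
```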
I opt for my default VPC and its default security group for connectivity and then click Create database. Within minutes, my database is ready for use. In the database list, I select the DB identifier to obtain the Writer and Reader endpoints for later access.
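The Writer and Reader endpoints are plain HTTPS endpoints on port 8182, so a request can be assembled in any language. As a sketch, this small helper builds the same URL and form-encoded body that the `curl` commands below send; the endpoint name here is a placeholder, and actually sending the request would additionally require network access to the cluster's VPC.

```python
from urllib.parse import urlencode

def opencypher_request(endpoint: str, query: str) -> tuple[str, str]:
    """Build the URL and form-encoded body for a Neptune openCypher HTTP call.

    Equivalent to: curl https://<endpoint>:8182/openCypher -d "query=..."
    """
    url = f"https://{endpoint}:8182/openCypher"
    body = urlencode({"query": query})  # percent-encodes the query string
    return url, body

# Placeholder endpoint; substitute your cluster's Writer or Reader endpoint.
url, body = opencypher_request("my-writer-endpoint", "MATCH (n) RETURN count(n);")
print(url)
print(body)
```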
Using Amazon Neptune Serverless
Using Neptune Serverless is no different than working with a provisioned Neptune database. I can utilize any of the supported query languages. For this example, I’ll use openCypher, a declarative query language for property graphs originally developed by Neo4j and open-sourced in 2015.
To connect to the database, I launch an Amazon Linux EC2 instance in the same AWS Region and associate it with the default security group (so it can reach the database) and an additional security group for SSH access.
With a property graph, I can illustrate connected data. In this case, I plan to create a simple graph depicting how certain AWS services fit within a service category and demonstrate common enterprise integration patterns.
I use curl to call the Writer openCypher HTTPS endpoint and create several nodes representing patterns, services, and service categories. The command is split across multiple lines for readability.
curl https://<my-writer-endpoint>:8182/openCypher \
-d "query=CREATE (mq:Pattern {name: 'Message Queue'}),
(pubSub:Pattern {name: 'Pub/Sub'}),
(eventBus:Pattern {name: 'Event Bus'}),
(workflow:Pattern {name: 'WorkFlow'}),
(applicationIntegration:ServiceCategory {name: 'Application Integration'}),
(sqs:Service {name: 'Amazon SQS'}), (sns:Service {name: 'Amazon SNS'}),
(eventBridge:Service {name: 'Amazon EventBridge'}), (stepFunctions:Service {name: 'AWS StepFunctions'}),
(sqs)-[:IMPLEMENT]->(mq), (sns)-[:IMPLEMENT]->(pubSub),
(eventBridge)-[:IMPLEMENT]->(eventBus),
(stepFunctions)-[:IMPLEMENT]->(workflow),
(applicationIntegration)-[:CONTAIN]->(sqs),
(applicationIntegration)-[:CONTAIN]->(sns),
(applicationIntegration)-[:CONTAIN]->(eventBridge),
(applicationIntegration)-[:CONTAIN]->(stepFunctions);"
The previous command creates a graph in which the types (like Service or Pattern) and properties (such as name) are stored inside each node, while directed relationships (like CONTAIN or IMPLEMENT) connect the nodes.
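One way to picture the data the CREATE statement produces is as a set of labeled nodes plus a list of directed edges. This plain-Python mirror of the graph (names and labels copied from the openCypher statement above; it does not talk to Neptune) shows the structure and previews the kind of traversal the queries below perform:

```python
# Labeled nodes from the CREATE statement: name -> type.
nodes = {
    "Amazon SQS": "Service", "Amazon SNS": "Service",
    "Amazon EventBridge": "Service", "AWS StepFunctions": "Service",
    "Message Queue": "Pattern", "Pub/Sub": "Pattern",
    "Event Bus": "Pattern", "WorkFlow": "Pattern",
    "Application Integration": "ServiceCategory",
}
# Directed relationships: (source, relationship, target).
edges = [
    ("Amazon SQS", "IMPLEMENT", "Message Queue"),
    ("Amazon SNS", "IMPLEMENT", "Pub/Sub"),
    ("Amazon EventBridge", "IMPLEMENT", "Event Bus"),
    ("AWS StepFunctions", "IMPLEMENT", "WorkFlow"),
    ("Application Integration", "CONTAIN", "Amazon SQS"),
    ("Application Integration", "CONTAIN", "Amazon SNS"),
    ("Application Integration", "CONTAIN", "Amazon EventBridge"),
    ("Application Integration", "CONTAIN", "AWS StepFunctions"),
]

# Which service implements the 'Message Queue' pattern?
impl = [s for s, rel, t in edges if rel == "IMPLEMENT" and t == "Message Queue"]
print(impl)  # ['Amazon SQS']
```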
Now, I can query the database for insights, using either the Writer or the Reader endpoint. First, I want to find the service that implements the “Message Queue” pattern. Note how openCypher reads much like SQL: MATCH describes the pattern of nodes and relationships to find, and RETURN plays the role of SELECT.
curl https://<my-endpoint>:8182/openCypher \
-d "query=MATCH (s:Service)-[:IMPLEMENT]->(p:Pattern {name: 'Message Queue'}) RETURN s.name;"
{
"results" : [ {
"s.name" : "Amazon SQS"
} ]
}
Next, I query to find how many services belong to the “Application Integration” category, using the WHERE clause to filter results.
curl https://<my-endpoint>:8182/openCypher \
-d "query=MATCH (c:ServiceCategory)-[:CONTAIN]->(s:Service) WHERE c.name='Application Integration' RETURN count(s);"
{
"results" : [ {
"count(s)" : 4
} ]
}
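The endpoint returns plain JSON, so results are easy to consume from application code. As a minimal sketch, this parses the count response shown above with the standard library:

```python
import json

# Sample response body, copied verbatim from the count query above.
response_body = '{ "results" : [ { "count(s)" : 4 } ] }'

data = json.loads(response_body)
count = data["results"][0]["count(s)"]  # column names mirror the RETURN clause
print(count)  # 4
```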
With this graph database operational, I have numerous options for expanding it by adding more data (services, categories, patterns) and creating additional relationships between nodes. I can concentrate on my application while Neptune Serverless manages capacity and infrastructure for me.
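As a sketch of how such growth might be scripted, the hypothetical helper below (not part of any Neptune SDK) builds an openCypher statement that adds a new Service node and links it to an existing Pattern and ServiceCategory. In real code, take care with string interpolation into queries; values should be validated or parameterized.

```python
def add_service_query(service: str, pattern: str, category: str) -> str:
    """Build an openCypher statement that creates a Service node and links it
    to an existing Pattern and ServiceCategory (matched by name).

    Hypothetical helper for illustration; inputs are interpolated directly,
    so only use it with trusted values.
    """
    return (
        f"MATCH (p:Pattern {{name: '{pattern}'}}), "
        f"(c:ServiceCategory {{name: '{category}'}}) "
        f"CREATE (s:Service {{name: '{service}'}}), "
        f"(s)-[:IMPLEMENT]->(p), (c)-[:CONTAIN]->(s);"
    )

q = add_service_query("Amazon MQ", "Message Queue", "Application Integration")
print(q)
```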
Availability and Pricing
Amazon Neptune Serverless is now available in several AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Tokyo), and Europe (Ireland, London).
With Neptune Serverless, you pay solely for what you use. The database capacity is dynamically adjusted to ensure you have the appropriate level of resources based on Neptune capacity units (NCUs). Each NCU consists of approximately 2 gibibytes (GiB) of memory, along with corresponding CPU and networking resources.
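Because billing follows NCU-hours, a quick back-of-the-envelope estimate is straightforward. The price per NCU-hour below is a made-up placeholder (check the Neptune pricing page for your Region's actual rate); the workload shape is likewise assumed for illustration:

```python
# Hypothetical rate, NOT an official price; see the Neptune pricing page.
NCU_HOUR_PRICE = 0.16  # USD per NCU-hour (assumed)

# Suppose a workload runs at 2 NCUs for 20 hours/day and idles at the
# 1-NCU minimum for the remaining 4 hours.
ncu_hours_per_day = 2 * 20 + 1 * 4   # = 44 NCU-hours
daily_cost = ncu_hours_per_day * NCU_HOUR_PRICE

print(ncu_hours_per_day, round(daily_cost, 2))
```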