In today’s digital landscape, the last mile represents the crucial physical touchpoint a business has with its customers, offering a significant chance to stand out in the competitive market. As e-commerce expands, consumer expectations are evolving rapidly. Shoppers across the globe are increasingly seeking enhanced last-mile experiences, including same-day delivery, shorter delivery timeframes, and real-time shipment tracking. Consequently, both established companies and emerging startups in the parcel delivery sector are redefining their last mile offerings, setting higher benchmarks for the entire delivery market.
However, the last mile is costly; ineffective routing and unsuccessful delivery attempts can significantly inflate expenses. Last mile costs can constitute up to 50% of total fulfillment expenses, which also encompass pickup, line-haul, and sorting. Failed deliveries can more than double these costs, because a redelivery attempt is needed whenever a customer is unavailable. Four primary challenges drive these expenses: (1) fluctuations in customer demand lead to varying delivery volumes, resulting in low parcel densities and elevated costs, especially for providers operating within fixed delivery zones; (2) existing routing tools often prioritize the shortest route but struggle to balance travel time, costs, and adherence to time constraints; (3) standard routing software typically cannot reroute once the delivery vehicle has left the sorting center; and (4) many businesses still rely on spreadsheets for last mile planning rather than automating the entire process.
In response to evolving consumer demands and the need to maximize the utilization of drivers and vehicles to lower delivery costs, companies are increasingly turning to technology for last mile optimization. In this blog post, we present the AWS Dynamic Delivery Planner (DDP), a last mile routing solution that gives operators faster delivery options, enhanced reliability, reduced costs, and increased flexibility.
Dynamic Delivery Planner Overview
Inspired by Amazon’s advancements in last mile logistics, AWS has developed a Last Mile Routing solution designed for its customers, integrating machine learning (ML) into the last mile process. DDP offers users optimal route sequencing, delivery time windows, and real-time routing capabilities. We utilized reinforcement learning (RL) and graph neural networks (GNNs) on real-world data from the Amazon Last Mile Routing Research Challenge, available publicly at Open Data on AWS. The datasets include package details, destination locations, parcel specifications, preferred customer timeframes, expected service durations, and zone identifiers.
DDP excels in real-time rerouting, adapting to ever-changing traffic conditions and customer schedules while drivers are en route. By implementing two online policy improvement strategies to adhere to time window constraints, we have advanced existing state-of-the-art research on the traveling salesperson problem (TSP) using policy gradient-based RL algorithms. DDP learns to select optimal delivery routes based on historical data or determine the most effective route order within a specified delivery timeframe. This can involve identifying the most efficient delivery path, consolidating multiple routes into a single delivery attempt, or planning several delivery attempts using an optimal order combination.
Utilizing Amazon SageMaker, AWS’s machine learning service, we train and refine the reinforcement learning model across multiple GPUs and instances through SageMaker’s data parallelism library. The training employs the REINFORCE policy gradient method, with model inference conducted via Amazon SageMaker’s real-time endpoint or through batch transform jobs.
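To make the REINFORCE idea concrete, here is a minimal Python sketch. It is not DDP's model: instead of the GNN-based policy described above, it assumes a hypothetical one-parameter policy (`theta`) that picks the next stop with probability proportional to exp(-theta * distance). The function names `sample_tour` and `train`, the batch-mean baseline, and every hyperparameter are illustrative assumptions.

```python
import math
import random

def sample_tour(points, theta, rng):
    """Sample one tour from the toy policy; return (length, d log p / d theta)."""
    order, unvisited = [0], list(range(1, len(points)))
    length, grad_logp = 0.0, 0.0
    while unvisited:
        here = points[order[-1]]
        dists = [math.dist(here, points[j]) for j in unvisited]
        d_min = min(dists)  # shift distances for a numerically stable softmax
        weights = [math.exp(-theta * (d - d_min)) for d in dists]
        total = sum(weights)
        probs = [w / total for w in weights]
        k = rng.choices(range(len(unvisited)), weights=probs)[0]
        # For p_k proportional to exp(-theta * d_k):
        # d log p_k / d theta = E_p[d] - d_k
        grad_logp += sum(p * d for p, d in zip(probs, dists)) - dists[k]
        length += dists[k]
        order.append(unvisited.pop(k))
    length += math.dist(points[order[-1]], points[0])  # return to the start
    return length, grad_logp

def train(points, iterations=300, batch=32, lr=0.2, seed=0):
    """Toy REINFORCE loop (assumed hyperparameters, not taken from DDP)."""
    rng = random.Random(seed)
    theta = 0.0  # theta = 0 is a uniformly random policy
    for _ in range(iterations):
        samples = [sample_tour(points, theta, rng) for _ in range(batch)]
        baseline = sum(length for length, _ in samples) / batch
        # REINFORCE estimate of d E[length] / d theta with a batch-mean baseline
        grad = sum((length - baseline) * g for length, g in samples) / batch
        theta -= lr * grad  # descend, because we minimize expected tour length
    return theta

point_rng = random.Random(1)
points = [(point_rng.random(), point_rng.random()) for _ in range(8)]
learned_theta = train(points)
print("learned theta:", learned_theta)
```

A positive `learned_theta` means the policy has learned to prefer nearby stops. A production policy would replace the single parameter with a neural network and compute the same gradient through automatic differentiation.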
As an illustration of DDP’s capabilities, the accompanying map displays a route sequence with 220 drop-off points generated by DDP based on the Amazon Last Mile Routing Research Challenge dataset. Blue dots indicate stops, while the arcs connecting them represent the travel sequence. Each arc begins in green and concludes in red. The model executes in approximately 10 seconds under time window constraints and less than 1 second without them.
Unique Features of Dynamic Delivery Planner
The TSP is a well-researched combinatorial optimization problem with numerous applications in supply chain management (such as routing and scheduling). It seeks the shortest route that visits each city in a set exactly once and returns to the start. Because the problem is NP-hard, many approximation algorithms and heuristics have emerged to find near-optimal solutions. DDP advances conventional TSP solutions by pursuing three additional objectives: (1) to create a route sequence that is both cost-efficient and favorable for the driver; (2) to satisfy time window constraints efficiently; and (3) to enable district optimization to decrease total travel time and enhance delivery efficiency. To achieve these objectives, we have incorporated unique features into DDP.
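For readers new to the TSP, the following Python sketch shows the problem in miniature. It uses Euclidean distance between points as a stand-in for travel time. The function names (`tour_length`, `brute_force_tsp`, `nearest_neighbor_tsp`) are illustrative and not part of DDP.

```python
import itertools
import math

def tour_length(points, order):
    """Length of the closed tour that visits `points` in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def brute_force_tsp(points):
    """Exact optimum by enumerating every tour from node 0 (tiny N only)."""
    rest = range(1, len(points))
    best = min(
        itertools.permutations(rest),
        key=lambda perm: tour_length(points, (0,) + perm),
    )
    return [0, *best]

def nearest_neighbor_tsp(points):
    """Greedy heuristic: always move to the closest unvisited point."""
    order, unvisited = [0], set(range(1, len(points)))
    while unvisited:
        here = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(here, points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

points = [(0, 0), (0, 1), (1, 1), (1, 0)]  # corners of a unit square
print(tour_length(points, brute_force_tsp(points)))       # 4.0
print(tour_length(points, nearest_neighbor_tsp(points)))  # 4.0
```

Exhaustive enumeration examines (N - 1)! tours, which is why exact search stops being practical well before the route sizes discussed in this post and heuristic or learned methods take over.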
Cost-Optimal Routing and Instant Re-Routing
Our analysis of both simulated and actual Amazon routes indicates that DDP successfully balances optimality, feasibility, and execution time, particularly when the number of drop-off points is substantial (e.g., N ≥ 150).
- Optimality: This refers to the travel time during delivery. DDP achieves better optimality than traditional TSP solvers on TSP with time windows (TSP-TW) problems within a fixed execution time.
- Feasibility: This metric measures the percentage of nodes that comply with their time window constraints. DDP outperforms traditional TSP solvers in meeting these constraints.
- Time: This is the execution time of the solver. For instance, DDP can solve a fully constrained, 100-node TSP-TW problem (i.e., with 99 narrow time windows) in under 1 second on a single NVIDIA Tesla T4 GPU on an Amazon EC2 G4dn instance, while achieving around 26 percent feasibility and a 12 percent optimality gap. In contrast, a baseline conventional TSP-TW solver reaches only 3 percent feasibility and a 604 percent optimality gap in the same scenario with a 60-second execution limit.
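The first two metrics can be sketched in a few lines of Python. The exact definitions used in the evaluation are not given here, so this sketch makes several assumptions: the clock starts at 0 at the depot, a driver who arrives early waits until the window opens, feasibility counts only stops that have a time window, and the optimality gap is the relative excess over a reference travel time. The names `evaluate_route` and `optimality_gap` are hypothetical.

```python
def evaluate_route(order, travel_time, windows, service_time=0.0):
    """Walk a route and return (total_travel_time, feasibility_percent).

    order:       node indices in visiting order, starting at the depot
    travel_time: travel_time[i][j] is the time to go from node i to node j
    windows:     {node: (earliest, latest)}; other nodes are unconstrained
    """
    clock = 0.0
    travelled = 0.0
    on_time = 0
    constrained = 0
    for prev, node in zip(order, order[1:]):
        leg = travel_time[prev][node]
        travelled += leg
        clock += leg
        if node in windows:
            constrained += 1
            earliest, latest = windows[node]
            clock = max(clock, earliest)  # assumption: early arrivals wait
            if clock <= latest:
                on_time += 1
        clock += service_time
    feasibility = 100.0 * on_time / constrained if constrained else 100.0
    return travelled, feasibility

def optimality_gap(value, reference):
    """Relative excess of `value` over `reference`, in percent (assumed definition)."""
    return 100.0 * (value - reference) / reference

travel_time = [[0, 10, 20], [10, 0, 10], [20, 10, 0]]
windows = {1: (0, 15), 2: (0, 15)}
print(evaluate_route([0, 1, 2], travel_time, windows))  # (20.0, 50.0)
print(evaluate_route([0, 2, 1], travel_time, windows))  # (30.0, 0.0)
print(optimality_gap(30.0, 20.0))                       # 50.0
```

Under this assumed definition, a 12 percent gap means a route takes 1.12 times the reference travel time, and a 604 percent gap means it takes about 7 times the reference.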
We conducted 160 evaluations of DDP on various TSP instances, using an existing open-source TSP solver as a benchmark. The results show that as the number of stops increases, the baseline solver struggles to maintain a workable level of optimality and feasibility. In contrast, both DDP configurations (DDP-greedy and DDP-rollout-pi) performed well across all three evaluation metrics. At 150 nodes, DDP-greedy was not only the fastest option but also achieved the highest feasibility, comparable to DDP-rollout-pi but with greater stability (a smaller error margin). DDP-rollout-pi, however, showed an edge in optimality due to its refined value function approximation during rollout.
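The general difference between greedy and rollout decoding can be illustrated with a small Python sketch. This is not the DDP implementation: a nearest-neighbor rule stands in for the trained policy, Euclidean distance stands in for travel time, and time windows are ignored. The names `greedy_decode` and `rollout_decode` are hypothetical.

```python
import math
import random

def tour_length(points, order):
    """Length of the closed tour that visits `points` in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def greedy_decode(points, prefix=(0,)):
    """Base policy (stand-in): extend `prefix` by always taking the closest unvisited node."""
    order = list(prefix)
    unvisited = [j for j in range(len(points)) if j not in order]
    while unvisited:
        here = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(here, points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

def rollout_decode(points):
    """One-step lookahead: try each next node, finish with the base policy, keep the best."""
    order = [0]
    while len(order) < len(points):
        candidates = [j for j in range(len(points)) if j not in order]
        best = min(
            candidates,
            key=lambda j: tour_length(points, greedy_decode(points, order + [j])),
        )
        order.append(best)
    return order

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(12)]
greedy_tour = greedy_decode(points)
rollout_tour = rollout_decode(points)
print("greedy :", tour_length(points, greedy_tour))
print("rollout:", tour_length(points, rollout_tour))
```

With a deterministic base policy like this one, the rollout tour is never longer than the greedy tour, but every step needs one base-policy completion per remaining candidate. That cost is consistent with the greedy configuration being the fastest in the evaluation above.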