Over the past decade, various companies have made significant strides in developing autonomous vehicle (AV) systems utilizing deep neural networks (DNNs). These advancements have transitioned from basic rule-based systems to sophisticated Advanced Driver Assistance Systems (ADAS) and fully autonomous vehicles. The training of these systems demands vast amounts of data and considerable computational resources, often requiring petabytes of information and thousands of virtual CPUs (vCPUs) and GPUs.
This article discusses different development strategies, the essential functional components of ADAS, and the design methodologies for constructing a modular pipeline while addressing the challenges associated with building an ADAS framework.
DNN Training Methods and Design
AV systems primarily rely on deep neural networks. There are two principal methodologies for designing an AV system, differentiated by the training and architectural boundaries of the DNNs:
- Modular Training – This approach divides the system into discrete functional units, such as perception, localization, prediction, and planning. Many AV system vendors adopt this design paradigm, allowing for the independent construction and training of each module.
- End-to-End Training – This method involves training a single DNN model that accepts raw sensor data and produces driving commands. This monolithic design is predominantly explored by researchers and typically employs reinforcement learning (RL) based on a reward/penalty system or imitation learning (IL) through observation of human drivers. While the overall design is straightforward, diagnosing issues within the monolithic structure can be challenging. However, annotations are often less expensive since the system learns from data derived from human behavior.
Additionally, researchers are investigating a hybrid approach that involves training two distinct DNNs linked by an intermediate representation.
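To make the distinction concrete, here is a minimal PyTorch sketch of the end-to-end style: a single network maps raw camera frames directly to driving commands, and an imitation-learning step regresses those commands toward a human driver's. The architecture, tensor shapes, and the `imitation_step` helper are illustrative assumptions for this example, not any vendor's design.

```python
import torch
import torch.nn as nn

class EndToEndPolicy(nn.Module):
    """Monolithic model: raw camera frames in, driving commands out."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)   # outputs: [steering, throttle]

    def forward(self, frames):         # frames: (B, 3, H, W), values in [0, 1]
        return self.head(self.backbone(frames))

def imitation_step(policy, optimizer, frames, human_commands):
    """One imitation-learning step: regress toward the human driver's commands."""
    loss = nn.functional.mse_loss(policy(frames), human_commands)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A modular pipeline would instead break the single `forward` pass into explicit perception, prediction, and planning stages, each trained and validated against its own labeled outputs, which is part of what makes failures easier to localize.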
This article focuses on functions derived from a modular pipeline approach.
Automation Levels
The SAE International J3016 standard delineates six levels of driving automation, ranging from Level 0 (no automation) to Level 5 (full driving automation), as outlined in the following table:
| Level | Name | Who Drives |
|---|---|---|
| 0 | No Driving Automation | Human drives |
| 1 | Driver Assistance | Human drives |
| 2 | Partial Driving Automation | Human drives |
| 3 | Conditional Driving Automation | System drives, with human as fallback |
| 4 | High Driving Automation | System drives |
| 5 | Full Driving Automation | System drives |
Modular Functions
The following diagram illustrates an overview of a modular function design.
At the higher levels of automation (Level 2 and beyond), the AV system performs a variety of functions; minimal code sketches illustrating several of them follow this list:
- Data Collection – The AV system captures real-time, high-precision information about its environment through multiple onboard devices whose functions can vary and overlap significantly. The AV space is still evolving, and there is currently no consensus on the types of sensors and devices used. In addition to the commonly listed devices, vehicles may also integrate GPS for navigation and use maps and Inertial Measurement Units (IMUs) to measure linear and angular acceleration. Depending on the ADAS system, you may encounter a combination of the following devices:
- Cameras – Visual devices akin to human perception, offering high resolution but struggling with depth estimation and extreme weather conditions.
- LiDAR – High-cost devices that generate a 3D point cloud of the surroundings, enabling accurate depth and speed estimation.
- Ultrasonics – Compact, cost-effective sensors that are effective at short ranges.
- Radar – Effective over long and short distances, performing well under low visibility and harsh weather.
- Data Fusion – The AV system synthesizes signals from various devices, each with its limitations, to create a comprehensive perception of the environment. This integrated dataset is then utilized to train the DNN.
- Perception – AV systems interpret the raw data obtained from the sensors to gather information about the vehicle’s surroundings, identifying obstacles, traffic signs, and other entities. This process, known as road scene perception, encompasses object detection and classification.
- Localization and Mapping – To ensure safe operation, AV systems must ascertain the positions of detected objects. They create a 3D map that tracks both the ego vehicle and its surroundings, predicting the motion of detected moving objects.
- Prediction – Leveraging data from other modules, AV systems forecast changes in the environment. The DNN predicts the ego vehicle’s position and potential interactions with surrounding objects, identifying possible traffic violations and collisions.
- Path Planning – This function outlines potential routes based on inputs from perception, localization, and prediction. The AV system considers localization data, maps, GPS inputs, and predictions to determine the optimal route, prioritizing driver comfort and safety.
- Control and Execution – This module executes the planned route by controlling acceleration, deceleration, and steering, aiming to adhere to the planned trajectory.
- Training Pipeline – The DNNs that inform vehicle predictions require extensive training, typically conducted offline with data collected from the vehicles. Training demands substantial computational resources over prolonged periods, with the volume of data and compute varying by model architecture and AV provider. Providers commonly use labeled data that is partially annotated by humans and partially auto-labeled, with measures taken to anonymize personally identifiable information (PII). Many also augment labeled data with simulation, which makes it easier to generate data for specific scenarios.
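To ground the Data Fusion item above, the sketch below fuses noisy position measurements from a radar and a camera with a linear Kalman filter, a common classical baseline. The constant-velocity state layout and the noise covariances are assumptions chosen for illustration.

```python
import numpy as np

# State: [x, y, vx, vy]; constant-velocity motion model over a time step dt.
dt = 0.1
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # both sensors report (x, y)
Q = np.eye(4) * 0.01                         # process noise (assumed)
R_radar  = np.eye(2) * 0.50                  # radar: noisier position estimate
R_camera = np.eye(2) * 0.10                  # camera: sharper position estimate

def predict(x, P):
    """Propagate the state and its covariance one step forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, R):
    """Correct the state with one sensor measurement z and its noise R."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Fuse one radar and one camera detection of the same object.
x, P = np.zeros(4), np.eye(4)
x, P = predict(x, P)
x, P = update(x, P, np.array([10.2, 3.9]), R_radar)
x, P = update(x, P, np.array([10.0, 4.1]), R_camera)
```

Running the update step once per sensor lets the filter weight the camera more heavily where it is precise while still benefiting from radar in low visibility.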
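For the Perception item, one way to prototype the object detection stage is with an off-the-shelf detector; the sketch below uses a pretrained Faster R-CNN from torchvision. Production perception stacks use purpose-built, multi-sensor networks, and the confidence threshold here is an arbitrary assumption.

```python
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights,
)

# Pretrained detector as a stand-in for a purpose-built perception network.
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

frame = torch.rand(3, 480, 640)              # placeholder for a camera frame
with torch.no_grad():
    detections = model([frame])[0]           # dict with boxes, labels, scores

for box, label, score in zip(
    detections["boxes"], detections["labels"], detections["scores"]
):
    if score > 0.8:                          # confidence threshold (assumed)
        print(f"{weights.meta['categories'][int(label)]}: "
              f"{score:.2f} at {box.tolist()}")
```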
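For Localization and Mapping, the sketch below accumulates LiDAR returns into a 2D ego-centered occupancy grid, a simplified stand-in for full 3D SLAM with loop closure. The grid resolution, extent, and the random placeholder scan are assumptions.

```python
import numpy as np

def update_occupancy(grid, points, ego_xy, resolution=0.5):
    """Accumulate LiDAR returns into an ego-centered 2D occupancy grid.

    grid: (H, W) array of hit counts; points: (N, 2) returns in world coords.
    """
    h, w = grid.shape
    # Shift into the ego frame and discretize to cell indices.
    cells = ((points - ego_xy) / resolution + np.array([h / 2, w / 2])).astype(int)
    valid = ((cells[:, 0] >= 0) & (cells[:, 0] < h) &
             (cells[:, 1] >= 0) & (cells[:, 1] < w))
    # np.add.at handles repeated hits on the same cell correctly.
    np.add.at(grid, (cells[valid, 0], cells[valid, 1]), 1.0)
    return grid

grid = np.zeros((200, 200))                         # 100 m x 100 m at 0.5 m/cell
scan = np.random.uniform(-40, 40, size=(5000, 2))   # placeholder LiDAR returns
grid = update_occupancy(grid, scan, ego_xy=np.array([0.0, 0.0]))
```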
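For the Prediction item, a constant-velocity rollout is a standard kinematic baseline before reaching for a learned model. The sketch below projects tracked objects forward and flags any whose predicted path enters the ego lane; the lane geometry, horizon, and track layout are assumed values.

```python
import numpy as np

def predict_positions(tracks, horizon=3.0, dt=0.5):
    """Roll tracked objects forward under a constant-velocity assumption.

    tracks: (N, 4) array of [x, y, vx, vy] per object.
    Returns (steps, N, 2) future (x, y) positions.
    """
    steps = int(horizon / dt)
    pos, vel = tracks[:, :2], tracks[:, 2:]
    return np.stack([pos + vel * dt * (k + 1) for k in range(steps)])

def crossing_risk(futures, ego_lane_y=0.0, half_width=1.8):
    """Flag objects whose predicted path enters the ego lane corridor."""
    in_lane = np.abs(futures[..., 1] - ego_lane_y) < half_width
    return in_lane.any(axis=0)

tracks = np.array([[12.0, 5.0, 0.0, -1.5],     # cutting in from the left
                   [30.0, 0.2, 8.0,  0.0]])    # lead vehicle already in lane
risk = crossing_risk(predict_positions(tracks))   # -> [ True  True ]
```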
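For Path Planning, one common framing is scoring candidate trajectories against competing objectives and picking the cheapest. The cost terms and weights below (obstacle proximity for safety, path curvature for comfort) are illustrative assumptions, not a specific planner's cost function.

```python
import numpy as np

def trajectory_cost(traj, obstacles, w_safety=10.0, w_comfort=1.0):
    """Score one candidate path: stay clear of obstacles, avoid sharp bends.

    traj: (T, 2) waypoints; obstacles: (M, 2) obstacle positions.
    """
    # Safety term: penalize proximity to the nearest obstacle along the path.
    dists = np.linalg.norm(traj[:, None, :] - obstacles[None, :, :], axis=-1)
    safety = np.exp(-dists.min())
    # Comfort term: penalize curvature via second differences of the path.
    comfort = np.abs(np.diff(traj, n=2, axis=0)).sum()
    return w_safety * safety + w_comfort * comfort

def plan(candidates, obstacles):
    """Return the lowest-cost candidate trajectory."""
    costs = [trajectory_cost(t, obstacles) for t in candidates]
    return candidates[int(np.argmin(costs))]
```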
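For Control and Execution, a classical pure-pursuit steering law is a widely used baseline for tracking a planned path; learned or model-predictive controllers replace it in more capable stacks. The geometry is standard; the wheelbase and lookahead distance are assumed values.

```python
import math

def pure_pursuit_steering(path, pose, wheelbase=2.7, lookahead=6.0):
    """Steering angle that arcs the vehicle toward a lookahead point on the path.

    path: list of (x, y) waypoints; pose: (x, y, heading) of the ego vehicle.
    """
    x, y, heading = pose
    # Find the first waypoint at least `lookahead` meters ahead.
    target = next(
        (p for p in path if math.hypot(p[0] - x, p[1] - y) >= lookahead),
        path[-1],
    )
    # Angle to the target, expressed in the vehicle frame.
    alpha = math.atan2(target[1] - y, target[0] - x) - heading
    # Pure-pursuit law: delta = atan(2 * L * sin(alpha) / lookahead).
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
```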
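Finally, the Training Pipeline item maps onto a conventional offline loop over logged, labeled drives. In the sketch below, random tensors stand in for the anonymized, partially auto-labeled logs, and the tiny network and hyperparameters are assumptions; real pipelines distribute this loop across large GPU fleets.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for logged drives: random frames paired with recorded commands.
frames = torch.rand(256, 3, 120, 160)
commands = torch.rand(256, 2)                     # [steering, throttle] labels
loader = DataLoader(TensorDataset(frames, commands), batch_size=32, shuffle=True)

policy = nn.Sequential(                           # tiny stand-in network
    nn.Conv2d(3, 8, 5, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

for epoch in range(3):                            # real pipelines run far longer
    for batch_frames, batch_commands in loader:
        loss = nn.functional.mse_loss(policy(batch_frames), batch_commands)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```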