Performing high-precision inference in real time is a challenging task, especially in low-visibility environments. Using the NVIDIA Jetson embedded platform, teams in the recently concluded Defense Advanced Research Projects Agency (DARPA) Subterranean (SubT) Challenge were able to detect objects of interest with high accuracy and high throughput. In this post, we present the results, systems, and challenges faced by teams in the final event of the Systems Competition.

The SubT Challenge is an international robotics competition organized and coordinated by DARPA. The competition encourages researchers to develop new ways for robots to map, navigate and search environments that present challenges such as low visibility, presence of hazards, unknown maps or poor communication infrastructure.

The SubT Challenge includes three preliminary circuit events, the Tunnel Circuit, the Urban Circuit, and the Cave Circuit (cancelled due to the COVID-19 pandemic), followed by a final comprehensive challenge course. Each circuit and the final take place in a different environment with different terrain. According to the event organizers, the competition was held in three stages, and the final event took place in September 2021 in Louisville, KY.

SubT Challenge competitors used NVIDIA technology for both their hardware and software needs. Teams used desktop and server GPUs to train models, which were then deployed on robots using the NVIDIA Jetson embedded platform to detect artifacts and objects of interest in real time, the primary criterion for determining the winning team. Five of the seven competitors also used the Jetson platform for real-time object detection.

SubT Challenge

The SubT challenge is inspired by real-world scenarios that first responders face during search and rescue operations or disaster response.

The state-of-the-art methods developed through this competition will help reduce the risk of casualties for search and rescue personnel and first responders as they explore unknown subterranean environments. Additionally, autonomous robots will assist crews in exploring the environment, finding survivors and objects of interest, and entering locations that are risky for humans.

Figure 1. The DARPA Subterranean Challenge explores innovative methods and new technologies for mapping, navigating, and searching complex subterranean environments. Image courtesy of DARPA.

Technical challenges

The competition includes a variety of technical challenges, such as unknown, unstructured, and uneven terrain that some robots may not be able to traverse easily.

These environments typically have no infrastructure for communicating with a central command station. From a perception perspective, visibility in these environments is low, yet the robots must still find artifacts and objects of interest.

Competing teams are tasked with addressing these challenges by developing novel sensor fusion methods, as well as developing new or modifying existing robotic platforms with different capabilities for locating and detecting objects of interest.


Team CERBERUS

Team CERBERUS (Collaborative Walking & Flying Robots for Autonomous Exploration in Underground Settings) is a joint consortium of several universities and industrial organizations around the world.

The team entered the competition with four quadruped robots called ANYmal, five drones built largely in-house with varying size and payload capabilities, and one rover robot in the form of a Super Mega Bot. In the competition finals, the team ended up using the four ANYmal robots and the Super Mega Bot rover for exploration and artifact detection.

Each ANYmal robot is equipped with two CPU-based computers and an NVIDIA Jetson AGX Xavier. The rover robot is equipped with an NVIDIA GTX 1070 GPU.

The CERBERUS team used an improved version of the You Only Look Once (YOLO) model for object detection. The model was trained on 40,000 labeled images using two NVIDIA RTX 3090 GPUs.

The trained model was further optimized using TensorRT before being deployed on Jetson for real-time inference. The Jetson AGX Xavier was able to run inference at a collective frequency of 20 Hz. In the finals of the competition, the CERBERUS team took the lead by discovering 23 of the 40 artifacts in the environment and won first place.
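A rate such as 20 Hz corresponds to a time budget of 50 ms per processed frame. As a minimal sketch (not the team's code), the helpers below estimate the achieved inference rate from a list of completion timestamps and convert a rate into a per-frame budget. The function names and the sample timestamps are hypothetical and serve only as illustration.

```python
def achieved_rate_hz(timestamps_s):
    """Return the mean rate (Hz) implied by increasing completion times in seconds."""
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    elapsed = timestamps_s[-1] - timestamps_s[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps delimit N - 1 intervals.
    return (len(timestamps_s) - 1) / elapsed


def frame_budget_ms(rate_hz):
    """Return the time available per frame, in milliseconds, at a given rate."""
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_hz


# Hypothetical timestamps: one inference result every 50 ms for one second.
stamps = [i * 0.05 for i in range(21)]
print(round(achieved_rate_hz(stamps), 1))
print(frame_budget_ms(20))
```

Measuring the rate from timestamps, rather than timing a single call, averages out jitter between frames.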

The CERBERUS team also used GPUs to build terrain elevation maps and to train the locomotion policy controller of the ANYmal quadruped robot. Elevation mapping ran in real time on the Jetson AGX Xavier. The locomotion policy for the ANYmal robot in rough terrain was trained offline using a desktop GPU.

Team CoSTAR

Team CoSTAR (Collaborative SubTerranean Autonomous Robots), led by researchers at NASA's Jet Propulsion Laboratory (JPL) in Southern California together with university and industry collaborators, won the 2020 competition, which focused on exploring complex underground urban environments.

They also competed in the 2021 final event, which mixed human-made and natural environments, and placed fifth. Team CoSTAR entered the competition with four Spot robots, four Husky robots, and two drones.

In the final round, due to unexpected hardware issues, the team ended up using one Spot and three Husky robots. Each robot was equipped with a CPU-based computer and an NVIDIA Jetson AGX Xavier.

For object detection, the team used RGB and thermal images. They used the medium variant of the YOLOv5 model to process high-resolution images for real-time inference. The team trained two separate models, one to perform inference on captured RGB images and one on thermal images.
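Running one model per modality implies that each incoming frame has to be sent to the matching detector. The following is a minimal sketch of such routing, not the team's implementation: the `Frame` type, the modality labels, and the stand-in detector functions are all hypothetical, and a real system would call the trained models in their place.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Frame:
    camera_id: str
    modality: str  # "rgb" or "thermal" (labels are assumptions)
    data: object   # image payload, left opaque in this sketch


Detector = Callable[[Frame], List[str]]


def make_router(detectors: Dict[str, Detector]) -> Detector:
    """Return a function that sends each frame to the detector for its modality."""
    def route(frame: Frame) -> List[str]:
        try:
            detector = detectors[frame.modality]
        except KeyError:
            raise ValueError(f"no detector for modality {frame.modality!r}") from None
        return detector(frame)
    return route


# Stand-in detectors; a real system would run the trained models here.
def rgb_detector(frame: Frame) -> List[str]:
    return [f"rgb-detection from {frame.camera_id}"]


def thermal_detector(frame: Frame) -> List[str]:
    return [f"thermal-detection from {frame.camera_id}"]


route = make_router({"rgb": rgb_detector, "thermal": thermal_detector})
print(route(Frame("cam0", "rgb", None)))
print(route(Frame("thermal0", "thermal", None)))
```

Keeping the mapping in a dictionary means another modality can be added without touching the routing logic.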

The RGB model was trained on about 54,000 labeled frames, while the thermal model was trained on about 2,400 labeled images. To train the models on their custom dataset, Team CoSTAR started from a YOLOv5 model pre-trained on the COCO dataset and used the NVIDIA Transfer Learning Toolkit (now called TAO Toolkit) for transfer learning.

The models were trained using two on-premises NVIDIA A100 GPUs and an AWS instance consisting of eight V100 GPUs. The team used TensorRT to prune the models before deploying them on Jetson AGX Xavier.

Using this setup, Team CoSTAR was able to run inference at 28 Hz on RGB images received from five RealSense cameras and images received from one thermal camera. In the final run, the robots were able to detect all 13 artifacts present in the designated area. Exploration time was limited because unexpected hardware issues at the deployment site delayed deployment.
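The 28 Hz figure can be read in more than one way. If it is an aggregate across all six camera streams (an assumption, since the text does not say), the per-stream rate follows from simple division, as in this small sketch with a hypothetical helper name:

```python
def per_stream_rate_hz(aggregate_hz, n_streams):
    """Split an aggregate inference rate evenly across camera streams."""
    if n_streams <= 0:
        raise ValueError("n_streams must be positive")
    return aggregate_hz / n_streams


# Five RealSense cameras plus one thermal camera, as described in the text.
n_streams = 5 + 1
# Assumption: 28 Hz is the total across all streams, shared evenly.
rate = per_stream_rate_hz(28.0, n_streams)
print(f"{rate:.2f} Hz per stream")
print(f"{1000.0 / rate:.0f} ms between frames on each stream")
```

If 28 Hz were instead a per-stream figure, the aggregate would be six times higher, so the interpretation matters when sizing the compute budget.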

Equipped with NVIDIA Jetson platforms and NVIDIA GPU hardware, teams competing at the DARPA SubT event were able to efficiently train models for real-time inference, addressing the challenges posed by subterranean environments and accurate object detection.

About the author

Mitesh Patel is a developer relations manager at NVIDIA, where he works with higher education researchers to execute their ideas using NVIDIA SDKs and platforms. Before joining NVIDIA, he was a Senior Research Scientist at Fuji Xerox Palo Alto Laboratories Ltd., where he worked on developing indoor localization technologies for applications such as asset tracking in hospitals and delivery truck tracking in manufacturing facilities. Mitesh received his PhD in Robotics from the Centre for Autonomous Systems (CAS) at the University of Technology Sydney, Australia, in 2014.

Reviewing Editor: Guo Ting
