AV startup Compound Eye created a new 3-D perception technology that allows autonomous vehicles and robots to perceive the world in three dimensions. CloudFactory expanded the team’s labeling capacity and cut labeling costs by 33% by labeling and classifying the 2-D and 3-D frames and images that power the technology’s software.
Consumer interest in autonomous vehicles is soaring. And AI companies within the AV industry are working feverishly to solve some of the toughest challenges of autonomous driving, including issues around sensing and perception. Think of autonomous cars that suddenly brake when confused by shadows. Or cars that swerve, to catastrophic effect, when encountering a blown tire on the freeway.
Compound Eye, a 3-D vision company based in California, has solved some of these perception problems with VIDAS, short for Visual Inertial Distributed Aperture System, a new 3-D perception technology that uses automotive-grade cameras to let robots and autonomous vehicles experience the world as humans do, in three dimensions.
Sensors like LiDAR and radar can also make the 3-D experience possible, although with limitations. “If you turn to the owner’s manual of any L2 car, you’ll find a litany of limitations for your L2 system,” says Jason Devitt, Compound Eye’s co-founder and CEO. “It won’t recognize certain types of objects. It won’t stop for things in the middle of the road. It won’t stop for a vehicle cutting into your lane. It might lose contact with the vehicle ahead of you as you go up and down hills or around steep curves,” he says.
Why all the won’ts and mights? Director of Product and Head of Marketing Tarani Duncan says the standard approach uses LiDAR returns for depth estimates and 2-D images to understand a scene. “Many annotations likely happen only in 2-D because available 3-D data from LiDAR is too sparse to annotate,” she says. Jason says this lack of information can confuse autonomous systems, which is where VIDAS comes in. “The dense 3-D information we’re bringing to the industry leads to properly designed systems that use parallax, semantic cues, and neural networks,” he says.
VIDAS delivers 10 times the resolution and range of most LiDAR sensors and consumes less power. In a single framework, the advanced perception platform combines parallax and semantic cues, which offer different and complementary information to provide accurate depth and semantic class at every pixel, all in real time.
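The idea of combining parallax and semantic cues per pixel can be illustrated with a minimal sketch. Everything here is a hypothetical toy, not Compound Eye’s actual algorithm: the baseline, focal length, class priors, and fallback rule are all assumptions chosen to show how the two cues complement each other.

```python
# Toy sketch of fusing parallax and semantic cues per pixel.
# All constants and the fallback rule are illustrative assumptions,
# not Compound Eye's actual method.

BASELINE_M = 0.3     # hypothetical camera baseline, meters
FOCAL_PX = 1000.0    # hypothetical focal length, pixels

# Hypothetical per-class prior depths (meters) used when parallax is weak.
CLASS_PRIOR_DEPTH = {"road": 20.0, "vehicle": 15.0, "sky": float("inf")}

def fuse_pixel(disparity_px: float, semantic_class: str) -> tuple[float, str]:
    """Return (depth_m, class) for one pixel.

    Uses triangulated depth when disparity is informative; otherwise
    falls back to a semantic prior. This is the sense in which the two
    cues are "different and complementary."
    """
    if disparity_px > 0.5:  # parallax is reliable: triangulate
        depth = BASELINE_M * FOCAL_PX / disparity_px
    else:                   # textureless region: lean on semantics
        depth = CLASS_PRIOR_DEPTH.get(semantic_class, 50.0)
    return depth, semantic_class

# A well-textured vehicle pixel vs. a textureless sky pixel.
print(fuse_pixel(10.0, "vehicle"))  # -> (30.0, 'vehicle'), from parallax
print(fuse_pixel(0.0, "sky"))       # -> (inf, 'sky'), from semantics
```

A real system would learn this fusion with neural networks rather than hand-coded rules, but the division of labor is the same: geometry where texture exists, semantics where it does not.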
Although Compound Eye is breaking new ground in the AV industry, the startup needed a solid labeling process and tool to build its software, train its models, and leave stealth mode. The time, effort, and data requirements for this work are enormous, as each frame and image for each camera needs detailed labeling and classification, including pixel-level segmentation.
After researching more than 200 commercially available annotation tools, the team found that most were built for sparse 3-D datasets. Instead of buying off the shelf, they decided to build a tool to power their state-of-the-art perception platform.
But even with this valuable resource, the company’s small team was still constrained by in-house capacity. And they didn’t want to spend time on tedious annotation tasks; they wanted to focus on the company’s mission of building a full 3-D perception solution using cameras. Compound Eye tried outsourcing the annotation work to other vendors, but poor quality, high costs, and restrictive tooling led the team to abandon that approach.
Then, in 2020, the team discovered CloudFactory and its experience working within client-built tools. That experience, coupled with CloudFactory’s AV expertise and customizable workflows, proved the right match for Compound Eye’s outsourced labeling and classification work.
To ensure quality and consistency, CloudFactory optimized a custom annotation approach for Compound Eye. Using Compound Eye’s tool, CloudFactory data analysts annotate dense RGB point clouds and 2-D frames. Analysts place 3-D cuboids in the point cloud to identify transient objects, like moving vehicles and pedestrians. They also assign semantic classes to each pixel in 2-D frames, including sidewalks, drivable spaces, and even off-road semantic classes, like dirt, rocks, grass, and bushes.
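The two annotation outputs described above, 3-D cuboids in the point cloud and per-pixel semantic classes in 2-D frames, can be sketched as simple data structures. The field names and class list below are assumptions for illustration, not Compound Eye’s actual annotation schema.

```python
# Illustrative data structures for the two annotation types described
# above. Field names and classes are hypothetical, not the real schema.
from dataclasses import dataclass

# Per-pixel classes mentioned in the case study, including off-road ones.
SEMANTIC_CLASSES = ["sidewalk", "drivable_space", "dirt", "rocks", "grass", "bushes"]

@dataclass
class Cuboid3D:
    """A 3-D box placed in the point cloud around a transient object."""
    label: str                           # e.g. "vehicle" or "pedestrian"
    center: tuple[float, float, float]   # x, y, z in meters
    size: tuple[float, float, float]     # length, width, height in meters
    yaw: float                           # heading angle in radians

def segmentation_mask(width: int, height: int, default: str = "drivable_space"):
    """Dense per-pixel semantic labels for one 2-D frame (row-major)."""
    return [[default for _ in range(width)] for _ in range(height)]

# One frame's annotations: sparse cuboids in 3-D plus a dense 2-D mask.
frame = {
    "cuboids": [Cuboid3D("pedestrian", (4.2, -1.0, 0.9), (0.6, 0.6, 1.8), 0.0)],
    "mask": segmentation_mask(4, 3),
}
frame["mask"][0][0] = "sidewalk"  # analysts assign a class to every pixel
```

The key contrast is density: cuboids annotate a handful of objects per frame, while the mask carries a label for every single pixel, which is what pixel-level segmentation demands.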
“Other third parties produced far less accurate results than CloudFactory labelers generate for us,” says Tarani.
In addition to image annotation work, CloudFactory also acts as an annotation advisor, recommending new software and tooling features that help analysts work as efficiently as possible and accelerate model development. Engineer Scott Wu says this consultative approach is critical to Compound Eye’s mission of creating safe autonomous solutions. “CloudFactory regularly consults on the user interface of our annotation tool, which has led to high-impact UX changes, efficiency improvements, and gains in overall model accuracy,” he says.
Today, Compound Eye is out of stealth mode. The company is working with OEMs and Tier 1 suppliers around the world, as well as with the U.S. Army.
"CloudFactory has played an integral role in advancing VIDAS, Compound Eye’s perception technology,” says Tarani. “We would not have made such critical progress without our partnership.” She also says that CloudFactory annotators continue to deliver 2-D and 3-D annotations that outperform industry-leading annotation platforms. “By partnering with CloudFactory, we generate highly accurate per-pixel semantic labels for a fraction of what other third-party providers charge,” she says, pointing out that Compound Eye has since decreased its cost per frame by 33%.
Adam Flaum, Compound Eye’s operations director, also appreciates the partnership. “Working with CloudFactory is like working with an extension of my own team,” he says. And now, instead of worrying about how to quickly and accurately annotate growing numbers of images and frames, Compound Eye focuses on what it does best: building 3-D representations of what vehicles see and teaching machines to view the world as humans do.
Although 3-D perception is complex and challenging to replicate, Compound Eye’s VIDAS technology, which brings superhuman vision to vehicles and robots, makes it possible. And with nothing else like it on the market, the company is well positioned for the coming mass production and large-scale deployment of safe, fully autonomous robots and vehicles.