For self-driving-car developers, like many iPhone and Google Photos users, the growing cost of storing files in the cloud has become a headache.
Early on, robocar companies took a brute-force approach to maximize miles and data. “We can get all the data that cars see over time, the hundreds of thousands of pedestrians, cyclists, and vehicles, [and] take from that a model of how we expect them to work,” said Chris Urmson, an early leader of Google’s self-driving project, in a 2015 TED Talk.
Urmson was speaking at a time when autonomous car prototypes were relatively few and the companies testing them could store almost every data point they captured from the road. But nearly a decade later, Google’s project and many others have fallen far short of their own timeline predictions for success. Growing fleets, better sensors, and tighter budgets are forcing companies working on robotaxi and robofreight services to be more selective about what stays on their servers.
The newfound restraint is a sign of maturity for an industry that is starting to move people and things without drivers in some cities when the weather is good and the streets are relatively clear, but has yet to make a profit. Knowing which data to keep and which to discard will be key to expanding service to more locations as companies train their technology on the nuances of new areas.
“Having tons and tons of more data is valuable to some extent,” said Andrew Chatham, who manages the computer infrastructure at Google driverless tech spinout Waymo. “But at some point, having more interesting data is important.” Rivals including Aurora, Cruise, Motional, and TuSimple are also keeping a closer eye on their data stores.
The trend could spread at a time when driverless projects face pressure to control spending after years of losses. Companies from General Motors, which owns robotaxi service Cruise, to Waymo owner Alphabet are in the midst of extensive cost cuts this year, including mass layoffs, as sales in core businesses slow due to a shaky economy. Meanwhile, cheap and easy funding is drying up for autonomous car startups.
Of course, all spending is scrutinized. Amazon Web Services charges about 2 cents per gigabyte per month for its popular S3 cloud storage service, a price that can easily add up for data-intensive projects, and can double in some cases when factoring in data-transfer bandwidth costs. Intel estimated in 2016 that each autonomous car would generate 4,000 gigabytes of data per day, a volume that would cost about $350,000 to store for a year at Amazon’s current prices.
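The $350,000 figure can be roughly reproduced with back-of-envelope arithmetic. The sketch below is one plausible reading, not a confirmed derivation: it assumes the figure means holding a single car's full year of output (365 days at 4,000 gigabytes per day) in storage for 12 months at the flat 2-cent rate, with bandwidth charges excluded. The function names are illustrative, and the 365-day year and flat pricing are assumptions rather than details from Intel or Amazon.

```python
# Back-of-envelope check of the storage cost quoted above.
# Inputs taken from the article:
GB_PER_CAR_PER_DAY = 4_000    # Intel's 2016 estimate
PRICE_PER_GB_MONTH = 0.02     # approximate S3 price, in US dollars
# Assumptions made for this sketch:
DAYS_PER_YEAR = 365
MONTHS_PER_YEAR = 12


def yearly_volume_gb(gb_per_day=GB_PER_CAR_PER_DAY, days=DAYS_PER_YEAR):
    """Gigabytes one car would generate over a year."""
    return gb_per_day * days


def cost_to_hold_for_a_year(volume_gb, price=PRICE_PER_GB_MONTH,
                            months=MONTHS_PER_YEAR):
    """Dollars to keep volume_gb in storage for the given number of months."""
    return volume_gb * price * months


if __name__ == "__main__":
    volume = yearly_volume_gb()
    cost = cost_to_hold_for_a_year(volume)
    print(f"Yearly volume per car: {volume:,} GB")
    print(f"Cost to hold it for 12 months: ${cost:,.0f}")
```

Under these assumptions the total comes to about $350,400 per car, in line with the figure above. Any bandwidth charges would come on top of that.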
Chucking data goes against the grain in the tech industry. Companies like Google and Meta have long been mocked and even punished for collecting everything they can (including users’ locations, clicks, and searches) with the idea that a greater understanding of behavior leads to better-designed services. That mantra has created a culture of collecting data even without any obvious application. For example, Google CEO Sundar Pichai acknowledged in 2019 that “only a small subset of data helps serve ads.”
Self-driving-car developers initially had the same philosophy of maximizing data. They produce video from arrays of cameras inside and outside vehicles, audio recordings from microphones, point clouds that map objects in space from lidar and radar, diagnostic readings from vehicle parts, GPS readings, and more.
Some believe that the more data collected, the smarter the self-driving system will be, said Brady Wang, who studies automotive technologies at market researcher Counterpoint. But the method doesn’t always work because the volume and complexity of the data make them difficult to organize and understand, Wang said.