As self-driving cars become more advanced with a greater number of onboard computers, sensors, cameras and WiFi, the amount of data is expected to balloon, providing automakers, insurers and others with rich information to harvest.
A single autonomous car could generate as much as 100GB of data every second, said Barclays analyst Brian Johnson, in a note published Wednesday.
If extrapolated out to the entire U.S. fleet of vehicles -- 260 million in number -- autonomous cars and trucks could potentially produce about 5,800 exabytes, Johnson stated.
In other words, each day there would be enough raw data to fill 1.4 million Amazon AWS "Snowmobile" mobile data center tractor-trailer trucks, each with 100 petabytes of storage, forming a convoy 11,000 miles long.
"Even with data compression of 10,000x, that would still be a one-mile-long convoy," Johnson stated.
Big data will be "at the core of change and disruption" in the auto business, and managing massive amounts of data will require new solutions in storage and analysis, the report said.
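The convoy figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in the note (1.4 million trucks, 11,000 miles, 10,000x compression); the constant and function names are illustrative, and it assumes compression shrinks the truck count, and so the convoy length, proportionally.

```python
# Figures as reported in the article; the names are illustrative only.
TRUCKS = 1_400_000          # Snowmobile trucks filled per day
CONVOY_MILES = 11_000       # reported length of the uncompressed convoy
COMPRESSION_RATIO = 10_000  # the "10,000x" compression in the quote
FEET_PER_MILE = 5_280


def implied_truck_length_feet(convoy_miles: float, trucks: int) -> float:
    """Average road length per truck implied by the convoy figures."""
    return convoy_miles * FEET_PER_MILE / trucks


def compressed_convoy_miles(convoy_miles: float, ratio: float) -> float:
    """Convoy length if the truck count shrinks in proportion to `ratio`."""
    return convoy_miles / ratio


if __name__ == "__main__":
    per_truck = implied_truck_length_feet(CONVOY_MILES, TRUCKS)
    compressed = compressed_convoy_miles(CONVOY_MILES, COMPRESSION_RATIO)
    print(f"{per_truck:.1f} ft per truck")
    print(f"{compressed:.1f} miles after compression")
```

On these inputs the convoy works out to roughly 41.5 feet of road per truck, and the compressed convoy to about 1.1 miles, which matches the "one-mile" figure in the quote.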
Security will also be a key area of concern for autonomous car makers. A modern car has 50 to 150 electronic control units (ECUs) -- or tiny computers -- with as much as 100 million lines of code. And for every 1,000 lines there are as many as 15 bugs that are potential doors for would-be hackers, analysts say.
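Taken at face value, those two ceilings imply a large number of potential flaws per car. The following is a back-of-the-envelope sketch, not a measurement: it simply multiplies the upper-bound figures cited above, and the names are hypothetical.

```python
# Upper-bound figures cited by analysts in the article.
LINES_OF_CODE = 100_000_000  # "as much as 100 million lines of code"
BUGS_PER_1000_LINES = 15     # "as many as 15 bugs" per 1,000 lines


def estimated_bugs(lines_of_code: int, bugs_per_1000_lines: int) -> int:
    """Upper estimate of bugs: lines of code times the per-1,000-line rate."""
    return lines_of_code * bugs_per_1000_lines // 1000


if __name__ == "__main__":
    print(f"{estimated_bugs(LINES_OF_CODE, BUGS_PER_1000_LINES):,} potential bugs")
```

With these inputs the estimate is 1.5 million potential bugs per vehicle. Because both inputs are "as many as" figures, this is a ceiling rather than a typical value.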
In today's vehicles, ECUs are linked by an internal controller area network. Together with infotainment systems and a growing array of cameras and radars for advanced driver assistance systems, they are already creating vast amounts of data that automakers typically use and then discard.
"Going forward, any or all of this data can be uploaded to the cloud -- either in real time via 4G and beyond, someday through home WiFi uploads, or sporadically through service center uploads (as with the original Tesla Roadster)," the report stated.
Cars will produce data related to in-vehicle, environmental, and driver/passenger information. In-vehicle data will consist of historical data, such as vehicle fluid levels, speed and acceleration, GPS positioning and, in the event of an accident, a snapshot of data prior to the crash as well as alerts for first responders.
Driver/passenger data will include information about the use of infotainment systems, HVAC and seat preferences, and even driving styles (i.e., whether the car is driven in a "sporty" or an economical fashion).
"All of this could be recorded, uploaded and used to tailor in-car experiences," the report stated.
Environmental data will include information from LiDAR scanners, cameras and other sensors.
"The car can become a roving data gathering vacuum," Johnson said in the report. "Think of millions of Google StreetView vehicles capable of refreshing live views of every street everywhere several times a day. Not only can this data be added as layers on top of traditional HD-maps in near-real time, it can also be potentially mined for a variety of insights."
For example, video data could be used to determine how full a store parking lot is at any given time of day and what prices are advertised in a store window, according to Johnson.
"Moreover, installed cameras can displace the aftermarket dash-cam video market and record pre-crash images," Johnson wrote.
Companies most likely to capitalize on vehicle big data are automakers that are building from scratch, such as Tesla, which offers a "clean sheet in-car architecture with a solid base of data," and third-party parts suppliers, such as Delphi, which is expected to provide analytics engines for legacy and newly manufactured cars and trucks, Barclays said. Mobileye, a maker of vehicle vision chip technology (called EyeQ), also has a strong lead in the mapping and camera sensing market, the report claimed.
Intel recently acquired Mobileye for $15.3 billion to help advance an alliance between the two companies and BMW, which plans to ship self-driving cars by 2021.
Last year, Intel CEO Brian Krzanich emphasized how critical the automotive market has become to the company, and said the industry must be prepared for the deluge of data that will require an "unprecedented" level of "computing, intelligence and connectivity."
Krzanich said that deluge could amount to 4TB of data generated by a single car each day.
Intel's investment arm, Intel Capital, also plans to spend an additional $250 million over the next two years on the development of autonomous driving technology.
Intel has partnered with self-driving technology makers, such as vehicle camera company Mobileye, and carmakers such as BMW, to produce fully-autonomous vehicles by 2021.
While the oft-cited use case for automotive big data is in support of location-based marketing, Barclays believes the mountain of data will be mined for vehicle-related services, such as usage-based insurance plans and predicting required maintenance. The highest-value use case, however, would be to support level 4 and 5 fully autonomous driving through digital maps and sensor data videos, which would provide training data for autonomous algorithms.
"Indeed, within mapping, Auto Big Data could support crowd-sourced live video covering every mile of road in the world," Johnson wrote. "While currently it is not feasible to stream constant video to the cloud, in the future it would be feasible as bandwidth and storage costs drop exponentially."
The Society of Automotive Engineers International, a U.S.-based industry standards organization, has established six autonomous driving categories, ranging from Level 0, which represents no automation, to Level 5, a fully autonomous vehicle that controls all aspects of driving previously performed by humans.
The exabytes of data expected to come will reside mainly in the cloud, where data analytics and cloud storage platforms will be required to create and store usable information. Those who capitalize on automotive big data will be able to compress the data and extract key data "events," Johnson stated.
"Even in a world beyond 5G, the amount of raw data is impractical to process, so... the node and edge analytics engines in the car could become more important than the motors in vehicles," Johnson wrote.
By 2020, 75% of the world's cars will be connected to the internet via embedded Wi-Fi, and the growth of internet-connected vehicles will bring in around $2.94 billion in revenue, according to a report by Topology, a division of TrendForce market research.
In addition, autonomous or fully self-driving vehicles will enter mass production by 2020 because more major automakers have committed to those vehicles' R&D in recent years, according to Topology.
Automakers are already investing heavily in artificial intelligence technology to control self-driving cars. For example, Ford plans to spend $1 billion over the next five years on A.I. in support of the development of autonomous vehicle technology.
"A common saying in Silicon Valley... is that 'data is the new oil' -- and enthusiasm for businesses that generate and analyze data is common across the technology space," Johnson wrote. "Unlike oil, ultimately a finite and diminishing resource, data and uses of data expand exponentially."