Data INTELLIGENCE vs. Data STORAGE (Part I):
For decades, we have been fascinated by, and engaged in, collecting and storing data and then performing analytics (extracting information from data) POST STORAGE. Today we need to think super-real-time data intelligence!
In the past, the amount of data was nowhere close to what we have today, real-time analytics was not the primary concern, and analytics were performed by a small group of people, with results shared across the company much later. Today we are dealing with data volumes far higher than before (and growing at astronomical rates) AND everyone wants/needs immediate access to analytics in super-real-time. This pattern is only going to keep growing, and growing exponentially. The mechanics of storing data (putting it in a vault) remain essential, as we need to keep copies of the data, but they can and should evolve dramatically in order to meet the data requirements of the hyper-connected world.
About ten years ago, I started work in the area of Edge Computing, in particular, the Intelligent Edge. Data centers have the unique feature that they are often very close to the source of data. Hence, instead of only being a place to store data, data centers can evolve to be the place that provides critical and immediate intelligence about the data. In terms of AI, we need to move AI and analytics closer to the edge, enabling immediate analysis of, and reaction to, data. Remember a very important fact: In the past, the individuals needing analytics and information were the employees of a company (any company), and the company decided with whom, and how, to share data outside of the walls of the company. Today, with the help of billions of devices and thousands of applications, everyone wants access to data and we are dealing with a highly connected digital world. We have never before seen such quantities of data.
Solution: Today, a large percentage of analytics is performed at data centers, by companies that both store their data and run their AI there. But there is a lot more we can do. Imagine: Data centers begin to operate as the ‘edge’ closest to the source of incoming data, and run AI at the edge (Edge AI), which could also determine what data we keep and what we discard. In fact, data centers can operate as the pulse of the network rather than only the infrastructure provider of the data network. So, data centers begin to offer AI at the edge as a service in addition to the traditional services they offer. With the emergence of autonomous vehicles, drones, sensors, robots, and whatever the future digital and metaverse world becomes, AI at the edge is critical.
Two factors that contribute heavily to this are:
- Running thin layers of AI, requiring minimal ML (TinyAI, TinyML), at the edge, to process/react to incoming data. Consequence: We are training the edge to function effectively on its own instead of having to run ML with large data sets on central servers.
- Homomorphic Encryption (HE), which allows computation on data while it remains encrypted, so encryption and key security/verification can be dealt with right at the edge. Consequence: We can address privacy and security issues, including public/private keys, right at the edge instead of transporting the data from the edge to where it is stored and addressing them post-analysis. My colleagues at IBM Research are doing great work in this area.
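To make the second factor more concrete: homomorphic encryption is generally described as a way to compute on data while it stays encrypted. The sketch below is only an illustration of that idea, not the approach used by IBM Research or any production library. It assumes a toy Paillier scheme, which is additively homomorphic only (not fully homomorphic), with deliberately tiny primes that offer no real security. All names (`encrypt`, `decrypt`, `add_encrypted`, the sample readings) are hypothetical and chosen for illustration. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

# Toy Paillier parameters. These tiny primes are an assumption made for
# readability; real deployments use primes of 1024+ bits and a vetted library.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1                                            # standard simple generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # lcm(P-1, Q-1), private


def _l(x):
    """Paillier's L function: L(x) = (x - 1) / N."""
    return (x - 1) // N


# Private decryption constant: inverse of L(G^LAM mod N^2) modulo N.
MU = pow(_l(pow(G, LAM, N_SQ)), -1, N)


def encrypt(m):
    """Encrypt an integer 0 <= m < N using the public values (N, G)."""
    if not 0 <= m < N:
        raise ValueError("message out of range for this toy key")
    while True:
        r = random.randrange(1, N)      # random blinding factor
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Decrypt a ciphertext using the private values (LAM, MU)."""
    return (_l(pow(c, LAM, N_SQ)) * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


if __name__ == "__main__":
    # Hypothetical sensor readings, encrypted at the device.
    readings = [21, 34, 8]
    encrypted = [encrypt(x) for x in readings]

    # The edge node sums the readings without ever seeing them in the clear.
    total = encrypted[0]
    for c in encrypted[1:]:
        total = add_encrypted(total, c)

    # Only the holder of the private key can read the aggregate.
    print(decrypt(total))
```

In this sketch the edge node only ever touches ciphertexts and the public modulus, so the aggregate can be computed close to the source while the private key stays elsewhere. Anything beyond addition (for example, running a model on encrypted inputs) would need a richer scheme than the one assumed here.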
In my opinion, the combination of TinyAI/ML at the edge, Homomorphic Encryption at the edge, and a rethinking of the role of data centers can greatly evolve our thinking about data, from data collection to data intelligence. For the last century, we have been engaged in the very important task of ‘collecting’ and ‘storing’ data. The digital world ahead of us requires that we disrupt our mindset and be ready to evolve our thinking from storage to intelligence, and I am a huge fan of evolving data centers to be the pulse of this dramatic and critical innovation. Stay tuned for Part II as I go deeper into this topic, sharing projections of data growth and taking a look at the landscape.
Disrupt | Innovate | Lead | Imagine