#25 Data Gravitation and decentralization

Data mass is beginning to exhibit gravitational properties: it is getting heavy, and eventually it will be too big to move. Gravitation is a natural phenomenon by which all things with mass or energy, including planets, stars, galaxies, and even light, are attracted to one another. Data, in this metaphor, behaves the same way.

In the data economy, this pull concentrates data, though not necessarily in the form of data centers. The trend is supported by the emerging concept of data mesh, which offers a distributed and decentralized alternative to the centralized organizational and architectural pattern of the data lake. I see two related phenomena occurring: data gravitation and decentralization.


Data and learning are localized

Given that the data has gravitated into centers, machine learning and AI-powered agents will travel between these centers of data, learning and executing policy, carrying only the learnings with them and not copies of the data. In other words, data and learning are localized, and AI moves around making and executing decisions within the geographical jurisdiction of the governing body that owns the data.
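To make the idea concrete, here is a minimal Python sketch of an agent that tours several data sites and carries only its learnings. Everything in it is an assumption made for illustration: the `DataSite` and `Learnings` classes, the site names, and the choice of a running mean as the "learning" are hypothetical stand-ins for whatever model a real agent would train.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Learnings:
    """The only thing the agent carries between sites: a small summary."""
    count: int = 0
    mean: float = 0.0


class DataSite:
    """Hypothetical data center; its raw records never leave this object."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)

    def update(self, learnings):
        """Fold the local records into the agent's summary and return it."""
        count, mean = learnings.count, learnings.mean
        for value in self._records:
            count += 1
            mean += (value - mean) / count  # incremental mean update
        return Learnings(count, mean)


def tour(sites):
    """Visit each site in turn, passing along only the learnings."""
    learnings = Learnings()
    for site in sites:
        learnings = site.update(learnings)
    return learnings


# Illustrative data: two sites holding their own raw records.
sites = [
    DataSite("site_a", [10.0, 12.0]),
    DataSite("site_b", [20.0, 22.0, 26.0]),
]
result = tour(sites)
print(result.count, result.mean)
```

The agent ends up with a result computed over all five records, yet the only thing that crossed a site boundary was a two-field summary.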

AI agents are one form of data as a service. They are given the task of mining the needed results from the data mass without human intervention or analysis. Of course, the data feed used by the AI agents does not have to be raw data. Instead, it can and most likely will be data products, which are preprocessed and optimized for AI. This leads us to another phenomenon.

Data and processing are localized

The above holds if you store all the raw data from billions of sensors, each creating new data points sometimes several times a second. The alternative is to preprocess the raw data near the source. This is where edge computing looks like a tempting approach.

According to Gartner, edge computing doesn’t compete with cloud computing but will complement and complete it. Edge computing is part of a distributed computing topology in which information processing is located close to the physical location where things and people connect with the networked digital world.

Now with edge computing, we can calculate signals from the raw data and store the analyzed data as input for later use. This makes sense because a single data point has little value for the consumer or for automation that adjusts systems in real time. The applied preprocessing lets us store less data over the longer term. It also reduces the need to transfer data over networks, easing the burden on networks that already suffer from bandwidth and latency issues.
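As a rough illustration of that preprocessing step, the Python sketch below collapses a stream of raw sensor readings into one summary record per fixed-size window. The function name `summarize_windows`, the window size, and the sample readings are all assumptions made for this example. A real edge device would pick signals that suit its own use case.

```python
from statistics import fmean


def summarize_windows(readings, window_size):
    """Collapse raw readings into one summary record per fixed-size window."""
    summaries = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        summaries.append({
            "count": len(window),   # how many raw points the record replaces
            "min": min(window),
            "max": max(window),
            "mean": fmean(window),
        })
    return summaries


# Illustrative raw readings, for example temperatures from one sensor.
raw = [20.0, 21.0, 22.0, 30.0, 31.0, 32.0, 25.0]
signals = summarize_windows(raw, window_size=3)
print(len(raw), "raw points ->", len(signals), "summary records")
```

Only the summary records would be stored or sent onward, so both storage and network traffic shrink roughly in proportion to the window size.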

These two phenomena, data gravitation and edge computing, are not opposing forces; they can support and enhance each other for more efficient data value mining. Keep an eye on both.