When talking about the data economy, you will sooner or later come across "the four Vs of Big Data". Since data is the raw material for understanding and for the data economy, it is good to know what we mean when we talk about the four Vs. The concept of "Big Data" can be difficult to grasp, but in short it refers to a huge amount of data that is growing at a rapid pace, is used in many places at the same time, comes in many different forms, and is often collected without a holistic management model or a clear picture of where the data comes from and how it will be used in the future. As you can deduce from these few features and from our previous blog posts, practically every company beyond the smallest has Big Data.
The focus here is to outline the Big Data context through four Vs: Volume, Velocity, Variety, and Veracity. More Vs have been defined for Big Data, and some of them are useful, but we think these four are the most crucial for managers to understand.
Volume: As the name implies, Big Data means a large amount of data, i.e. volume. Because the amount of data in all channels is constantly increasing, it is important to understand this factor. When you think of all the IoT devices, emails, online services, documents, social media platforms, and other sources that produce huge amounts of data every second, you get a good idea of why we are talking specifically about Big Data.
Velocity: As the name implies, this term covers the speed at which data is collected, produced, and utilized. Velocity in turn determines how much data you accumulate and how much of it you can actually put to use. As with volume, velocity is no substitute for quality, so Veracity should be considered in all Big Data thinking.
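To make the link between velocity and volume concrete, here is a small back-of-the-envelope sketch. The function name and the numbers (1,000 events per second, 500 bytes per event) are assumptions chosen purely for illustration, not figures from any real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def accumulated_bytes(events_per_second, bytes_per_event, seconds):
    """Estimate how much data piles up at a steady rate (hypothetical helper)."""
    return events_per_second * bytes_per_event * seconds


# Assumed example: a service emitting 1,000 events per second at 500 bytes each.
daily = accumulated_bytes(1_000, 500, SECONDS_PER_DAY)
print(f"{daily / 1e9:.1f} GB per day")  # prints "43.2 GB per day"
```

Even a modest stream adds up to tens of gigabytes a day, which is why velocity and volume are best considered together.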
Variety: Data diversity refers to all the different types and formats of data, and to the different ways in which data can be organized and modeled. This is perhaps one of the most difficult aspects for managers to grasp, but also one of the most interesting. Start by thinking about the different file formats you may have, such as e-mails, images, and Excel documents. Beyond that, data can be structured or unstructured: structured data lives in databases or data models, unstructured data in looser data stores, and semi-structured storage sits in between. It is also good to understand which information is media, such as videos or voice messages, and whether documents are written in different natural languages, where the same thing can be expressed in a number of different ways. Data can further be classified into, for example, streaming data and traditionally collected data. Finally, when all the variations of these factors are added together, the result is a highly multidimensional whole, and managing it requires genuine competence and a holistic approach.
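One rough way to get a first picture of variety is to count files by broad category. The sketch below is only an illustration: the category names and the mapping from file extensions to categories are our own assumptions, and a real inventory would need to look at content as well as file names.

```python
from collections import Counter
from pathlib import Path

# Assumed, simplified mapping from file extension to a broad data category.
CATEGORY_BY_SUFFIX = {
    ".csv": "structured",
    ".xlsx": "structured",
    ".json": "semi-structured",
    ".xml": "semi-structured",
    ".eml": "unstructured text",
    ".docx": "unstructured text",
    ".jpg": "media",
    ".mp4": "media",
    ".wav": "media",
}


def classify(filename):
    """Return the assumed category for a file name, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return CATEGORY_BY_SUFFIX.get(suffix, "unknown")


def summarize(filenames):
    """Count how many files fall into each category."""
    return Counter(classify(name) for name in filenames)


# Hypothetical file names, for illustration only.
files = ["sales.xlsx", "orders.json", "photo.jpg", "call.wav", "notes"]
for category, count in summarize(files).items():
    print(category, count)
```

A summary like this does not manage anything by itself, but it shows quickly how many different kinds of data are in play.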
Veracity: Put simply, this means the correctness and truthfulness of the data, together with all the different factors that affect them. We have talked about this a lot in other blog posts already: what happens if the data you collect is not reliable, if it is toxic, or, at worst, if you do not even know what kind of data you are using in your business or receiving from another supplier in the data economy?
Many studies and Big Data frameworks also add other Vs, such as Visualization, Virality, Value, and Viscosity. We think all of these can be useful ways to look at your own Big Data, but there is little point in a dogmatic debate about whether every possible factor has been considered when there is no real business need behind it. For example, modeling data into a data product for sale or distribution can matter far more for the data economy than visualization. Likewise, taking care of data ownership and contracts is more important than Viscosity or Virality. But as said, each of these can be extremely useful if your business need calls for it.