We are naturally inclined to think that our relationship with data develops solely in the world > data > use direction, in which data captures what happens in the world, and we use data to understand events in the world and hopefully predict future events.
By thinking like this, however, we lose sight of a complex mechanism of reciprocal interactions which, although it can perhaps be classified as secondary, is nonetheless essential for defining, supporting, and evolving that ecosystem made up of events, phenomena, their measurements, and decisions about them.
We must be aware that if, on the one hand, data measures what happens, on the other hand, the decisions we make on the basis of data define the actions we implement in the world, and these actions, capable of changing the data’s trend, condition the subsequent measurements captured by the data. Those measurements in turn inform subsequent decisions, triggering a complex mechanism of action-reaction that shapes the reality we live in, making it fluid and changeable, and making what happens intimately connected with what we do.
A Data Ecology
Interaction is therefore key, and it leads to a much greater level of complexity than we are led to imagine is possible. It leads to the definition of a complex ecosystem made up of elements that can be classified on different levels and that, just as in a traditional ecology, are essential for the ecosystem to be alive and healthy.
This ecosystem must obviously be respected, with attention to the entire process, from measurement to decision. If it is true that data is a renewable resource, it is also true that, from its collection onward, it requires resources and energy. This creates an intersection between the data ecosystem and the world we live in, where what happens in one is reflected in the other.
Data collection should therefore be parsimonious. The feeling may be that the data is there, ready to be taken, but collecting it without a purpose is just a waste of resources, like stockpiling food you do not plan to consume.
Data storage should also be prudent, because, once again, copying and replicating data just because someone might need it is not exactly an environmentally friendly practice, given that each data warehouse would consume energy and require resources, which would also be needed to transport data from one warehouse to another.
A New Logistics
In other words, what we need is a new data logistics, which governs how to optimize existing data warehouses without creating new ones for the sole purpose of storing what is already stored elsewhere, and only shipping what must be delivered, when the data is actually requested.
This new logistics would not replace the existing logistics but enhance it, and as such, it should be used wherever it is convenient to do so. It would not be a matter of decommissioning all existing data warehouses but of evaluating whether a new need for integration, a new model, or even a new paradigm truly necessitates new construction. Rather, this new logistics would prompt one to think of further exploiting existing technologies and to rely on a new model of connection between them, one that would move data only when it makes sense to do so, and with maximum efficiency, reducing the resources necessary to deliver what is requested.
This type of environmentally friendly approach would be well supported by logical architectures for data integration, which would not only satisfy the exploratory needs of those who have to use data but would also be capable of delivering data only when those who need it are ready to use it.
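To make the idea of delivering data only when it is requested more concrete, here is a minimal sketch in Python. It is an illustration under my own assumptions, not a description of any particular product: the `LogicalView` class, the `fetch_orders` function, and the sample rows are all hypothetical. The view only stores a reference to each source, and nothing is read or copied until a consumer actually iterates over a query.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


class LogicalView:
    """Hypothetical logical layer: registers sources without copying their data."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}
        self.rows_moved = 0  # counts only the rows actually delivered to a consumer

    def register(self, name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        # Registration stores how to reach the source; it does not call fetch().
        self._sources[name] = fetch

    def query(self, name: str, predicate: Callable[[Row], bool]) -> Iterator[Row]:
        # Generator: the source is contacted only when the consumer starts iterating.
        for row in self._sources[name]():
            if predicate(row):
                self.rows_moved += 1
                yield row


calls: List[str] = []  # records each time the source is actually read


def fetch_orders() -> List[Row]:
    # Stand-in for an existing warehouse; the rows are invented sample data.
    calls.append("orders")
    return [
        {"id": 1, "region": "EU"},
        {"id": 2, "region": "US"},
        {"id": 3, "region": "EU"},
    ]


view = LogicalView()
view.register("orders", fetch_orders)  # no data is read at this point

eu_orders = list(view.query("orders", lambda row: row["region"] == "EU"))
print(eu_orders, view.rows_moved)
```

In this sketch, only the rows that satisfy the consumer's request are handed over, and the source is left where it already is rather than being replicated into a new store.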
A data ecology should therefore be based on the awareness that data is not anything in and of itself but constitutes a way to dialog digitally with the world, representing it but also, through the actions that are based on this representation, influencing it, triggering the action-reaction mechanism described earlier. It is precisely this complex interaction that must adhere to principles of sustainability and respect for the available resources, because no digital world is ever completely digital, given that what is digital will always be managed by what is physical.
A Broader Perspective
Recognizing the complexity of a data ecosystem and the mutual interactions that come to life within it, we should start thinking about a change that I dare say is ontological, one that moves us from an anthropocentric approach to data to a biocentric one.