A new definition of truth?
I try to spend as much time as possible reading papers about algorithms, data and information. I recently came across a piece called “Le régime de vérité numérique. De la gouvernementalité algorithmique à un nouvel état de droit”. There is also an English translation, which gives the title as “The Digital Regime of Truth: From the Algorithmic Governmentality to a New Rule of Law”.
This is a lecture given by Antoinette Rouvroy in October 2014 about the impact of big data on our concept of truth, and how this may affect the choices we make and the agency we have.
Despite studying philosophy, I find it very difficult to understand texts on first reading. So I wanted to write an interpretation and summary of the lecture as I see it after reading it a couple of times. It’s also a fairly difficult translation*, so hopefully this is written in slightly plainer English, though it may miss some of the nuances of what she says.
For me, the main takeaway is her insight about how data is analysed in an age of ‘big data’ compared to before. Previously, scientists created mathematical models. These were hypothesised by people who had a basic understanding of human desires and motivations, by virtue of being human themselves. The models were meant to be analogous to the physical world we live in. They were proved right or wrong according to the data that social scientists gathered about the world.
Models weren’t perfect. George Box famously said: “All models are wrong. But some are useful”. However, this model-first approach served as the gold-standard scientific method for many years.
Compare this to the world of today. Now we collect a huge amount of data about actors, machines and the environment. So instead of devising models and hypotheses that can be proved successful or otherwise by the data sources, we just let the data ‘create hypotheses’ themselves. If we give the machines enough data points, then they can detect patterns that we may never have thought of.
It is like we are creating a parallel world. We believe that this data-filled world is objective. We are able to uncover ‘truth’ about the world in a way that we have never managed to do before. And given that we have created this parallel world with event sources that we can track, we believe we can create some objective certainty.
The Big Data Ideology
Chris Anderson wrote in Wired in 2008:
“This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.”
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete
Seems like a good idea, right?
Well, no, argues Rouvroy. She believes that if we think like this, then all our notions of meaning and truth are thrown into crisis. This crisis she calls the ‘crisis of representation’. We doubt the knowledge that we create with our own human minds, the experiences that we have, the causation and correlations that we believe and experience.
In addition to this, the insight we get from a world of data is not the objective truth. This is because there are many parts of our human experience that cannot be reduced to specific data points or events.
For example, you cannot create a data point for all the things a person chooses not to do. You cannot account for an overwhelming emotion like misery, or for what it feels like to act out of pity. You also cannot document the future using this data. All the data points we have will account for the immediate present and the past; we cannot account for all the possibilities of what can happen in the future.
What’s more, the data that is used to create this objective world is ‘cleaned’ to an extent. To make sure a data point can be accurately interpreted, ambiguity is often removed from it. So even the data which is in the ‘black box’ isn’t totally accurate or representative of our real world.
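As a rough sketch of what this kind of cleaning can look like in practice, consider the Python below. The survey question, the answers and the yes/no mapping are all hypothetical, invented only to illustrate the point.

```python
# Hypothetical raw answers to the question "Do you trust this service?"
raw_answers = ["yes", "no", "Yes, mostly", "it depends", "no", "", "sometimes", "YES"]

# Only these values are treated as interpretable data points.
CLEAN_VALUES = {"yes": True, "no": False}


def clean(answers):
    """Keep only answers that map unambiguously onto a yes/no data point."""
    cleaned = []
    for answer in answers:
        key = answer.strip().lower()
        if key in CLEAN_VALUES:
            cleaned.append(CLEAN_VALUES[key])
    return cleaned


cleaned = clean(raw_answers)
dropped = len(raw_answers) - len(cleaned)
print(f"kept {len(cleaned)} of {len(raw_answers)} answers, dropped {dropped} ambiguous ones")
```

Here half of the answers disappear. ‘Yes, mostly’, ‘it depends’ and ‘sometimes’ are exactly the responses that carry nuance, and they never reach the dataset that later analysis treats as the record of what people think.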
This would be fine if we didn’t have such high expectations of data. These data sources are being used to summarise and hypothesise on the world around us. We believe that access to more data will give us a better understanding of the world. Often, what we mean by this is that we have a ‘more objective’ understanding of the world. If we believe that the ‘truth’ and ‘objectivity’ in the world can be governed by the data we mine, then this can be thought of as a new ideology. Rouvroy refers to this as a ‘big data ideology’: “each epoch has its own way to make the real manageable without it ever being accessible”.
So if the data reveals something about the world that we cannot understand without the data, then we believe this to be the actuality of the world. She states:
“The concept of truth is increasingly wrapped up at the expense of pure reality or pure actuality, to the extent that eventually things seem to be speaking by themselves.”
Disobedience and Human Agency
When it comes to big data, we have got excited about the ability of these new technologies to make accurate predictions about the world. As a result, our current actions dictate what we believe the consequences of those actions will be in the future.
However, this would be terrible for human agency, argues Rouvroy. This takes away the possibility that people and objects could behave differently to what we believe might happen.
It is already the case that decisions about the future are made by analysing what we think might happen. We sentence criminals based on their likelihood to reoffend. There is often talk of life and health insurance premiums being based on your likelihood of developing certain diseases.
However, when we start to base these decisions on the new ‘big data ideology’, we use this virtual world to make critical decisions which relate to humans and animals that are multi-faceted, and exist beyond the virtual world. This fiction causes decisions that will affect our lives.
Rouvroy states this is “by-passing subjectivity”. It takes away some of our potential disobedience to what the virtual world claimed was possible. To expand on this point, if we are governed by data and ‘intelligence logic’ then we begin to give credence to events that haven’t actually happened yet. We value the predictive power of the data source more highly than what happens in our real world. As a result, we are moving from a penal logic to an intelligence logic, and this fully calls into question our ability to choose. It could be the data ideology that leaves us totally powerless.
Why Am I Interested In This?
The area that I would like to improve is information access. At the moment, it feels like algorithms and the data sources we collect are influencing the decisions we make. I like to read more and more about the consequences of living in a world where the information we see is governed by algorithms. This is a critical paper to read.
There is a lot more to this lecture – I have to admit that I’ve read it a lot of times now and I still don’t feel that I’ve hit on the absolute crux of it. Nevertheless, I feel that thinking about data in this way can help us understand how and why it feels like we’re out of control of our environment. It could easily be because we care so much about simplifying our world that the truth of it is slipping away from us.
*If you are keen to read the translation, it may be helpful to know that ‘the actual’ may be a half translation of the French ‘actualité’, which also translates as ‘news’. I got most benefit from thinking about ‘the actual’ as events that can be recorded. For those who have experience in digital marketing, this terminology may make sense. I’d welcome a better translation!