Wednesday, April 21, 2010

The data deluge

One of the most provocative things a privacy geek can say is "data minimization is dying". Data minimization has been one of the foundations of traditional privacy-think. The idea is basic and appealing: privacy is better protected when less data is collected, when less data is shared, when data is kept for shorter periods of time. This explains the endless debates in privacy circles about how many months computer or phone logs or passenger-name records should be retained, as though a numbers game about retention were the key issue in privacy. It isn't, but a debate over numbers is simple and appealing, and can be relayed by the press in a simple manner.

But whether you like it or not, we're entering an age of data ubiquity. Clearly, technology trends are making this possible: advances in computing power, storage capacity and Internet transmission have all allowed this to happen. And like all trends in technology, it will have good and bad applications: the same ease of transmission of data that enables billions of people to access information from around the globe makes it easy to transmit malicious viruses as well.

Statistics about the scale of the data deluge are indeed sobering, even if they reflect scales that human brains can't really understand. There are over a trillion web pages now, growing by billions per day. I read that there are now over 40 billion photos on Facebook alone. YouTube users upload over 24 hours of video every minute. The Economist reported that the total amount of data in the world is growing by 60% per year. No matter where you turn on the web, the scale of data growth is stunning. Even if you find concrete steps to advance data minimization, you're just taking a few drops out of the ocean of the data deluge.

There's no doubt that the Information Age is doing a lot of great stuff with this data deluge. It's also true that this data deluge is posing unprecedented challenges to privacy. I've struggled with this conundrum for many years. I don't think there's a better solution than trying to create maximum transparency and putting control over data back into people's hands, as best as possible. Trying to stop the data deluge is either Sisyphean or chimerical. But trying to decide on behalf of people also undermines the fundamental dignity and choice that each individual should be able to exercise over his/her own data. Of course, not all people can or will exercise responsible control over their own data. But putting transparency and control into users' hands is much like democracy. It fundamentally empowers the individual to make choices and trade-offs about data: making choices between data benefits and privacy. It's not perfect, of course, but it's still better than putting someone else (like governments or companies) in charge of those decisions. I think companies, governments and privacy professionals should define success foremost by whether we contribute to putting people in charge of their own data. As Churchill said: "It has been said that democracy is the worst form of government except all the others that have been tried."


Álvaro Del Hoyo said...


People are already in charge because they have to exercise their right to privacy, which is a responsibility at the same time, but empowering them through privacy-enhancing technologies will lead them to try to stop the data deluge, which, as you said, will be Sisyphean or chimerical.

Given the data deluge we are living through, which grows every day, what will be chimerical in any case is anonymization, and that's why a minimum retention period is important, even though it could be very difficult to determine.

Data processors are responsible for what they choose to process, so you have to be transparent, minimize the amount of data, and process it faithfully, securely and lawfully.

How many people are using Google services around the world, and how many of them are using Google Dashboard on a regular basis?

In any case, notwithstanding some big and minor mistakes, I much prefer Google's privacy management to other providers' management style.

Please keep working hard on privacy as you have until now. I really appreciate all your efforts, even when I don't always agree with your approach.


Bonnie Yu 余巧 妍 said...

Completely agreed.

Privacy vs. data sharing is a fine line. Even in our everyday lives (without technology) we need to be careful about what we say: how much and what to share with our acquaintances, friends, boss.

It's a judgement that we need to make constantly.

I like the sharing of data because we can combine our collective knowledge and solve problems better.

At the same time, I do recognize there should be some education for people e.g. don't tell people where you live and when you're not at home.