How do you measure privacy protections? There are many important questions that I ask, including these:
What data is collected?
Who has access to this data?
How is this data used?
Is this data transferred to third parties?
Can the data subject see and control this data?
Is this data protected by adequate security safeguards?
How long is this data retained before it is either destroyed or anonymized?
In reviewing this list, I think the last one is the least important in terms of measuring meaningful privacy protections for data. But curiously, it's precisely this one that I hear about the most as I move around Continental Europe, listening to the privacy concerns raised by media and regulators in the online debates of recent years. Why is that?
European privacy law has clear provisions that personal data should not be retained "longer than necessary". Naturally, this time period is left vague in the laws, since it would be impossible to prescribe precise time periods for the myriad different contexts, especially since retention is always justified by "legitimate purposes". I think there's a temptation to try to boil privacy down into something simple and numerical, and what could be simpler and more measurable than a time period? In practice, there's a vast spectrum of legitimate retention periods, even for similar services, when the retention periods are designed to respect the very different legitimate purposes for which the data is retained. To take some Google services as examples: Search logs (9 months), Instant Search logs (2 weeks), Suggest logs (24 hours), etc. To me, it's absurd to think that the most important privacy issue in Search is whether Search logs are retained for 6 or 9 months.
To take a different example: data retention rules in Europe (for government and law enforcement access) range from 6 months to 24 months, with each country in Europe debating and picking different time periods. Germany, for example, picked 6 months (but the German Constitutional Court struck down its version of data retention on other grounds), while France picked 12.
Curiously, the time dimension of data retention is almost entirely a Continental European privacy concern. It rarely registers as a meaningful vector in other countries, even in countries with very intense privacy debates. Of course, the euro-time-period debate is also intimately tied up with the debate about the so-called "right to be forgotten", the "droit à l'oubli", a well-intentioned idea that people should somehow be able to have parts of their own past (presumably the disagreeable parts) edited out of their personal histories. And, not coincidentally, this debate is most intense in countries with historical chapters that many people consciously or unconsciously want to forget, like Spanish society's conflict between remembering and forgetting the crimes of the Franco era.
I've spent a fair amount of time engaging in the time-period debate: "how many months is OK?" It's pretty repetitive after a while. Lots of people who can't be bothered to think about the issues will just say: "oh, that's too long". I strongly believe that personal data should not be retained for "longer than necessary", as required by European privacy law, and I generally believe that it's important for data controllers to justify their retention according to "legitimate purposes". Beyond that, reducing the online privacy debate to a numbers game risks focusing all the attention on only one aspect of the broader privacy debate (and in my opinion, on the least important aspect of the debate to boot). And I am very much not in the superficial privacy school of thinking that "shorter is always better".
To clear my head, I spent some time playing tennis this summer. Now that's a numbers game. By the way, I lost.