Thursday, April 16, 2009

The Cloud: policy consequences for privacy when data no longer has a clear location

Cloud Computing has become one of the more influential tech trends of our day. The Cloud is roughly analogous to remote computing, where computing and storage move away from your personal device to servers run by companies. A simple example is the online photo album, which lets users move their pictures off personal computers and into a secure and accessible space on the Web. Some Cloud services, like Hotmail, have been around for roughly a decade. Others have appeared since; almost all of Google's services, for example, run in the Cloud. As these services become more widely used, it's important to ask how our privacy laws and regimes should deal with this new phenomenon.

Some privacy laws, such as the EU Data Protection Directive, base regulation in part on the location of data. But if data is in the Cloud, where exactly is that? Data in the Cloud exists within the physical infrastructure of the Internet: in other words, on the servers of the companies offering these services, as well as on users’ own machines. Cloud services are built on the idea that holding data in the Cloud lets users access and share it from anywhere, at any time and from any Internet-enabled device.

To know the “location” of data in the Cloud, you’d need to understand the architecture of data centers, among other things. A data center is a building that houses many, many computers, not too different from the ones you may have in your home. Some companies, like Google, have data centers in multiple locations. Companies try to pick places that, among other things, have a skilled workforce, reasonable local business regulation and nearby sources of low-cost, abundant electricity. They tend not to provide too many specific details about these data centers, for a couple of reasons. First, the data center industry is highly competitive, and companies try not to disclose details that might give competitors a leg up. Second, knowing that users' personal information is stored on these computers, companies take the privacy and security of this data seriously and make sure the buildings are well secured, so that no one could just walk out with a computer holding your credit card information. Operating data centers in several locations also has technical benefits. Geographical location can be chosen to enhance the speed of a service: serving European users from a European data center, for example, might be faster than having the data cross the Atlantic. And having data centers in different locations allows companies to optimize computing power, automatically shifting work from one location to another depending on how busy the machines are.

Moreover, Cloud applications are designed not to lose users’ data and to respond to queries quickly. Applications therefore usually replicate users’ data in more than one place. No Internet user would be happy to lose access to all their email or calendar information, for example, just because the power went out at one data center. Applications may also dynamically load-balance their users among different data centers, so the location of a particular user's data may change over time.
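For readers who like to see the idea in code, here is a rough sketch of why replication and load balancing make the location of data a moving target. It is only an illustration under simplifying assumptions: the data center names, load figures and the place_replicas function are all made up, and real services use far more sophisticated placement logic.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str          # hypothetical name, for illustration only
    region: str
    load: float        # fraction of capacity in use, 0.0 to 1.0 (made-up figure)
    online: bool = True


def place_replicas(centers, copies=2):
    """Return the names of the least-loaded online data centers.

    A toy stand-in for replica placement: keep `copies` copies of a
    user's data wherever there is currently the most spare capacity.
    """
    available = sorted((c for c in centers if c.online), key=lambda c: c.load)
    return [c.name for c in available[:copies]]


centers = [
    DataCenter("dc-ireland", "EU", 0.40),
    DataCenter("dc-oregon", "US", 0.70),
    DataCenter("dc-belgium", "EU", 0.55),
]

# Initially both copies of the user's data sit in the EU.
before = place_replicas(centers)

# A power outage in one location and a quieter period in another
# are enough to move one copy across the Atlantic.
centers[0].online = False
centers[1].load = 0.30
after = place_replicas(centers)

print("before:", before)
print("after: ", after)
```

Nothing about the user or the service changed between the two calls, yet one copy of the data ended up in a different jurisdiction.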

For all these reasons, it’s actually very hard to answer the apparently simple question: “where’s my data?” Indeed, it's becoming problematic that existing EU data protection laws were largely written in an era when data had an easily identifiable location. For example, EU laws impose restrictions on the transfer of personal data outside the EU to any jurisdiction where there is not "adequate" data protection. In the past, "transfer" was understood as the physical shipment of data, such as sending a computer tape or paper files to an office in a faraway location. Nowadays, however, almost any activity on the Internet can involve a transfer of data outside the EU. Sending a document to a colleague in New York, for example, can technically be considered a transfer of material outside the EU. In today's era of connectivity, strict and literal application of these laws would cause more than just a headache for companies and regulators: it would cause the Internet to shut down.

In this Internet age, when data flows around the planet at the click of a mouse, everyone agrees we need to identify a better model of privacy protections. Data doesn't start and stop at national borders when it travels on the information superhighway. From a privacy perspective, the important question is not “where is my data?”, but rather “who holds my data, and what are their privacy policies?” For a user, the important thing is to research and understand the data protection policies of the company that holds the data, regardless of its location.

I’ve looked at various laws around the world, and I’m impressed by the far-sighted model adopted in Canada’s privacy laws. I can’t do better than just quote the Office of the Privacy Commissioner:

http://www.privcom.gc.ca/information/guide/2009/gl_dab_090127_e.asp

"European Union member states have passed laws prohibiting the transfer of personal information to another jurisdiction unless the European Commission has determined that the other jurisdiction offers "adequate" protection for personal information. In contrast to this state-to-state approach, Canada has, through PIPEDA, chosen an organization-to-organization approach that is not based on the concept of adequacy… [U]nder PIPEDA, organizations are held accountable for the protection of personal information transfers under each individual outsourcing arrangement…

Regardless of where the information is being processed - whether in Canada or in a foreign country - the organization must take all reasonable steps to protect it from unauthorized uses and disclosures while it is in the hands of the third party processor. The organization must be satisfied that the third party has policies and processes in place, including training for its staff and effective security measures, to ensure that the information in its care is properly safeguarded at all times. ... [O]rganizations must in their own best interests, as well as those of their customers, do what they can to protect the information."

Canada’s approach preserves privacy protections and holds data collectors accountable for them, regardless of where the data is located. Canada has blazed a trail that will help guide us in the age of the Cloud.