Tuesday, February 27, 2007

The Slippery Slope of Data Retention

The Article 29 Working Party issued a blunt Opinion in March 2006 about data retention: “The decision to retain communication data for the purpose of combating serious crime is an unprecedented one with a historical dimension. It encroaches into the daily life of every citizen and may endanger the fundamental values and freedoms all European citizens enjoy and cherish.”

The Working Party went on to make some concrete, practical recommendations for Member States to address when they implement the Directive. As someone who will likely be on the receiving end of law enforcement requests, and will likely struggle with the ambiguities of the law, I’d like to highlight four of their recommendations, all of which present slippery slopes indeed.

1) Since the Directive mandates retaining data for the purposes of investigating “serious crime”, that term should be defined. What is a “serious crime”? And which crimes are not “serious”? I’m sure terrorism and child pornography are “serious”. But is defamation “serious”? And if the law doesn’t define the term, who is going to decide: law enforcement, or the companies receiving these orders, or independent arbiters?

2) The data should only be available to specifically designated law enforcement authorities. The Working Party opined that a list of such designated law enforcement authorities should be made public. In the absence of such a public list, I’m sure that lots of officials will make requests for data. To take just one European country, France: are we talking about the gendarmerie, the police, the CRS, investigative magistrates, military personnel, diplomatic officials, or any of many other officials? And for companies dealing with cross-border issues, how else could they know which officials are “designated” in 27 different countries, each with different languages and legal systems?

3) Investigations should not entail large-scale data-mining. But in practice, who is going to enforce limitations on data mining: the companies that refuse to provide large amounts of data? Google famously went to court to challenge a DOJ subpoena in the US for large amounts of data, but 34 other companies receiving requests from the DOJ around the same time did not.

4) Access should be authorized on a case-by-case basis by judicial authorities or subject to other independent scrutiny. If this Working Party recommendation were implemented, it would indeed insert a level of independent review. In the absence of such a process, who ensures that the requests are indeed valid under the law? It’s optimistic to assume that all the recipient companies in Europe will exercise independent scrutiny, and only answer the types of requests that a judge or independent authority would have authorized.

We’re on a slippery slope, and we need much clearer rules. Or, as W. Somerset Maugham put it: “There are three rules for writing the novel. Unfortunately, no one knows what they are.”

Saturday, February 24, 2007

Raise your hands if you’re worried about Data Retention!

As Dan Quayle put it: “I believe we are on an irreversible trend toward more freedom and democracy – but that could change.”

It’s a flattering self-image in Europe to play Greece against the American Rome: confronting the clumsy boot of American government power with humanistic values like privacy. Witness the outrage in Europe over the transfers from Europe to the US of airline passenger name records or financial wire transfer data. And it has become common knowledge in Europe that the US Patriot Act sacrificed privacy and other civil rights in favor of the “war on terror.” It’s time for us to take a look in the mirror: at Europe’s own reaction to the terrorism in Madrid and London, the Data Retention Directive.

The goals of privacy and the goals of law enforcement are often in conflict in the best of times. In the worst of times, like the aftermath of terrorist strikes, politicians have taken a new look at the balance, and chosen to shift it away from privacy and towards the goals of law enforcement. The shock of a terrorist act is asymmetrical, moving the balance in one direction. The slow erosion of civil liberties hardly generates the shocks to move the balance back.

The Patriot Act is a grab bag of disparate measures, mostly meant to make it easier for law enforcement to access data to help them investigate terrorism. It’s a clumsy law, at best, and it over-rides many longstanding procedural safeguards to protect people’s privacy from the State. But it’s not a data retention law. It makes it easier for American law enforcement to get their hands on data, but it doesn’t impose an obligation for companies to retain data, in case law enforcement should someday want access to it. The EU Data Retention Directive takes the opposite approach: it imposes massive data retention obligations on companies in Europe to keep mountains of data in case law enforcement should someday decide to ask for it. You may disagree, but in terms of privacy, I think the Data Retention Directive is far worse than the Patriot Act: a law that mandates that you collect and maintain mountains of data for law enforcement is worse than a law that makes it easier for law enforcement to access pre-existing databases.

I doubt most Europeans realize that the Data Retention Directive will require that telcos and Internet “electronic communications service providers” (e.g., email providers) store all their traffic data for between 6 and 24 months. And do Europeans realize that some governments are trying to push the balance even further away from privacy towards the goals of law enforcement than required by the Directive? The German Ministry of Justice has drafted a law to mandate that email providers in Germany must verify the identity of their email customers, to stop the use of anonymous email accounts. The Netherlands Ministry of Justice has proposed a requirement to retain location data for 18 months, going far beyond the requirements of the Directive.

This massive invasion of privacy would be easier to swallow if the “bad guys” couldn’t easily evade being tracked anyway. Very simple technical measures allow anyone to use the Internet without leaving the tracks that the Directive would try to retain. In fact, it might be as easy as using non-European-based service providers. Today, Google does not verify the identity of its email users, and I can’t imagine it would start to do so, whatever the German law might say. I’m hardly alone in believing that users should be entitled to anonymous email accounts, for lots of reasons, ranging from a philosophic belief in the right to be anonymous online, to practical reasons, like trying to protect one’s account from spam.

If you have read the privacy news over the last few months, you would get the impression that the biggest threat to the privacy of EU citizens resulted from the transfer of pieces of their personal data to the US government, either when they fly to the US (those passenger name records) or when they do a financial wire transfer (using the “SWIFT” network of banks). If there is so much distrust in Europe about the US government getting its hands on such relatively minor pieces of data, why aren’t more people in Europe worried about their own governments getting access to vastly more data about them? Really, what’s more troubling: allowing the US government to see passenger information about the people on a flight from Amsterdam to New York, or allowing the government of The Netherlands to mandate that the location of every person in the country be tracked and stored for 18 months every time they use the Internet or the phone?

EU governments are required to implement the provisions of the Data Retention Directive into their national laws by 2009. They’re just getting started now, and the early indications are not good if you care about privacy.

Monday, February 19, 2007

Your Data is in the “Cloud”

Henry David Thoreau was prescient again, when he wrote: “You must not blame me if I do talk to the clouds.” We’re all doing that now, even if Thoreau had more to say than most of us.

So, if your data is in the cloud, where exactly is that? The cloud is the data that exists within the physical infrastructure of the Internet. Web 2.0 services are built on the concept that data held in the cloud enables users to access and share data from anywhere, anytime and from any Internet-enabled device. The cloud exists on the servers of the companies offering these services, as well as on the browsers of users’ own devices. To know the “location” of your data, you’d need to understand the architecture of data centers.

Some companies like Google have very large data centers in multiple locations. A data center is simply a warehouse building with stacks of server computers. Companies try to pick places that are near cheap, reliable sources of electricity. They tend to prefer not to specify publicly the exact locations of these data centers, for a couple of reasons. First, competitors are watching each other’s choices of data center locations. Second, strong security practices dictate that they be kept as low-profile as possible. Nonetheless, newspapers have written extensively about Google data center construction projects in Oregon and North Carolina, to name just two.

As a user of a Web 2.0 service, you expect your service provider not to lose your data and to respond to your queries quickly. Data centers therefore usually replicate users’ data in more than one place. Google users would not be happy if they lost all their data just because the power goes out in Oregon. And the geographical location of data centers can be optimized to enhance the speed of a service, e.g., serving European users from a European data center can be faster than having the data cross the Atlantic. Finally, having data centers in different locations allows companies to optimize computing power, automatically shifting work from one location to another, depending on how busy the machines are.

For all those reasons, it’s actually very hard to answer the apparently simple question: “where’s my data?” Yes, data protection law was largely written in an era when data did indeed have an easily-identifiable location. But, now, if you want to know how your data is being protected, the important question is not “where is my data?”, but rather “who holds my data?” and “what is the privacy policy being applied to my data?”

You can’t pin-point the location of the clouds, but you can still talk to them.

Tuesday, February 13, 2007

The Tangle of Cross-Border Law Enforcement Requests for Information

If you think the international mechanisms for cross-border law enforcement requests for information are clear, you’re wrong.

The Internet is a global creature. A user in Country A can transmit, say, child pornography to an individual in Country B from a server in Country C. So, how does non-US law enforcement bearing non-US court orders for information get what they need from a US company to investigate that?

In the US, there is a cumbersome process that requires a non-US law enforcement entity first to contact the US Department of Justice’s Office of International Affairs (“OIA”). OIA then passes the non-US law enforcement official’s request to a US Attorney’s Office. The US Attorney’s Office can then apply to a US District Court to be designated as a Special Commissioner who can then act on behalf of the non-US law enforcement entity. In practice, this process can take many weeks or months.

Surely, this is a process that needs to be streamlined. Of course, international negotiations are tedious and slow, but the needs of cross-border law enforcement collaboration are going to increase, so continuing to live with an antiquated mechanism will only become more painful over time.

The other relevant US law, the Electronic Communications Privacy Act, is silent on the ability of US companies to disclose such information directly to non-US law enforcement bearing non-US court orders. Because the law is silent, some companies no doubt have decided to respond directly to non-US law enforcement requests based on non-US court orders. And other companies have no doubt concluded the opposite.

The Internet’s global dimension has vastly out-paced the provincial processes of cross-border law enforcement requests for information. And I assume that means that some of the bad guys aren’t getting caught.

Monday, February 12, 2007

Terrorists are using Google Earth?

News reports have recently described cases where Western military forces raided terrorist lairs and found satellite images of sensitive sites from Google Earth. Governments are tasked with the awesome responsibility of protecting us from terrorist attacks. Sometimes, they turn to Google and ask for sensitive images to be removed or degraded. Is that the right approach?

First, some background. Google Earth is a digital globe on your personal computer. It combines satellite imagery, maps and Google search to bring the world's geographic information to your fingertips. I can still remember the first time I typed my home address into Google Earth and watched my computer screen zoom from space directly onto my home – even for someone used to technology, I just gasped. And I’m not alone. Google Earth is used by more than 100 million people.

Every user of Google Earth has his or her favorite examples, often including non-Google content called a “mash-up”. Here are mine. I watched the progress of the Tour de France across the lovely countryside of France. I saw the heart-breaking images of Banda Aceh before and after the tsunami, and learned that relief agencies used these images to plan their efforts. I studied the distribution patterns of avian flu and migration patterns of birds across the globe. And I look at my house and my neighborhood from the sky.

We all know that Google is working to help more people in more countries get access to more information. While we think about the security issues from giving people greater access to geographic data, we need to keep certain facts in mind: the imagery on Google Earth is not unique to Google. Google buys or acquires it from other companies. The imagery is not real-time, since the photographs were taken by satellites and aircraft over the last three years and are updated on a rolling basis. Commercial high-resolution satellite and aerial imagery of almost every country in the world is widely available from numerous different sources, and there are dozens of commercial satellite image providers in the world. Anyone who flies above or drives past a particular site can often get the same information. And several other sites, like Geoportail or MSN Virtual Earth, make similar satellite imagery available to their users.

The companies and governments that gather and distribute these images are primarily responsible for addressing the security issues they raise. And they sometimes address this problem by altering sensitive images before distributing the data. Look at the center of The Hague, and you’ll see a building which has been erased from the image: Google posted the image the way it was received.

At the same time, it’s all too easy to imagine a slippery slope, where governments go too far in requesting that certain images be removed: should images be removed of disputed territories in Kashmir? Of Israeli settlements on the West Bank? Of every British embassy around the world? Of entire regions of Russia? Of a politician’s holiday home? And which government and which department would decide which site is “sensitive”?

Governments control their airspace, and they can control which companies have the right to take aerial images, and to exclude certain zones. But satellite imagery from space is a different category. To take just one example, I think it’s a good thing that there are very detailed satellite images of North Korea on Google Earth, which the North Korean government would no doubt want to obscure.

Google has said publicly that it is always prepared to discuss security concerns directly with government officials. My personal view is that the right approach is generally not to change images, because I believe that more information gives people more choice, more freedom, and ultimately more power. And removing images from just one source is not a reliable basis for guaranteeing security.

Wednesday, February 7, 2007

Search Data: another conflict between Data Protection and Data Retention

Since the AOL incident, there has been a lot of discussion in privacy circles about the storage of search string data. The discussions generally focus on the time period during which such data is retained by the service provider, and whether or not data protection concepts should limit that time period. I have seen almost no discussion about whether or not the Data Retention Directive will require search string data to be retained. So, again, we are seeing a conflict between data protection and data retention requirements. Here are a few thoughts.

What does a search engine like Google collect when a user conducts a search? Google explains this on its site:

“4. What are server logs?
Like most Web sites, our servers automatically record the page requests made when users visit our sites. These "server logs" typically include your web request, Internet Protocol address, browser type, browser language, the date and time of your request and one or more cookies that may uniquely identify your browser.
Here is an example of a typical log entry where the search is for "cars", followed by a breakdown of its parts: - 25/Mar/2003 10:15:32 - http://www.google.com/search?q=cars - Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969
• is the Internet Protocol address assigned to the user by the user's ISP; depending on the user's service, a different address may be assigned to the user by their service provider each time they connect to the Internet;
• 25/Mar/2003 10:15:32 is the date and time of the query;
• http://www.google.com/search?q=cars is the requested URL, including the search query;
• Firefox 1.0.7; Windows NT 5.1 is the browser and operating system being used; and
• 740674ce2123e969 is the unique cookie ID assigned to this particular computer the first time it visited Google. (Cookies can be deleted by users. If the user has deleted the cookie from the computer since the last time s/he visited Google, then it will be the unique cookie ID assigned to the user the next time s/he visits Google from that particular computer).”
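
To make the breakdown above concrete, here is a minimal Python sketch that splits such an entry into its parts. It assumes the fields are separated by " - " exactly as in the quoted example. The function name `parse_log_entry` is hypothetical, and because the IP address does not appear in the example as reproduced here, the sketch substitutes 192.0.2.1, an address reserved for documentation. It is an illustration only, not a description of how Google actually processes its logs.

```python
from urllib.parse import urlparse, parse_qs

# Sample entry modeled on the quoted example. The IP address 192.0.2.1 is a
# placeholder (an assumption), since the quoted entry does not show one.
SAMPLE = ("192.0.2.1 - 25/Mar/2003 10:15:32 - "
          "http://www.google.com/search?q=cars - "
          "Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969")


def parse_log_entry(line):
    """Split a ' - ' separated log entry into its five named parts."""
    ip, timestamp, url, browser, cookie_id = [
        part.strip() for part in line.split(" - ")
    ]
    # The search string travels inside the requested URL as the "q" parameter.
    query = parse_qs(urlparse(url).query).get("q", [""])[0]
    return {
        "ip": ip,
        "timestamp": timestamp,
        "url": url,
        "query": query,
        "browser": browser,
        "cookie_id": cookie_id,
    }


entry = parse_log_entry(SAMPLE)
print(entry["ip"], entry["query"], entry["cookie_id"])
```

Note that the search string, the IP address and the cookie ID all sit in the same record, so anyone holding the entry holds all three together.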

So, every time a user conducts a search, a so-called “server log” is collected by the search engine. How does the new Data Retention Directive apply to this?

In 2006, the EU passed the Data Retention Directive, which obligates certain types of network operators to retain certain types of data for mandatory periods, in order to make them available on request to law enforcement agencies. The Directive applies to “providers of publicly available electronic communications services” and “public communications networks”, but these terms are interpreted differently in the various Member States that have to implement the Directive, which gives rise to questions of interpretation. For example, in France and Italy, it is expected that the implementation of the Directive will apply to Internet cafes, bars, restaurants, hotels, and airports, to the extent that they provide services such as public Internet terminals. On the other hand, preliminary discussions in other Member States, such as Germany and Spain, indicate that they are likely to adopt a narrower interpretation which will include only entities that directly provide telecommunications and Internet access services.

So, it is possible that data retention requirements could also apply to a search engine operator such as Google in certain Member States. Given the ubiquity of Internet search engines, it is hard to believe that law enforcement authorities will not at some point turn to a search engine operator to request personal data in order to fulfill some law enforcement interests. While the Data Retention Directive does not specifically mention search string data, it does require the retention of certain types of data about the user’s Internet connection (sometimes called “traffic data”) that can be so closely intertwined with search string data that it may be nearly impossible to separate them.

The Directive gives the EU Member States the option of requiring retention of the data for between six and twenty-four months, and in exceptional cases even longer. Not all Member States have so far implemented the Directive, but the implementations that have so far been enacted, and the legislative proposals for implementation, indicate that many Member States are likely to select a mandatory retention period of at least one year, or even longer. For instance, in The Netherlands, a retention period of 18 months has been proposed, while legislation and proposals in the Czech Republic, France, Spain and the UK set it at one year. The length of these periods indicates that personal data may need to be kept for a substantially longer period than data protection rules may imply. In addition, the US Department of Justice has called for mandatory two-year data retention.
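
To give a feel for what these periods mean for a single record, here is a rough Python sketch that computes how long one log entry would have to be kept under each of the periods mentioned above. The helper `add_months` and the sample date are illustrative assumptions, and counting in calendar months from the date of the communication is a simplification; national laws may count differently.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so that
    31 August plus six months gives the last day of February.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Retention periods in months, taken from the figures in the text.
RETENTION_MONTHS = {
    "Directive minimum": 6,
    "One year (Czech Republic, France, Spain, UK)": 12,
    "Netherlands proposal": 18,
    "Directive maximum": 24,
}

# Hypothetical date on which a communication was logged.
logged_on = date(2003, 3, 25)

for label, months in RETENTION_MONTHS.items():
    print(f"{label}: retain until {add_months(logged_on, months)}")
```

Under these assumptions, an entry logged on 25 March 2003 would be kept until September 2003 at the Directive minimum, and until March 2005 at the maximum.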

The differing approaches to the retention of search engine data under data protection law and data retention law demonstrate the tension between these two areas, and also show that the retention of search engine data must be judged under both of them. This is hardly the first example of a conflict between data retention and data protection, but it deserves more discussion in the context of search.

Tuesday, February 6, 2007

Gmail and Targeted Ads: is that the right issue?

When Gmail was launched in April 2004, there was an outcry among privacy advocates that its model of email scanning for advertising purposes was a troubling new privacy invasion. So, with the hindsight of nearly three years, where do I think these privacy advocates were right, and where were they wrong? I’ll quote some of Google’s public statements on Gmail here.

Everyone agrees that email communications should be confidential. So, the question is whether a particular model of ad targeting violates that principle. All major free webmail services carry advertising, and most of it is irrelevant to the people who see it. Some services which compete with Gmail attempt to target their ads to users based on their demographic profile (e.g., gender, income level or family status). Google believes that showing relevant advertising offers more value to users than displaying random pop-ups or untargeted banner ads. In Gmail, users see text ads and links to related pages that are relevant to the content of their messages. The links to related pages are similar to Google search results, and are culled from Google's index of web pages.

Ads and links to related pages only appear alongside the message that they are targeted to, and are only shown when the Gmail user, whether sender or recipient, is viewing that particular message. No email content or other personally identifiable information is ever shared with advertisers. In fact, advertisers do not even know how often their ads are shown in Gmail, as this data is aggregated across thousands of sites in the Google Network.

All email services scan your email. They do this routinely to provide such popular features as spam filtering, virus detection, search, spellchecking, forwarding, auto-responding, flagging urgent messages, converting incoming email into cell phone text messages, automatic saving and sorting into folders, converting text URLs to clickable links, and reading messages to the blind. These features are widely accepted, trusted, and used by hundreds of millions of people every day.

Google scans the text of Gmail messages in order to filter spam and detect viruses, just as all major webmail services do. Google also uses this scanning technology to deliver targeted text ads and other related information. This is completely automated and involves no humans.
When a user opens an email message, computers scan the text and then instantaneously display relevant information that is matched to the text of the message. Once the message is closed, ads are no longer displayed. It is important to note that the ads generated by this matching process are dynamically generated each time a message is opened by the user--in other words, Google does not attach particular ads to individual messages or to users' accounts.

Some advocates expressed the concern that Gmail may compromise the privacy of those who send email messages to Gmail accounts, since the senders have not necessarily agreed to Gmail's privacy policies or Terms of Use. But using Gmail does not violate the privacy of senders since no one other than the recipient is allowed to read their email messages, and no one but the recipient sees targeted ads and related information.

In an email exchange, both senders and recipients should have certain rights. Senders should have the right to decide whom to send messages to, and to choose an email provider that they trust to deliver those messages. Recipients should also have certain rights, including the right to choose the method by which to view their messages. Recipients should have the right to read their email any way they choose, whether through a web interface (like Gmail, Yahoo! Mail, or Hotmail), a handheld device (like a BlackBerry or cellphone), a software program (such as Outlook), or even via a personal secretary.

On the Internet, senders are not required to consent to routine automatic processing of email content, such as for spam filtering or virus detection, or the automatic flagging or filing of messages into folders based on content. Email providers essentially act as personal assistants for subscribers, holding and delivering their email messages and carrying out various tasks (such as deleting spam, removing viruses, enabling search, or displaying related information). And of course, recipients have the right to forward, delete, print or distribute any message they receive.

So, is there a privacy issue with Gmail?

There are issues with email privacy, and most of these issues are common to all email providers. The main issue is that the contents of your messages are stored on mailservers for some period of time; there is always a danger that these messages can be obtained and used for purposes that may harm you, such as possible misuse of your information by governments, as well as by your email provider. Careful consideration of the relevant issues, close scrutiny of email providers' practices and policies, and suitable vigilance and enforcement of appropriate legislation are the best defenses against misuse of your information. I’ll come back to these issues later, since they’re the new set of privacy challenges in Web 2.0 services.

Monday, February 5, 2007

Are IP addresses "Personal Data"?

I worked with other privacy professionals in the European Privacy Officers Forum to answer the question: “Are IP addresses Personal Data?” A simple question doesn’t always have a simple answer. We concluded that the answer depends on the context. We concentrated specifically on the issue of ‘identifiability’ and where the dividing line is drawn between “personal data” and “anonymous data”.

Personal data is very broadly defined in Article 2 of the Directive as “any information relating to an identified or identifiable natural person…”. Where this definition is applied without qualification, it may be interpreted in such a way that data will remain ‘personal’ and subject to the full remit of the law if individuals remain in any way identifiable. We believe that the concept of personal data should rather be defined pragmatically, based upon the likelihood of identification. In our view, it should not be the case that an organisation has to be sure that there is no conceivable method, however unlikely in reality, by which the identity of individuals can be established. This is a highly impractical approach, usually requiring considerable resource to be expended on disproportionate statistical analysis. The responsibility of organisations is to ensure that effective safeguards are put in place to prevent the data from being processed in such a way that it leads to identification. The rights, freedoms, and legitimate interests of individuals can more than adequately be protected if data is processed in such a way that all means likely reasonably to be used to identify the said person will fail. In making judgements about whether information is personal data, an organisation should consider the following factors:

1. How that data could be matched with publicly available information, analysing the statistical chances of identification in doing so;
2. The chances of the information being disclosed and being matched with other data likely held by a third party;
3. The likelihood that ‘identifying’ information may come into their hands in future, perhaps through the launch of a new service that seeks to collect additional data on individuals;
4. The likelihood that data matching leading to identification may be made through the intervention of a law enforcement agency, and
5. Whether the organization has made legally binding commitments (either through contract or through their privacy notice) to not make the data identifiable.

Considerations on all these issues are of course contextual, based upon an assessment on a case-by-case basis of the likely chances that identification may occur in any reasonably foreseen set of circumstances. In terms of ‘reasonableness’ or ‘fairness’, an additional aspect of this assessment may involve consideration as to the sensitivity of the information and any potential harm that could arise for individuals if data is later made identifiable.

However, some Member States, such as Belgium, Sweden and France, have interpreted data protection law to mean that if someone can be identified from certain data, no matter how technically or legally difficult it is to ascertain the identity of the physical person from such data, then the data is deemed to be ‘personal data’.

We suggest that a significant step can be taken in solving this issue by providing qualifying guidance on the limits of ‘personal data’. This should be pragmatic and emphasise that identification must be subject to the reasonableness standard. For example, a definition such as that given in §3(6) of the German Federal Data Protection Act could be used as a basis for this interpretation:
“Depersonalisation means the modification of personal data so that the information concerning personal or material circumstances can no longer or only with a disproportionate amount of time, expense and labour be attributed to an identified or identifiable individual.”

The UK has adopted a pragmatic position: data are deemed personal if the individual to whom they relate is identifiable “from those data and other information in the possession or likely to come into the possession of the data controller” (UK Data Protection Act 1998, section 1(1)). As long as there is little or no chance of disclosure by the controller to a third party of information that could lead, in combination with data held by that person, to re-identification of individuals, then this approach seems more than reasonable.

The regulatory approach to IP addresses also illustrates the dilemma that the Directive’s sweeping definition of ‘personal data’ can cause. According to the stated position of the Working Party, “IP addresses attributed to Internet users are personal data and are protected” by the Directive (Article 29 Working Party, The Use of Unique Identifiers in Telecommunications Terminal Equipments: the Example of IPv6, Opinion 2/2002, WP 58, 10750/02/EN/Final, at 3). The Working Party reasoned that:

“data are qualified as personal data as soon as a link can be established with the identity of the data subject (in this case, the user of the IP address) by the controller or any person using reasonable means. In the case of IP addresses the ISP is always able to make a link between the user identity and the IP addresses and so may be other parties, for instance by making use of available registers of allocated IP addresses or by using other existing technical means”.

The Working Party have assumed that if an IP address is identifiable by one company (e.g., an ISP) it is personal data as far as all other companies are concerned, even if they have no access to the information that permits an association to the individual. But this assumption is very questionable. ISPs typically do not divulge IP account names. Indeed, many Member States have interpreted Article 6 of the 2002 Electronic Communications Data Protection Directive as prohibiting ISPs from divulging user information connected to IP addresses. If a third party cannot receive assistance from an ISP in associating an IP address with a particular user, the IP address is not personal data as far as the third party is concerned. From the third party’s perspective, the IP address is anonymous.

It is of note that this more pragmatic position is supported by jurisdictions with data protection legislation outside Europe, for example, Hong Kong. In May 2006, in a written reply to a member of the Legislative Council, the Secretary for Home Affairs (Dr Patrick Ho) outlined a policy position on IP addresses similar to that advocated above:

"An Internet Protocol (IP) address is a specific machine address assigned by the web surfer's Internet Service Provider (ISP) to a user's computer and is therefore unique to a specific computer. An IP address alone can neither reveal the exact location of the computer concerned nor the identity of the computer user. As such, the Privacy Commissioner for Personal Data (PC) considers that an IP address does not appear to be caught within the definition of "personal data" under the PDPO…” http://www.info.gov.hk/gia/general/200605/03/P200605030211.htm

While exact location and/or the particular user identity may not be required to qualify the IP address as personal data, Dr Ho’s point that the IP address only identifies a machine is important. In fact, this raises a slightly different, but associated, aspect of the concept of identifiability. In determining whether an IP address can be considered an item of personal data in itself, consideration should be given to the fact that the number is not allocated to a natural person but rather to an item of networked equipment. Data generated through the use of such equipment may be the result of intervention by a number of individuals, perhaps the members of an extended family each making use of a home PC, a whole student body utilising a library computer terminal, or potentially thousands of people purchasing from a networked vending machine. We should note that the number of internet-connected devices is set to explode in the coming years. To illustrate the point, it is envisaged that in the future every light bulb will have an IP address, to turn it on and off, and to send a signal when it needs to be replaced. In fact, the logic of this argument could be applied to a variety of unique identifiers that are not necessarily associated with a particular natural person, for example, RFID numbers. Clearly the more divorced the use of such a number is from the identity of a single natural person, the less strong the argument for considering such ‘identifiers’ as an aspect of personal data.

Whether or not these identifiers are personal data will turn on the context in which they are collected and how they are stored and processed.