I spent time with a group of privacy experts recently. We were discussing the intersection of privacy and antitrust law. Traditionally, these two fields were very separate, with separate laws, separate regulators, and separate practitioners. But the rise of the data-processing monopolies like Google and Facebook is forcing these two fields to converge. When a monopoly like Google Search or Facebook is based on processing vast amounts of personal data, and when no competitor could possibly compete with these data-gorged monopolies, well, it’s obvious that antitrust law should consider forcing these monopolies to share data with potential competitors. Otherwise, these monopolies will carry on with their “data barrier to entry”. Data is an essential input into any of these existing or future services.
Existing monopolies, like Google Search, do not want to share their data with potential competitors. Duh. So, they are making public arguments that such sharing would create a serious risk of violating the privacy of their users. But is that true?
Google has resorted to public blogging to warn its (3 billion) users of the risks of court-mandated data sharing. “DOJ’s proposal would force Google to share your most sensitive and private search queries with companies you may never have heard of, jeopardizing your privacy and security. Your private information would be exposed, without your permission, to companies that lack Google’s world-class security protections, where it could be exploited by bad actors.” https://blog.google/outreach-initiatives/public-policy/doj-search-remedies-apr-2025/
Now, let’s unpack that statement. Google is clearly stating that it collects “your most sensitive and private search queries”. Its privacy policy makes it clear that it collects, retains and analyzes that data to run and improve its own services (not just Google Search). So, Google clearly analyzes your “most sensitive and private” data itself, whatever the privacy implications for you. According to Google, the privacy issues only arise if that data is shared with other parties.
Now think about Google’s money machine, its ads network. Doesn’t that network do exactly what Google is here claiming is a terrible thing for users’ privacy? Google’s ads network collects vast amounts of its users’ “sensitive and private” surfing history, and shares it with “companies you may never have heard of”. Indeed, that’s exactly what the ads network does today. Not coincidentally, a separate antitrust case is underway regarding Google’s ads monopoly. So, let’s be clear: in the context of Google Search, Google claims sharing data with third parties would be terrible for users’ privacy, but in the context of Google’s ads network, all that sharing is just fine…
Privacy professionals should take a clearer look at the privacy implications of any court ordering Google to share Search data with competitors. Would that really raise any privacy issues? Some experts in the field are starting to discuss the issue: https://www.hklaw.com/en/insights/publications/2025/04/google-search-data-sharing-as-a-risk-or-remedy
Search is built on several different mountains of data, and each one has different privacy implications. We need to unpack data-sharing into its different categories to assess whether it has any impact on privacy.
The Index: the biggest data mountain is the Search index. That’s the index that Google Search creates by crawling the entire public web. It’s one of the largest databases on the planet, if not the largest. But it’s not a privacy issue: it’s just crawling the public web. Of course, there is public data on the public web, but it’s not a privacy issue to force Google to share such data with other parties, who could also access it on the public web.
User interaction data: with its 3 billion users, and over 20 years of operation, Google Search has the largest database of user interaction data on the planet. I’m guessing it’s 1 million times larger than that of its nearest competitor, Bing. (Google can correct my guess if it wishes to.) This user interaction data is essential to teach a Search engine’s algorithm how to guess what someone intends to find when they type a query. If you have billions of examples of what people are searching for, you can train your search algorithms accordingly. If you don’t have that data, you don’t have a chance. So, would it be a privacy issue, as Google menacingly suggests in its blog post, if it were forced to share such data? It depends. Yes, if it were forced to share search histories (i.e., search logs) with all of the personally-identifiable data that Google collects and shares. No, if it were forced to share anonymized data sets, such as anonymized search logs.
Fortunately, many years ago, Google introduced its policy to anonymize search query logs, after a number of months, in the interests of users’ privacy, and to respond to regulators’ pressure. I know something about that, since I worked on that privacy initiative, with my great former colleagues.
https://publicpolicy.googleblog.com/2008/09/another-step-to-protect-user-privacy.html
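For readers who want a concrete picture of what anonymizing a search log entry can involve, here is a minimal sketch in Python. It is an illustration only, not Google’s actual pipeline: the field names (`ip`, `cookie_id`, `timestamp`, `query`) and the three steps shown (dropping the cookie ID, zeroing the last octet of an IPv4 address, and keeping only the day of the search) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
import ipaddress


@dataclass(frozen=True)
class LogEntry:
    """A hypothetical raw search log entry (field names are assumptions)."""
    ip: str              # IPv4 address of the client
    cookie_id: str       # browser cookie identifier
    timestamp: datetime  # exact time of the search
    query: str           # the text the user typed


@dataclass(frozen=True)
class AnonymizedEntry:
    """The same entry after the illustrative anonymization steps."""
    ip_prefix: str  # IPv4 address with the last octet zeroed
    day: str        # ISO date only, no time of day
    query: str


def anonymize(entry: LogEntry) -> AnonymizedEntry:
    # Zero the last octet by taking the /24 network address (IPv4 only).
    network = ipaddress.ip_network(f"{entry.ip}/24", strict=False)
    return AnonymizedEntry(
        ip_prefix=str(network.network_address),
        # Keep the day of the search but discard the exact time.
        day=entry.timestamp.date().isoformat(),
        # The cookie ID is simply not copied over; the query is kept as-is.
        query=entry.query,
    )


raw = LogEntry("203.0.113.77", "abc123", datetime(2008, 9, 8, 14, 30), "weather paris")
print(anonymize(raw))
```

The output record has no cookie ID at all, and the IP address and timestamp are coarsened so that they no longer point to a single device or moment. The query text is passed through unchanged in this sketch.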
There is no privacy issue, none at all, with forcing a company to share anonymized user interaction data.
I get that Google is blogging as part of its anti-antitrust litigation strategy. It really, really, doesn’t want to share its data with potential competitors. Litigators will advance their clients’ interest, as best they can. The rest of us 3 billion users of Google Search can assess the intellectual honesty of their arguments. As far as I am concerned, there are profound privacy issues on the web: forcing the Google Search monopoly to share its non-personally identifiable data with potential competitors is not a privacy issue.