Q: “You recently wrote an excellent article detailing the privacy implications of Visa and MasterCard’s new campaign to sell consumer information they collect to online advertisers. They say that this information isn’t personally identifiable, but I’m worried that this information might actually be used in some way to identify me. Am I right to be worried?”
A: As I mentioned in my article, companies like MasterCard and Visa collect an enormous amount of information about their cardholders, such as where you go, what you buy, and how much you spend. From this, they use algorithms to determine how you spend your time, your interests, and your tastes.
They get away with selling this information to advertisers by simply leaving out anything that identifies you individually, known as PII (personally identifiable information).
PII includes your name, of course, but also your address, age, gender, birthdate, schools attended, and jobs held, among other information. This matters because research has shown that nearly 90% of people in the U.S. can be uniquely identified from just three of these details: birthdate, ZIP code, and gender.
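To see why three ordinary details can single someone out, here is a minimal sketch in Python. The records are invented for illustration and `unique_fraction` is a hypothetical helper, not something MasterCard, Visa, or any researcher actually uses. It counts how many records in a nameless data set are the only one with their particular combination of birthdate, ZIP code, and gender.

```python
from collections import Counter

# Hypothetical toy records: (birthdate, ZIP code, gender). No names are present.
records = [
    ("1980-03-14", "78701", "F"),
    ("1980-03-14", "78701", "F"),
    ("1975-11-02", "78701", "M"),
    ("1992-07-30", "78705", "F"),
    ("1968-01-09", "78705", "M"),
]

def unique_fraction(rows):
    """Return the fraction of rows whose combination of values appears exactly once."""
    counts = Counter(rows)
    unique_rows = sum(1 for row in rows if counts[row] == 1)
    return unique_rows / len(rows)

fraction = unique_fraction(records)
print(f"{fraction:.0%} of records are unique on birthdate, ZIP code, and gender")
```

In this toy set, three of the five records are one of a kind. Anyone who knows those three facts about a person in it could pick out that person's record, even though no name appears anywhere.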
MasterCard and Visa can’t sell identifiable information about you without getting in big trouble, so they remove or hide certain details to meet privacy standards. This is called “anonymizing” data. Many privacy laws allow companies to share data if it is anonymized first.
The idea is that anonymizing your data theoretically makes it impossible to match any record with the person whose actions it records.
Anonymity Is Not the Same Thing as Privacy
However, the anonymization process is an illusion. Why? Anonymization doesn’t work unless you remove so much information that the data becomes almost useless.
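The trade-off can be sketched with a few lines of Python. This is an illustration under stated assumptions, not a description of how any card company anonymizes data: the records are invented, and `generalize` and `smallest_group` are hypothetical helpers. The sketch blurs each record (birth decade instead of birthdate, three ZIP digits instead of five) and checks the size of the smallest group of identical records before and after.

```python
from collections import Counter

# Hypothetical toy records: (birthdate, ZIP code, gender).
records = [
    ("1980-03-14", "78701", "F"),
    ("1981-06-22", "78702", "F"),
    ("1975-11-02", "78701", "M"),
    ("1979-07-30", "78705", "M"),
]

def generalize(row):
    """Coarsen a record: keep only the birth decade and the first three ZIP digits."""
    birthdate, zip_code, gender = row
    decade = birthdate[:3] + "0s"
    return (decade, zip_code[:3] + "**", gender)

def smallest_group(rows):
    """Return the size of the smallest group of rows that share identical values."""
    return min(Counter(rows).values())

before = smallest_group(records)
after = smallest_group([generalize(r) for r in records])
print("Smallest group before blurring:", before)
print("Smallest group after blurring:", after)
```

Before blurring, every record stands alone. Afterward, each one hides in a pair, but an advertiser now knows only a decade and a rough region, which is exactly the detail that made the data worth buying.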
The Internet has so many available public databases that any record with enough information on someone’s actions has a very good chance of matching identifiable public records, and thus, an identity.
For example, a few years ago, researchers at the University of Texas took “anonymous” movie rating data released by Netflix and linked some of it to online movie reviews posted by specific individuals. From there, they were able to infer these people’s political preferences and other potentially sensitive information.
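The general idea behind this kind of matching can be sketched in Python. This is a deliberately simplified illustration, not the method the researchers used: the data is invented, and `link_records` and its `min_overlap` threshold are hypothetical. It pairs each “anonymous” ID with the named public reviewer whose ratings overlap with it the most.

```python
# Hypothetical "anonymized" release: user names replaced by random tokens.
anonymized = {
    "user_8f3a": {("Movie A", 5), ("Movie B", 2), ("Movie C", 4)},
    "user_21cd": {("Movie A", 3), ("Movie D", 5), ("Movie E", 1)},
}

# Hypothetical public reviews posted under real names.
public_reviews = {
    "Alice Example": {("Movie A", 5), ("Movie C", 4)},
    "Bob Example": {("Movie D", 5), ("Movie E", 1), ("Movie F", 4)},
}

def link_records(anon, public, min_overlap=2):
    """Match each anonymous ID to the public name sharing the most ratings."""
    matches = {}
    for anon_id, anon_ratings in anon.items():
        best_name, best_overlap = None, 0
        for name, public_ratings in public.items():
            # Count the (movie, rating) pairs that appear in both sets.
            overlap = len(anon_ratings & public_ratings)
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        # Only report a match when enough ratings agree.
        if best_overlap >= min_overlap:
            matches[anon_id] = best_name
    return matches

print(link_records(anonymized, public_reviews))
```

Once a token such as `user_8f3a` is tied to a name, every other rating in that “anonymous” record, including ones the person never posted publicly, is tied to the name as well.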
In other words, anonymization is not an absolute protection.
More than that, it creates a false sense of security. Any data that is valuable enough to sell to advertisers has enough information on it to be de-anonymized.
The Implications of Releasing ‘Anonymous’ Data
Releasing (not to mention selling) anonymous data, then, has profound implications.
On the one hand, anonymous data is a boon for researchers. Anonymous medical data is enormously valuable to society for long-term pharmacology studies.
On the other hand, in this age of wholesale surveillance where it seems everybody collects data on us all the time, anonymization is much more fragile and riskier than it initially seems.
Researchers are currently working on algorithms that would allow anonymous data to be released securely. Until they succeed, though, remember that when companies like MasterCard and Visa insist that the data they are selling is “anonymous,” it is really anything but.