Wednesday, April 18, 2018

Not for Identification

I have my original Social Security card.  I applied for it and got it when I was in high school and was about to get my first real job.  Printed at the bottom is "FOR SOCIAL SECURITY AND TAX PURPOSES -- NOT FOR IDENTIFICATION".  This "Not for Identification" business is often misconstrued.  It is usually taken to mean that your Social Security number is not some kind of ID or identification number.  That's not what is meant, because the very same card also says "For Social Security and Tax Purposes".  What is actually meant is that the card itself is not to be used for identification, because the card was never designed to provide positive identification.

And that is a nice bridge.  This post is an update to a previous post about Positive Identification (see http://sigma5.blogspot.com/2016/04/positive-identification.html).  If you reread it you will find it links to still earlier posts.  I have been weighing in on privacy issues for some time now.  Although the overall trend continues, the details keep changing.  And there have been several important changes since my last post over two years ago.

In the post linked above I spent some time on the fingerprint identification system Apple had implemented.  You placed your finger on the correct spot on the phone and it would read your fingerprint.  If it recognized the print, the phone would unlock.  The phone positively identified you by analyzing your fingerprint.

Apple has since moved on to facial recognition.  Smartphones have had cameras in them for some time.  Even my old-fashioned flip-phone now comes equipped with one.  My phone doesn't have enough computing power to do facial recognition, but newer iPhones do.  They take a picture and supplement it by measuring other characteristics of your face.  If it's a match, you have been positively identified and your phone unlocks.

It is important to recognize there are limitations.  First of all, what the phone is doing is matching your current face to the one that was enrolled during the phone's "setup" procedure.  So the phone knows that, relatively speaking, you are you.  But absolutely speaking it doesn't know who you are.  Also, it is possible to fool the recognition system.  This was true of the older fingerprint system and it continues to be true of the facial recognition system.  But everybody expects that the process will be updated and enhanced over time so that fooling it becomes harder and harder.  Even now, it takes deliberate effort, a lot of skill, and a considerable amount of knowledge to fool either system.

But the identification is only relative.  The phone recognizes you as the owner because it has been told that you are the owner.  But it doesn't know who you are.  And the same is true more generally.  The analysis I did pointing out that there is really no way the current system can absolutely tie a specific person to a specific birth certificate is still true.  And using CODIS or some other DNA based system to create an absolute connection continues to get easier and easier from a technical perspective.

CODIS added 7 additional STRs in 2017, so it now uses 20.  But nothing has moved from a political perspective.  There are still tight restrictions on what gets put into a CODIS database.  There has been no move to add newborns to CODIS, for instance.  And if smartphone makers are thinking about using DNA for identification, I don't know about it.
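
To make the STR idea concrete, here is a minimal sketch of how two profiles get compared.  The locus names below are real CODIS core loci, but the allele values are made up for illustration; a real profile records allele pairs at all 20 core loci plus quality information this toy version ignores.

# Toy sketch of STR profile comparison.  Locus names are real CODIS core
# loci; the allele values are invented for illustration only.

def profiles_match(profile_a, profile_b):
    """True if the two profiles agree at every locus they both contain."""
    shared = set(profile_a) & set(profile_b)
    return bool(shared) and all(
        sorted(profile_a[locus]) == sorted(profile_b[locus]) for locus in shared
    )

crime_scene_sample = {"D3S1358": (15, 17), "vWA": (14, 16), "FGA": (21, 24)}
reference_sample   = {"D3S1358": (15, 17), "vWA": (14, 16), "FGA": (21, 24)}

print(profiles_match(crime_scene_sample, reference_sample))  # True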

But there have been big developments on the identification front.  These developments are with respect to relative identification.  But they are so pervasive and extensive that they have rendered the difference between relative and absolute identification moot.

We have known for a long time that tech companies were collecting a lot of data about us.  Google famously saves every search ever made.  Initially this was supposed to be so they could analyze it and optimize their algorithms to give you a more useful answer.  But it soon became apparent that they were not just using it as some sort of anonymous pile of data that helped in search optimization.  They were using it to identify each and every one of us.  They would then develop a profile of each of us which they would sell to advertisers.  The idea was this would allow advertisers to narrowly target their marketing to just the people most likely to be interested in the product.

As an example of how this worked, I once searched Amazon for shredders.  I didn't need one, but my mother, who did not have a computer, did.  I soon started to notice that wherever I went on the web, an ad for an Amazon shredder would soon follow.  This behavior persisted even after I went back and bought a shredder from Amazon for my mother.  That was modestly entertaining, and little or no harm was done to me or anybody else.  So this kind of behavior didn't seem all that bad, either to me or to pretty much anybody else.  And that's the kind of mental model people had about what was going on.

Okay, so the NSA was sweeping up all this data.  That was the government and they shouldn't be doing that sort of thing.  At least so went the argument made by me and many others.  (There were, of course, lots of people who were okay with this and other intrusive behavior by the government.)  But my point is that if we put this sort of thing on a scale, the NSA was generally considered closer to the "bad" end than companies like Google.  And a large number of people were of the opinion that it was all fine.

Then the 2016 election happened.  And over time we have learned a lot more about what tech companies in general, and Facebook in particular, have been up to.  And more and more people have become very angry.  There is something in the business called an EULA, an End User License Agreement.  We have all had to deal with them.  They are long documents full of impenetrable legalese that even expert lawyers often can't make sense of.  In the backs of our minds we are all pretty sure there is stuff in them we would not like if we understood what it was and what it meant.  But you can't get around EULAs.  Everybody uses them, so you can't just go to the next company.

And they are all bad to one extent or another, so it is impossible from a practical point of view to go with the company that has the least onerous EULA.  They are what is called a "coercive contract".  At least one party (us) is effectively powerless in the negotiation.  So we don't read them.  We just click the "Agree" button and move on.  We have all made a deal with the devil.  If we are going to have access to these compelling tools and gadgets, we are going to have to put up with a certain amount of stuff we would rather not.  But if we sign up for Facebook, for instance, we expect the bad behavior to be confined to the relationship between us and Facebook.  And we did, after all, "Agree" to Facebook's EULA.

But we have found out that it is far worse than we thought.  We expect Facebook to use what it has learned about us to try to get us to sign up for more Facebook stuff.  And we expect Facebook to sell profile information to advertisers so that Amazon can pester me with ads for shredders.  But we expect it to stop there.  It turns out it didn't.

Facebook has a program that allows companies to build and run applications within the Facebook environment.  Those applications can harvest information.  And the information is not limited to what we tell the application.  A popular type of application is a cute quiz.  "How much do you really know about Star Wars?", or about pop stars, or fashion, or cars, or whatever.  Certainly these quizzes can be constructed to collect information that advertisers would find valuable and, therefore, pay money for.  That sort of thing seems fair.  But these cute applications (they are designed to be cute so that they will be popular so that lots of people will install them) are not limited to harvesting the data you provide while answering the quiz.  It turns out that they get access to all the information Facebook has on you.

That's bad.  I'm pretty sure it's legal, because they would be idiots not to put the necessary language into their EULA.  But this "they get all the data Facebook has on you" degree of badness is just the first and least bad level.  It turns out they also get access to what Facebook knows about your friends.  That's the second level of badness.  And this behavior too is probably legal, because there is probably some language in the Facebook EULA saying it is allowed.  But it turns out there is a third level of badness.

Remember the bit about how I was getting those Amazon shredder ads everywhere?  I did my search not in Google or Facebook but on Amazon's web site.  So only Amazon knew I did the search.  What's going on is that advertisers and the companies they do business with, like Amazon and Facebook, share data in networks.  Amazon shared the information that I had searched for shredders with its network partners, and they placed "Amazon shredder" ads on web sites that I later visited.
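
Here is a minimal sketch of how that kind of sharing can work, assuming the retailer and the ad network both see a shared tracking identifier (a hypothetical third-party cookie value).  Everything named below is made up for illustration; it is the general idea, not any particular company's actual system.

# Sketch of cross-site ad retargeting built on a shared tracking ID.
# The cookie value, function names, and ad copy are all hypothetical.

interest_db = {}  # the ad network's store: tracking_id -> set of observed interests

def retailer_reports_search(tracking_id, query):
    """The retailer tells its network partners what this browser searched for."""
    interest_db.setdefault(tracking_id, set()).add(query)

def pick_ad_for(tracking_id):
    """Any partner site the same browser later visits asks the network for an ad."""
    if "shredder" in interest_db.get(tracking_id, set()):
        return "Amazon shredder ad"
    return "generic ad"

retailer_reports_search("cookie-1234", "shredder")  # the search on the retail site
print(pick_ad_for("cookie-1234"))                   # the ad that follows you around the web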

It turns out that Facebook does the same thing.  They are part of these information sharing networks so they have access to what's happening on sites that are far away from anything Facebook owns or operates.  So Facebook has a profile on people like me who have NEVER had a Facebook account.  And people like me have never signed an EULA with Facebook or any of the application providers Facebook hosts on their platform.

We have slowly found this out as revelations have trickled out from people looking at how the 2016 election actually played out.  Facebook has a "commercial" interface so that people who want to make a buck can build and run an application on the Facebook platform.  But it also has an "educational" interface so that people doing research can have access to the Facebook platform and Facebook data.  This latter interface is given wider latitude due to its presumably non-commercial and beneficial intent.

A Cambridge University don (a professor; Cambridge is in the United Kingdom) took advantage of this access and, as we now know, allowed a company called Cambridge Analytica to harvest vast amounts of data about Americans from Facebook.  First it was data on 50 million people.  Then it was data on 87 million people.  The actual number is unknown and probably never will be known.  What we do know is that they sucked a vast amount of data out of Facebook, and that we don't know where it all ended up.  Facebook at one point asked for it all back.  Fat chance.

And that's just Cambridge Analytica.  There is certainly no technical reason dozens or hundreds of others could not have done the same thing.  We know that Cambridge Analytica was able to harvest data on users who signed up for one of the applications they put together.  Those users all agreed to Cambridge's EULA.  But we also know that this group numbers fewer than a million.  We get to 50 and later 87 million because they were able to collect data on "friends", then "friends of friends", and so on.  All these people at least agreed to the Facebook EULA.  But were they also able to collect data on people like me, people who have never signed up for Facebook?  The answer is unclear.

So it turns out that Facebook knows a lot about each of its users.  The NSA would probably like to know as much about people as Facebook knows.  So Facebook can positively identify its users.  The positive identification is relative.  They can't tie a specific user to a specific birth certificate.  But they know so much about that person that it doesn't matter.  They can more positively identify a person than a bureaucrat at the bureau that issues driver's licenses, or voter registration cards, or passports.  They can do a better job of positively identifying a person than the government can.

And Facebook has taken all the heat.  But the same is true of Google.  Remember, they have all that search history (and lots more).  It is probably also true, to a lesser extent, of Apple and Microsoft and a number of other companies.  (So far the spotlight has shone brightly on Facebook and left the others in the shadows.)  We now have a completely new method of personal identification, one that I did not imagine as recently as two years ago.  You can now be positively identified by your online profile.

Most of us now live on our smartphones.  (Again, I am an exception.)  Back in the stone age of personal computers Intel was going to put a serial number in its Pentium III chip.  The privacy advocates of the day talked Intel out of it and treated this as a big victory for privacy, for the ability to use computers and remain anonymous.  But while they won this tiny skirmish they lost the war.  There are hundreds of numbers on a smartphone that can easily be accessed and that uniquely identify the specific device.

Microsoft even pioneered a process for creating a GUID, a Globally Unique IDentifier.  The process was designed so that it would essentially never generate the same number twice.  Microsoft uses GUIDs all over the place in its software.  If you can get access to any of these GUIDs, and it is easy to do, you can uniquely identify a device: a PC (I do have and use those), or a tablet or smartphone running a Microsoft application.  And creating GUIDs is not that hard to do, so you can't avoid the problem by avoiding Microsoft.
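
Microsoft's GUIDs follow what became the standard UUID format, and Python's standard library can generate the same kind of identifier.  A "version 1" identifier shows why these numbers are so revealing: it is built from the machine's MAC address plus a timestamp, so the number itself points back at a specific device.  This is only a quick illustration, not a claim about how any particular vendor tracks anyone.

# Generating GUIDs/UUIDs with Python's standard library.
import uuid

# Version 1: derived from the machine's MAC address and a timestamp,
# so the identifier itself is tied to a specific device.
device_tied_id = uuid.uuid1()

# Version 4: purely random, so it reveals nothing by itself -- but once a
# company stores it next to a profile it still works as a serial number.
random_id = uuid.uuid4()

print(device_tied_id)
print(random_id)
print(hex(uuid.getnode()))  # the 48-bit MAC address that uuid1() embeds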

You might just as well try to have a presence on the Internet without ever "Agree"ing to an EULA.  Other vendors have figured out how to generate their equivalent of a GUID.  Then there are all those numbers that behave like serial numbers.  Every network interface has a MAC address.  It is effectively a serial number.  Lots of software uses a "license key" or an "activation key".  Both are effectively serial numbers.  IP addresses often behave like serial numbers.  The list goes on and on.

Companies like Facebook can harvest GUIDs and MAC addresses and license/activation keys, tie a specific profile to a specific device or small list of devices, and build up a positive identification.  And they can do it, and have done it, better than the government.
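
Here is a minimal sketch of that combining step, with every identifier value below invented for illustration: hash whatever stable numbers can be harvested into one key and hang a profile off of it.

# Sketch of device fingerprinting: hash several stable identifiers into one
# key and attach a profile to it.  All of the values here are made up.
import hashlib

def device_fingerprint(identifiers):
    """Collapse a collection of identifier strings into one stable device key."""
    blob = "|".join(sorted(identifiers)).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

harvested = {
    "mac": "00:16:d3:cc:a4:27",                      # network interface serial number
    "guid": "1b4e28ba-2fa1-11d2-883f-0016d3cca427",  # an application GUID
    "license": "XXXXX-XXXXX-XXXXX-XXXXX",            # a product activation key
}

profiles = {}  # fingerprint -> everything learned about the device's user
key = device_fingerprint(harvested.values())
profiles.setdefault(key, []).append("searched for shredders")
print(key[:16], profiles[key])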

I am pro-privacy.  But I am also realistic.  I have argued in a number of posts that when it comes to privacy the horse left the barn long ago and that there is no effective way to get the horse back in the barn.  And even if you did, the horse would likely escape again no matter how much effort you put into horse-proofing the barn.

I think we need to accept the fact that privacy is no longer possible.  That means we need laws, regulations, and social norms to constrain how we and our institutions and businesses behave in a world where technology permits the powerful to peer pretty much anywhere they want.  There is technology like encryption that can close some of the doors that let in the powerful, or the merely technologically sophisticated.  This sort of thing is helpful and should be encouraged.  But it doesn't protect us from those who already have access to the inside, like Facebook and whoever they license or enable.

This means that we must outlaw behavior that is technologically possible and often easy to do.  We must also demand a high level of tolerance when it comes to what people are permitted to do.  No matter who you are, there are some behaviors other people engage in that you don't like, in some cases don't like at all.  As a society we must find ways to constrain the actions you are allowed to take in response to the behaviors you dislike.  If you violate those constraints you need to be punished.

This at first seems like a new and unnatural way for people to behave.  But a thousand or so years ago we all lived in small villages.  It was then easy to look into a doorway, as doorways frequently had no doors.  But social mores constrained people from looking into doorways or from acknowledging that certain behaviors were taking place in some public place.  This was all enforced by shunning and other social sanctions.  Society is now a global enterprise encompassing billions of people, millions of companies, and thousands of governments.  Social norms alone are not going to work for us now.

At first blush it might sound like I am advancing a Libertarian agenda.  And that is half right.  Libertarians believe that legal prohibitions on behavior should be kept to a minimum.  That's the part where my position coincides with the libertarian point of view.  Where I differ is that there also need to be legal prohibitions that outlaw violating the new norms.  The government must step in, sometimes in a heavy-handed way, to stop people, organizations, and institutions from doing things they want to do, namely going after people they disagree with and prohibiting behavior they don't like.

Facebook is a rich and powerful corporation with many fans.  It will take a very powerful institution to be able to force them to change their behavior.  They won't do it on their own.  The only institution capable of doing that is a large powerful government doing intrusive things.  And that kind of government is one Libertarians vehemently oppose.

We are a very divided society right now.  And at its most fundamental level what divides us is our vision of how things are and how things should be.  Until we come to a common vision the sort of things I am talking about are impossible.  Even if a common vision were possible the issues I am talking about will be very hard to resolve.  The most likely result of this current division is gridlock with no progress in any direction.

Maybe that is for the best.  It gives all of us time to think about these issues and decide what we think about them and where we stand.  But if recent events tell us anything, they tell us that instead of thinking about these hard problems we will chase the next shiny object and the one after that.  Then we will wonder how we got into another fine mess.
