Privacy and Identity on the Internet

Jeffrey Rosen, law professor at George Washington University (GWU), has called the current incarnation of the Internet “a digital world that never forgets” in a recent piece on privacy for The New York Times.

It’s an astute art­icle look­ing at the idea of seg­men­ted iden­tit­ies, the search for a way to safely con­trol our online iden­tit­ies, and some inter­est­ing spec­u­la­tion on digit­al repu­ta­tions and their pos­sible import­ance in the future.

Of particular interest to me are two studies Rosen weaves into his story: one on how privacy on the Internet influences our lives, and one on how we can be nudged to become more privacy-aware:

Accord­ing to a recent sur­vey by Microsoft, 75 per­cent of U.S. recruit­ers and human-resource pro­fes­sion­als report that their com­pan­ies require them to do online research about can­did­ates, and many use a range of sites when scru­tin­iz­ing applic­ants — includ­ing search engines, social-net­work­ing sites, photo- and video-shar­ing sites, per­son­al Web sites and blogs, Twit­ter and online-gam­ing sites. Sev­enty per­cent of U.S. recruit­ers report that they have rejec­ted can­did­ates because of inform­a­tion found online, like pho­tos and dis­cus­sion-board con­ver­sa­tions and mem­ber­ship in con­tro­ver­sial groups.


Accord­ing to M. Ryan Calo, who runs the con­sumer-pri­vacy pro­ject at Stan­ford Law School, exper­i­menters study­ing strategies of “vis­cer­al notice” have found that when people nav­ig­ate a Web site in the pres­ence of a human-look­ing online char­ac­ter who seems to be act­ively fol­low­ing the curs­or, they dis­close less per­son­al inform­a­tion than people who browse with no char­ac­ter or one who appears not to be pay­ing atten­tion.

via @finiteattention

Market Segmentation and the PRIZM NE System

Mar­ket seg­ment­a­tion is a meth­od of group­ing people with sim­il­ar char­ac­ter­ist­ics, primar­ily for mar­ket­ing pur­poses.

A num­ber of years ago, USA Today described in detail the inform­a­tion large con­sumer seg­ment­a­tion busi­nesses track and use to group us. It’s an eye-open­ing read:

The [con­sumer seg­ment­a­tion busi­nesses] are pin­point­ing who lives where; what they’re most likely to read, drive and eat; how many kids they have; and where they shop. And they are doing it with unpre­ced­en­ted pre­ci­sion. They are going far bey­ond the char­ac­ter­ist­ics of people in cer­tain ZIP codes to details about people in spe­cif­ic neigh­bor­hoods — even indi­vidu­al house­holds. […]

Most of the inform­a­tion they gath­er is pub­lic: the Census and gov­ern­ment records of births, deaths, mar­riages, divorces, prop­erty deeds, tax rolls and car regis­tra­tions. What’s not pub­lic, people give away. They do it every time they fill out a war­ranty card, answer a sur­vey, buy a car or use their fre­quent shop­per­’s cards at drug­stores and super­mar­kets.

The article notes that five companies offered this service to businesses at the time, and I decided to look further at the service offered by the oldest of them: the 30-year-old Nielsen Claritas PRIZM NE system.

The system is fascinatingly crafted, splitting individual U.S. households into 66 demographically and behaviorally distinct ‘segments’. Each segment contains information on a member’s likely age range, education level, race, homeownership status, employment status (and job type) and typical lifestyle preferences (e.g. likely travel destinations, favourite shops, typical hobbies and likely reading habits). These 66 segments are then grouped into 14 broader social groups by taking into consideration their affluence and location (i.e. urban, suburban, second city, or town and rural).
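As a rough illustration of that two-level structure, here is a minimal Python sketch that buckets segments into broader groups by location and affluence. Everything in it is an assumption made for the example: the `Segment` class, the `group_segments` function, the affluence bands and the segment names are all hypothetical, and none of them are taken from the real PRIZM NE data.

```python
from collections import defaultdict
from dataclasses import dataclass


# All names and values below are made up for illustration; they are not
# real PRIZM NE segments, groups or scores.
@dataclass(frozen=True)
class Segment:
    name: str
    location: str   # "urban", "suburban", "second city" or "town and rural"
    affluence: str  # hypothetical band: "high", "mid" or "low"


def group_segments(segments):
    """Bucket segments into broader groups keyed by (location, affluence)."""
    groups = defaultdict(list)
    for seg in segments:
        groups[(seg.location, seg.affluence)].append(seg.name)
    return dict(groups)


segments = [
    Segment("Segment A", "urban", "high"),
    Segment("Segment B", "urban", "high"),
    Segment("Segment C", "suburban", "mid"),
    Segment("Segment D", "town and rural", "low"),
]

groups = group_segments(segments)
for key, names in sorted(groups.items()):
    print(key, names)
```

The real system presumably uses far richer inputs than two labels per segment; the point is only that the broader groups are a coarser view of the same households.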

These two doc­u­ments I man­aged to find are def­in­itely worth flick­ing through if you’re inter­ested:

Privacy and Tracking with Digital Coupons

Data col­lec­tion and min­ing can be quite luc­rat­ive pur­suits for many retail­ers, and tech­no­lo­gic­al advances are provid­ing them with more nov­el and extens­ive meth­ods of doing just that.

Data min­ing is a top­ic I’ve been fas­cin­ated with ever since I was intro­duced to it in uni­ver­sity, and this look at how digit­al coupons track us and provide retail­ers with detailed data is a worthy addi­tion to my vir­tu­al col­lec­tion:

Inven­ted over a cen­tury ago as anonym­ous pieces of paper that could be traded for dis­counts, coupons have evolved into track­ing devices for com­pan­ies that want to learn more about the habits of their cus­tom­ers. […]

Many of today’s digit­al ver­sions use spe­cial bar codes that are packed with inform­a­tion about the life of the coupon: the dates and times it was obtained, viewed and, ulti­mately, redeemed; the store where it was used; per­haps even the search terms typed to find it.

A grow­ing num­ber of retail­ers are mar­ry­ing this data with inform­a­tion dis­covered online and off, such as guesses about your age, sex and income, your buy­ing his­tory, what Web sites you’ve vis­ited, and your cur­rent loc­a­tion or geo­graph­ic routine – cre­at­ing pro­files of cus­tom­ers that are more detailed than ever, accord­ing to mar­ket­ing com­pan­ies. […]

Many com­pan­ies have the tech­no­logy – and cus­tom­ers’ per­mis­sion, thanks to the pri­vacy policies that users accept routinely without read­ing – to track minute details of people’s move­ments.

I’m mostly fine with this sort of tracking as it is typically done on a large, impersonal level: complex algorithms determine when to send which vouchers to whom, all without direct human intervention. The piece ends with a thought that is somewhat close to my opinion on this particular privacy debate: “I would be concerned […] if they get very granular and are tracking me specifically.”
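To make the “packed with information” idea concrete, here is a small Python sketch of how the fields listed in the article (when a coupon was obtained, viewed and redeemed, the store, the search terms) might travel inside a single encoded string. This is an assumption-laden toy: the `CouponTrail` record, its field names and the JSON-plus-Base64 encoding are hypothetical choices for the example, not the format any real coupon provider uses.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import Optional


# Hypothetical record of a coupon's "life"; field names are illustrative only.
@dataclass
class CouponTrail:
    coupon_id: str
    obtained_at: str
    viewed_at: Optional[str] = None
    redeemed_at: Optional[str] = None
    store_id: Optional[str] = None
    search_terms: Optional[str] = None


def encode_payload(trail: CouponTrail) -> str:
    """Serialise the record to JSON and wrap it in URL-safe Base64."""
    raw = json.dumps(asdict(trail), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_payload(payload: str) -> CouponTrail:
    """Reverse encode_payload: Base64 -> JSON -> CouponTrail."""
    raw = base64.urlsafe_b64decode(payload.encode("ascii"))
    return CouponTrail(**json.loads(raw))


trail = CouponTrail(
    coupon_id="demo-001",
    obtained_at="2010-04-01T09:15:00",
    viewed_at="2010-04-01T09:16:30",
    redeemed_at="2010-04-03T17:42:00",
    store_id="store-42",
    search_terms="discount running shoes",
)
payload = encode_payload(trail)
restored = decode_payload(payload)
print(restored == trail)
```

Anyone holding the encoded string can recover every field, which is why a coupon like this works as a tracking record rather than an anonymous slip of paper.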

via @Foomandoonian

The CCTV Trade-Off

That CCTV doesn’t substantially help in reducing crime has been shown beyond reasonable doubt, argues Bruce Schneier, so the pressing question now is whether the benefits security cameras do afford are worthwhile.

There are excep­tions, of course, and pro­ponents of cam­er­as can always cherry-pick examples to bol­ster their argu­ment. These suc­cess stor­ies are what con­vince us; our brains are wired to respond more strongly to anec­dotes than to data. But the data are clear: CCTV cam­er­as have min­im­al value in the fight against crime. […]

The import­ant ques­tion isn’t wheth­er cam­er­as solve past crime or deter future crime; it’s wheth­er they’re a good use of resources. They’re expens­ive, both in money and in their Orwellian effects on pri­vacy and civil liber­ties. Their inev­it­able mis­use is anoth­er cost. […] Though we might be will­ing to accept these down­sides for a real increase in secur­ity, cam­er­as don’t provide that.

In August 2009 Schneier discussed a report showing that only one crime per thousand cameras per year is solved because of CCTV, and quoted David Davis MP as saying that “CCTV leads to massive expense and minimum effectiveness. It creates a huge intrusion on privacy, yet provides little or no improvement in security.”

A Home Office study also con­cluded that cam­er­as had done “vir­tu­ally noth­ing” to cut crime (although they were effect­ive in pre­vent­ing vehicle crimes in car parks), but do “help com­munit­ies feel safer” (a case of clas­sic secur­ity theatre).

Identification through Anonymous Social Networking Data

Anonymity is “not sufficient for privacy when dealing with social networks”: that is the conclusion of a study that de-anonymised large amounts of sanitised data from Twitter and Flickr.

The main les­son of this paper is that anonym­ity is not suf­fi­cient for pri­vacy when deal­ing with social net­works. […] Our exper­i­ments under­es­tim­ate the extent of the pri­vacy risks of anonym­ized social net­works. The over­lap between Twit­ter and Flickr mem­ber­ship at the time of our data col­lec­tion was rel­at­ively small. […] As social net­works grow lar­ger and include a great­er frac­tion of the pop­u­la­tion along with their rela­tion­ships, the over­lap increases. There­fore, we expect that our algorithm can achieve an even great­er re-iden­ti­fic­a­tion rate on lar­ger net­works.

There’s been some good coverage of this study. This from BBC News:

The pair found that one third of those who are on both Flickr and Twit­ter can be iden­ti­fied from the com­pletely anonym­ous Twit­ter graph. This is des­pite the fact that the over­lap of mem­bers between the two ser­vices is thought to be about 15%.

This from Ars Tech­nica:

It’s not just about Twit­ter, either. Twit­ter was a proof of concept, but the idea extends to any sort of social net­work: phone call records, health­care records, aca­dem­ic soci­olo­gic­al data­sets, etc.
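To get a feel for why the shape of a network alone can give people away, here is a deliberately simplified Python sketch. It is not the algorithm from the paper: it is a toy that assumes a handful of already-identified “seed” accounts and then matches each remaining anonymous node to the named node sharing the most already-matched neighbours. The `propagate` function, the `min_score` threshold and the tiny example graphs are all made up for illustration.

```python
def propagate(anon, known, seeds, min_score=2):
    """Toy re-identification: grow a mapping from anonymous ids to names.

    anon and known map each node to the set of its neighbours; seeds is a
    starting {anonymous id: name} mapping. A node is matched only when one
    candidate shares at least min_score matched neighbours and strictly
    beats every other candidate.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        for node in sorted(anon):
            if node in mapping:
                continue
            # Neighbours we have already identified, translated to names.
            translated = {mapping[n] for n in anon[node] if n in mapping}
            if len(translated) < min_score:
                continue
            used = set(mapping.values())
            scores = {
                name: len(translated & known[name])
                for name in known
                if name not in used
            }
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            if best_score >= min_score and best_score > runner_up:
                mapping[node] = best
                changed = True
    return mapping


# A named graph and an "anonymised" copy with the same structure.
known = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}
anon = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
seeds = {1: "alice", 2: "bob"}

cautious = propagate(anon, known, seeds, min_score=2)
eager = propagate(anon, known, seeds, min_score=1)
print(cautious)
print(eager)
```

Starting from two seeds, the cautious setting recovers two more identities and leaves the weakly connected node unmatched; lowering the threshold matches it as well. Real networks are noisy and vastly larger, so this says nothing about actual success rates, only about the mechanism.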

via Schnei­er