T.R | Title | User | Personal Name | Date | Lines |
---|
1009.1 | some variants work | NIZIAK::YARBROUGH | | Fri Jan 06 1989 10:26 | 8 |
| If the coins are unbiased you can't get any information out of a
single flip and Head=true, Tail=false; you will tend to get exactly
50% yes-no responses. If each sample is based on truth=TWO heads
in 2 throws, false otherwise, you can begin to get something out.
Alternatively, if only the guilty lie when their coin is tails you
can get a significant result: the number of guilty will tend to be
twice the number of 'YES' responses.
|
1009.2 | Slight misunderstanding, I suspect. | RDVAX::COOPER | Topher Cooper | Fri Jan 06 1989 11:41 | 22 |
| This sounds like a minor misunderstanding of a previously obscure
statistical survey technique which has in the last few years received
a lot of attention because it has been used in surveying AIDS victims
(and non-victims for control purposes).
The technique is as your friend Steve described it but instead of
lying if tails came up, the "subject" is instructed to then always
give the "incriminating" answer (in this case, "Yes, I have smoked
a 'funny' cigarette."). The surveyor cannot tell if any incriminating
answer is true or not, and what's more, unless the incidence of
incriminating behavior is near 100%, it is much more likely that
a specific incriminating response is due to the coin flip than to
sanctioned behavior.
If you survey 100 people and 72 of them give the sanctioned answer,
then your estimate for the population proportion is (72-50)/50 =
44%. The cost is that you have to sample 100 people to get the
same accuracy you would get from 50 people in a straightforward
survey. The benefit is that your 50 "effective people" are likely
to be much more honest.
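The arithmetic above can be sketched as a small function. This is only an illustration, assuming a fair coin so that about half the sample is forced to say "yes"; the function name is made up for the example.

```python
def forced_yes_estimate(yes_count, sample_size):
    """Estimate the true 'yes' proportion under the forced-response design.

    Assumes a fair coin: about half the sample answers truthfully and the
    other half always gives the incriminating answer.
    """
    forced = sample_size / 2          # expected number of coin-forced 'yes' answers
    truthful = sample_size - forced   # expected number answering honestly
    return (yes_count - forced) / truthful

print(forced_yes_estimate(72, 100))  # 0.44
```

The denominator is the number of "effective people", which is why 100 subjects behave roughly like 50.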
Topher
|
1009.3 | I *think* that these are blind herrings | HERON::BUCHANAN | Andrew @vbo/dtn8285805/ARES,HERON | Fri Jan 06 1989 11:51 | 50 |
| > Alternatively, if only the guilty lie when their coin is tails you
> can get a significant result: the number of guilty will tend to be
> twice the number of 'YES' responses.
But then you know that any individual who says 'yes' must be guilty,
contrary to the principle of protecting any *individual's* privacy.
> If the coins are unbiased you can't get any information out of a
> single flip and Head=true, Tail=false; you will tend to get exactly
> 50% yes-no responses. If each sample is based on truth=TWO heads
> in 2 throws, false otherwise, you can begin to get something out.
This is a compromise. It breaks the symmetry which is very
important, but still privacy is not protected here. If someone says
'yes', then what do we conclude?
P(guilty|yes) = P(yes|guilty)*P(guilty) / [ P(yes|guilty)*P(guilty) + P(yes|innocent)*P(innocent) ]
by Bayes' Theorem. Assuming that P(guilty) = p, this expression
equals:
(p/4) / ( p/4 + 3*(1-p)/4 ) = p/(3-2*p)
while P(guilty|no) = 3*p/(1+2*p).
See that (assuming p =/= 0 or 1, which is reasonable):
P(guilty|no) > P(guilty|yes). Which isn't keeping privacy. The guy
has revealed something about his probable history by what he said.
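As a check on the two expressions, here is a minimal sketch that applies Bayes' Theorem directly, assuming each subject tells the truth with probability 1/4 (two heads in two throws) and lies otherwise. The function name is hypothetical, and exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def posterior_guilty(p, truth_prob):
    """Return (P(guilty|yes), P(guilty|no)) when each subject tells the
    truth with probability truth_prob and lies otherwise."""
    q = truth_prob
    given_yes = p * q / (p * q + (1 - p) * (1 - q))
    given_no = p * (1 - q) / (p * (1 - q) + (1 - p) * q)
    return given_yes, given_no

q = Fraction(1, 4)  # truth only on two heads in two throws
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    given_yes, given_no = posterior_guilty(p, q)
    assert given_yes == p / (3 - 2 * p)      # matches p/(3-2*p)
    assert given_no == 3 * p / (1 + 2 * p)   # matches 3*p/(1+2*p)
    assert given_no > given_yes              # a 'no' is the more telling answer
    print(p, given_yes, given_no)
```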
What we want is P(guilty|no) = P(guilty|yes), but to still have some
estimate of p which is consistent (ie. as n increases, our estimate of p
converges on the real p).
So instead of a fixed value, let's say we tell the truth with probability q.
Then we want, for all p in (0,1):
pq / (pq + (1-p)(1-q) ) =
p(1-q) / ( p(1-q) + (1-p)q )
=> q^2 = (1-q)^2 => q = 1/2. Which is exactly what robs us of our consistency.
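For anyone who wants the intermediate steps, the condition can be worked through as follows. This is just the cross-multiplication written out, assuming 0 < p < 1 so that p and 1-p can be cancelled.

```latex
\begin{align*}
\frac{pq}{pq + (1-p)(1-q)} &= \frac{p(1-q)}{p(1-q) + (1-p)q} \\
q\,\bigl[p(1-q) + (1-p)q\bigr] &= (1-q)\,\bigl[pq + (1-p)(1-q)\bigr] \\
pq(1-q) + (1-p)q^2 &= pq(1-q) + (1-p)(1-q)^2 \\
q^2 &= (1-q)^2 \quad\Longrightarrow\quad q = \tfrac{1}{2}
\end{align*}
```

With q = 1/2 the probability of a "yes" is pq + (1-p)(1-q) = 1/2 whatever p is, so the answers carry no information about p.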
So, we need something a little more subtle than *just* tossing a coin.
Any ideas?
|
1009.4 | Where does the apostrophe go in "Bayes Theorem"? | AITG::DERAMO | Daniel V. {AITG,ZFC}:: D'Eramo | Fri Jan 06 1989 11:55 | 31 |
| Let p be the probability that the true answer is yes.
Suppose everyone flips a coin and tells the truth on heads,
and lies on tails. Suppose the probability of heads is q.
Then the probability of a yes answer is:
pq + (1-p)(1-q) = pq + 1 - p - q + pq = 1 - (p + q) + 2pq
The probability of a no answer is:
(1-p)q + p(1-q) = q - pq + p - pq = (p + q) - 2pq
Do these add to one? Yes. :-)
For a fair coin q = 1/2, and the probability of a yes answer
becomes 1 - (p + q) + 2pq = 1/2 - p + p = 1/2. So using
q=1/2 gives no information about p.
Suppose however one uses dice, and say, q = 1/3. Then the
probability of a yes answer is now 1 - (p + q) + 2pq =
2/3 - p + (2/3)p = 2/3 - p/3 or (2 - p)/3. Thus one can now
get some information from the proportion of yes answers.
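Turning that around gives an estimate of p from the observed share of yes answers. The sketch below simply inverts P(yes) = 1 - (p + q) + 2pq; the function name is invented for illustration, and it assumes q is known exactly.

```python
from fractions import Fraction

def estimate_p(yes_rate, q):
    """Solve yes_rate = 1 - (p + q) + 2*p*q for p.

    Rewriting the right-hand side as (1 - q) + p*(2*q - 1) shows that the
    equation cannot be solved when q = 1/2.
    """
    if 2 * q == 1:
        raise ValueError("q = 1/2 carries no information about p")
    return (yes_rate - (1 - q)) / (2 * q - 1)

# With q = 1/3 and a true p of 1/5, the yes rate is (2 - p)/3 = 3/5.
print(estimate_p(Fraction(3, 5), Fraction(1, 3)))  # 1/5
```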
However, I bet that using Bayes Theorem will show that in
either case (i.e., q = 1/2 or q not= 1/2) an individual's
answer does reveal information about the individual (unless
p = 1/2). More later.
Dan
|
1009.5 | oops | AITG::DERAMO | Daniel V. {AITG,ZFC}:: D'Eramo | Fri Jan 06 1989 12:00 | 4 |
| .2 and .3 came in while I was replying; .3 already contains
the "follow up" and the answer to the title of .4.
Dan
|
1009.6 | .2 & .3 are malordered | HERON::BUCHANAN | Andrew @vbo/dtn8285805/ARES,HERON | Fri Jan 06 1989 12:44 | 45 |
| > The technique is as your friend Steve described it but instead of
> lying if tails came up, the "subject" is instructed to then always
> give the "incriminating" answer (in this case, "Yes, I have smoked
> a 'funny' cigarette."). The surveyor cannot tell if any incriminating
> answer is true or not, and what's more, unless the incidence of
> incriminating behavior is near 100%, it is much more likely that
> a specific incriminating response is due to the coin flip than to
> sanctioned behavior.
Yes, this has to be a valuable technique in practice. But still
the guy who says 'yes' *may* be a smoker, whilst the guy who says 'no'
*cannot* be. With the figures you used above, 44 out of 72 are smokers.
If one is a libertarian or paranoid person, one could imagine
that this would enable a government to 'home in' on a particular subset.
The question is: does there exist a technique where we can extract
general information, without any loss of privacy for the individual?
I had an idea...
It's a slightly flippant idea, but it might be that it has a
serious application, in some different domain.
(1) Divide the individuals into two classes, A & B.
(2) Explain the question to those in class A, and ask them
to reply. Each can lie or tell the truth, as they please.
(3) Ask those in class B to toss a coin each. If heads, goto (4)
if tails goto (5).
(4) Ask that person to tell the truth
(5) Ask that person to lie or tell the truth, as they please.
Suppose that x of class A say "Yes" and y of class B. Then how about
2*y-x as an estimate of the total number of smokers. This assumes that
the members of class A and the members of class B would behave the same if
asked to say yes or no, as they please. There may be a little care in
experimental design required to ensure that the members of class A are
in exactly the same state as class B. E.g. get *everyone* to toss a coin,
and open one of two envelopes on that basis (both envelopes contain the
same message for class A) then make a decision.
Is this valid?
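To see where 2*y-x comes from, here is a sketch using expected counts. It rests on assumptions not spelled out above: classes A and B have the same size, and people who answer "as they please" say "Yes" at the same rate in both classes. All the numbers are made up.

```python
def smokers_estimate(x_yes_in_a, y_yes_in_b):
    """2*y - x: estimated number of smokers in a class, assuming A and B
    are the same size and free-choice answers behave alike in both."""
    return 2 * y_yes_in_b - x_yes_in_a

# Hypothetical expected counts, each class having 1000 people.
smokers_per_class = 300    # true smokers in each class
free_yes_per_class = 100   # 'Yes' answers if the whole class answered as it pleased

x = free_yes_per_class                                  # class A: everyone answers freely
y = smokers_per_class // 2 + free_yes_per_class // 2    # class B: half truthful, half free
print(smokers_estimate(x, y))  # 300
```

Doubling y counts the truthful half and the free half as if each were a full class; subtracting x then removes the free-choice part.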
|
1009.7 | Not completely | RDVAX::COOPER | Topher Cooper | Fri Jan 06 1989 15:58 | 66 |
| RE: .6 (Andrew)
> ... does there exist a technique where we can extract general
> information, without any loss of privacy for the individual?
In a word: no. General information on sensitive subjects about
any sampled group can be used to stigmatize members of that
group. If we discover that 80% (to make up a figure) of AIDS
patients engage in socially unacceptable behavior, then we
can conclude that any particular AIDS patient (whether or not
they participated in the survey) probably engages in the
unacceptable behavior. And even if the survey results cannot
be generalized, then it can still be used to stigmatize the
individuals who participated in the survey (if 80% of the
people who participated in the survey beat their spouses, then
the survey can be used to label the people who took part as
spouse-beaters).
However, this does not rule out decreasing or eliminating the
specifically personal risk of someone in the "tell the truth"
group answering honestly with a truthfully "stigmatizable"
response, or, for that matter, the risk to someone in the
"always answer stigmatizable" group being lumped in (statistically)
with the truthfully stigmatizable group.
The method I described can be adjusted quite simply to reduce
the risk to the individual to any desired degree. Simply increase
the relative size of the "always answer stigmatizable" to the
desired level. If the instructions are to answer truthfully only
if two coin flips both come up heads, then a stigmatizable answer
is even less likely to indicate stigmatizable behavior. The
cost is, of course, that larger and larger groups are needed for
the same level of accuracy.
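The size of that cost can be sketched with the usual binomial standard error. This is a rough illustration only: it assumes simple random sampling and that subjects follow the instructions, and neither the formula nor the function names come from the discussion above.

```python
import math

def forced_yes_std_error(p, truth_prob, n):
    """Standard error of the estimate (yes_rate - (1 - t)) / t, where t is
    the probability that a subject is told to answer truthfully."""
    t = truth_prob
    yes_rate = (1 - t) + t * p   # chance that any one subject answers 'yes'
    return math.sqrt(yes_rate * (1 - yes_rate) / n) / t

def direct_std_error(p, n):
    """Standard error of a plain survey in which everyone answers truthfully."""
    return math.sqrt(p * (1 - p) / n)

p, n = 0.44, 100
print(direct_std_error(p, n))             # plain survey
print(forced_yes_std_error(p, 0.5, n))    # truthful on one head
print(forced_yes_std_error(p, 0.25, n))   # truthful only on two heads
```

Dividing by t is what drives the error up as the truthful group shrinks.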
The method assumes that one response is stigmatizable
while the other would always be considered safe. The method will
not work if either response might be stigmatizable. In that
case the population should be divided by the initial coin-toss
(more likely: die roll) into three groups: always answer A, always
answer B and tell the truth. The first two groups would be ideally
equally proportioned, or, more sophisticatedly, proportioned according
to the relative risk of the two answers (the original method is
a specialization of this sophisticated proportioning).
Note that if there is no risk associated with one of the answers
then the 50% proportioning provides no additional protection to
the honestly stigmatizable, but increases the risk of stigmatization
to the honestly non-stigmatizable group.
A variant of this new method would be to use one coin flip to
determine whether someone is in the "random answer" or "honest answer"
groups, then a second coin flip to determine in the former case
what the random answer should be.
Unless I have missed something, this is essentially the method you
have proposed, except the second coin flip is replaced with the
subject's impulse as a randomizer, and group A has been added to
estimate the characteristics of that randomizer. I see no benefit
to this, since it requires a much larger sample (to include group A),
is less reliable (since our estimate of the proportions of each
random answer is subject to sampling variation), and may deviate
from the ideal proportions (to see this note that if all "random
responders" are moved to make the same response then the method is
the same as the original, except we have added group A).
Topher
|
1009.8 | | KOBAL::GILBERT | Ownership Obligates | Sun Jan 08 1989 23:42 | 12 |
| Suppose we spin a roulette wheel, and use the table:
Black -> Answer "Yes"
Red -> Answer "No"
00 -> Tell the truth
Then with a large enough sample, we should get a significant result,
and knowing an individual's answer doesn't give enough information
to stigmatize him.
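A sketch of how the numbers might work out, under assumptions that go beyond the table above: a 38-pocket wheel with 18 black, 18 red and 2 green pockets, with both green pockets treated as "tell the truth". The function names and the value of p are invented for the example.

```python
from fractions import Fraction

def roulette_estimate(yes_rate, prob_yes_slot, prob_truth_slot):
    """Estimate p from P(yes) = prob_yes_slot + prob_truth_slot * p."""
    return (yes_rate - prob_yes_slot) / prob_truth_slot

def posterior_given_yes(p, prob_yes_slot, prob_truth_slot):
    """P(true answer is yes | subject said yes), by Bayes' Theorem."""
    said_yes = prob_yes_slot + prob_truth_slot * p
    return p * (prob_yes_slot + prob_truth_slot) / said_yes

black = Fraction(18, 38)   # forced "Yes" (assumed wheel layout)
truth = Fraction(2, 38)    # assumed: both green pockets mean "tell the truth"
p = Fraction(1, 5)         # hypothetical true proportion

yes_rate = black + truth * p
print(roulette_estimate(yes_rate, black, truth))   # 1/5
print(posterior_given_yes(p, black, truth))        # 5/23
```

Under these assumptions a "Yes" moves the probability for an individual only from 1/5 to 5/23, at the price of a very small truthful group.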
P.S. We could just use secret ballots. :^)
|
1009.9 | I guess they could wear a mask :-) | RDVAX::COOPER | Topher Cooper | Mon Jan 09 1989 15:21 | 18 |
| RE: .8
An excellent example of a device such as I was trying to describe
(I probably should have included a concrete example such as this
to clarify what I was saying. Thanks).
> P.S. We could just use secret ballots. :^)
Despite the smiley face it may be worthwhile mentioning the context
which makes this technique a useful one. Written questionnaires
tend to be biased in response and accuracy with respect to people
who are partially or wholly illiterate in English. A verbal interview
allows the interviewer to assess "interactively" that the interviewee
understands what is being asked and to take corrective action if
not. Complete anonymity then rests on trust of the interviewer
and hence the problem.
Topher
|
1009.10 | there are three privacy concerns here | PULSAR::WALLY | Wally Neilsen-Steinhardt | Wed Jan 18 1989 13:51 | 32 |
| Note that .3 and .7 are raising privacy concerns that the test method,
correctly described in .2, was not intended to address.
The single concern motivating the test method as described was:
suppose that I as a subject give an incriminating answer. Could
this be traced back to me as an individual and used to incriminate
me? The method described in .2 removes this concern, since there
is no proof that the incriminating answer is true.
.3 raises a second concern: that an incriminating answer may raise
the subjective probability that the incriminating answer is true
for the individual. As discussed elsewhere, particularly in .3
and .4, this change to subjective probability can be minimized but
not eliminated. I personally would argue that this second concern
is less significant than the first, since this subjective probability
is never admissible as evidence in a criminal case and seldom in a
civil case. But other people have other standards of privacy.
.7 raises a third concern: that survey results may be used to
stigmatize a particular group. However, the connection between
group characteristics is often what is being sought by the survey.
This is a conflict in goals which no test design can eliminate.
For (a controversial) example: suppose a public health agency wants
to test the hypothesis that promiscuous homosexuals have an increased
risk of having AIDS. The agency says it needs the information to
design a prevention campaign. An advocacy group says it will be used
to inflame public opinion against promiscuous homosexuals. Any
test design which satisfies the agency will be objectionable to
the group, and vice versa. The real issue here is the relative
merits of the two arguments, not the test design. There is no general
answer here, since for most of us, changing the details of the
situation will change the side we favor.
|