Wednesday, February 14, 2018

probability - What is the chance of picking y in a set of x objects, given x chances to pick at random?



Suppose you have a set of x objects. In it is an object y. Suppose you pick at random one object from this set. This set is uniformly distributed. There is a 1 / x chance of picking y. Let's say that you pick x times from the set, then eliminating it from the set. This means there is a 100% chance of picking y. But, let's say that after you picked an object from this set, it is not eliminated. What is the probability of picking y in that case? Would this (Link 1) apply? No, because it presumes you leave the marbles out of the bag, which is not what I mean. That would mean that logically scaling it up, there would be a 100% chance of picking object y, which I know not to be true. Or, would this (Link 2) apply?



To simplify things, let's use an example.
We have a bag with 10 marbles. 9 of them are white, and one is black. You pick one marble at a time at random, then drop it back in. If you do this 3 times, according logically, this would yield a 3/10 chance of picking the black marble. Now, suppose we scale this up to picking 10 marbles. At a quick glance, a 100% chance would be the logical answer, but if you think about it longer, you can pick the same marble more than once. So what is the probability, in this case, of picking the black marble?




Link 1: Probability: best chance of picking a desired marble out of 10



Link 2: Picking a uniformly at random element from a random set


Answer



In your example of 9 white marbles and 1 black marbles, you say that if you pick 1 marble out of those 10 3 times, you have a chance of 30% of picking the black marble (at least) once. That is actually a very common reasoning mistake, and you basically realized this yourself, for when you pick a marble $10$ times, we know it is of course not going to be $100$%, and when you pick a marble $20$ times, it is certainly not $200$%!



So, what is the mistake? Well, as again you say yourself, we can pick the same marble multiple times (indeed, when we pick $20$ marbles, we have to pick at least one marble multiple times). And in fact that goes for the black marble as well. And that has the following effect:



Let's just look at the case where we twice pick 1 marble (again, with replacement). Let $B1$ be the event of picking the black marble for the first pick, and let $B2$ be the event for picking the black marble the second time. Now, we can all agree that $P(B1)=10$%, and that $P(B2)=10$%. OK! But what is the chance of picking the black marble at least once between these two picks? That is, what is the chance of picking the black marble at the first or the second pick?




By the mistaken logic, it would be $P(B1) +P(B2)=20$%. However, that only works if $B1$ and $B2$ are mutually exclusive, i.e. if they cannot both happen. But that is of course not the case: they can both happen. And it is because they can both happen, that the chance of either one of them happening will be less then the sum of them happening by themselves.



Here is the general formula that is always true:



$$P(A \cup B) = P(A) + P(B) -P(A \cap B)$$



In here, $P(A)$ and $P(B)$ are of course the chances of $A$ and $B$ happening as individual events. $P(A \cup B)$ is the chance of either $A$ or $B$ happening (or both), and $P(A \cap B)$ is the chance of $A$ and $B$ both happening.



Now, if $A$ and $B$ are mutually exclusive, then $P(A \cap B)=0$, and hence $P(A \cup B)=P(A) +P(B)$. But if they can both happen, then $P(A \cap B)>0$, and hence $P(A \cup B)


Going back to the black marble case, $P(B1 \cap B2)=\frac{1}{10}\cdot \frac{1}{10}=\frac{1}{100}=1$% Hence, $P(B1 \cup B2)=P(B1) +P(B2)-P(B1 \cap B2)=19$% ... and not 20%!



And of course likewise, picking a marble 3 times will end up giving us a less than 30% chance of picking the black marble at least once, and picking a marble 10 or 20 times will actually end up well below 100%



Here is yet another way to think about it. Suppose you number the marbles 1 through 10, with the black one being number 1. Now, bewen the first two picks, what can be the possible outcomes? Well, you could pick marble 7 for the first pick, and marble 4 for the second. Or: marble 3 for the first, and marble 3 again for the second. We can write these outcomes as $(7,4)$ and $(3,3)$ respectively.



Now, how many possible outcomes can there be for the first two picks? Well, any outcome is $(x,y)$ wih both $x$ and $y$ ranging from 1 to 10, so that gives us 100 possible outcomes. OK! And how many outcomes lead to the black marble being picked the first time? That is of course all the outcomes of the form $(1,y)$, of which there are 10 ... which is why $P(B1)=\frac{10}{100}=10$%. And there are also 10 outcomes of the form $(x,1)$, which is why $P(B2)=\frac{10}{100}=10$% as well. OK! But are there 20 possible outcomes that contain a $1$? NO! Because when you add the 10 and 10 you double-count the $(1,1)$ outcome. So, there are really only 19 outcomes out of the 100 possible outcomes where you have a 1.



OH, and there are only 18 in which you pick the black marble exactly once, so this also tells you that a question like 'what is the chance of picking the black marble when you pick $x$ times?' is ambiguous, because it is not clear whether you mean 'picking the black marble at least once' or 'picking the black marble exactly once.




So finally, what then is the chance of getting at least one black marble when picking 3 times? Well, the easiest way to calculate that is not to consider the outcomes where you get the black marble in the first, second, and third try, and then try to remove the double- or even triple-counts, but rather to consider the possible outcomes where you do not get any black marble at all between these three picks. And how many possibilities are there for that? Well, out of the $10*10*10=1000$ possible putcomes $(x,y,z)$ there are $9*9*9=729$ where you don't get any black marble, and so there are $1000-729=271$ possible outcomes where there is at least 1 black marble, making the chance of getting at least one marble $\frac{271}{1000}=27.1$%



In terms of probabilities, you can do this as well:$ P(B1\cup B2 \cup B3)=1-P((B1 \cup B2 \cup B3)^C)=1-P(B1^C \cap B2^C \cap B3^C)=1-P(B1^C) \cdot P(B2^C) \cdot P(B3^C)= 1- 0.9 \cdot 0.9 \cdot 0.9 =1-0.729=0.271$



OK! And for picking the black marble at least once when picking a marble 10 times? Then it would be $1-(0.9)^{10}\approx 0.65$



And so what is the general formula for the probability of picking one specific object $y$ at least once out of $x$ objects, when making $n$ picks with replacement?



It will be $1-(\frac{1}{x})^n$



No comments:

Post a Comment

analysis - Injection, making bijection

I have injection $f \colon A \rightarrow B$ and I want to get bijection. Can I just resting codomain to $f(A)$? I know that every function i...