Saturday, May 13, 2017

probability - Estimate the number of elements by random sampling with replacement



Setup:





  1. N numbered balls are in a bag. N is unknown.

  2. Pick a ball uniformly at random, record its number, replace it, shuffle.

  3. After M samples, of which we noticed R repeated numbers, how can we estimate the value of N?



Clearly, after we've covered the set, only repeats will appear. However, there is a vanishingly small probability we've just missed one.



Simplification:



Assume we know $N


Possibly related:





Application:



Estimating the number of webcomics when I'm pressing "random," assuming random is uniform over the comics.


Answer



Chance of getting a repeat = (M-R)/N which we are going to call a "failure". Call this probability (M-R)/N = (1-P). Now the number of successful draws we have had (without getting a repeat) is M-R. Call this K. Now we just have a negative binomial distribution with R failures occurring.




The mean of the neg. binomial distribution is the number of successes before the Rth failure. In this case M-R. So M-R = pR/(1-p) with p being defined above.
 MR=(1(MR)/N))R/(MR)/N


 =(R(MR)R/N)(N/(MR))=NR/(MR)R=MR

 =M=NR/(MR)=M2RM=NR>N=(M2/R)M


No comments:

Post a Comment

analysis - Injection, making bijection

I have injection f:AB and I want to get bijection. Can I just resting codomain to f(A)? I know that every function i...