If I open and close 100 doors randomly, how many doors will I need to open before I can be pretty sure I've opened all of them at least once?

Why not try simulating it? Here's some R code that will do that.

numDoors <- 100
totalDoors <- NULL

for (i in 1:20000) {
  numDoorsOpened <- numDoors # start at minimum possible visits
  doorsOpened <- sample(1:numDoors, numDoors, replace=TRUE)
  done <- FALSE

  while (!done) {
    # randomly selects one door to open and close
    openDoorNumber <- sample(1:numDoors, 1)
    # add selected door number to an array
    doorsOpened <- c(doorsOpened, openDoorNumber)
    numDoorsOpened <- numDoorsOpened + 1

    # checks to see that all 100 doors have been selected
    if (!(FALSE %in% (1:numDoors %in% doorsOpened))) {
      done <- TRUE
    }
  }

  totalDoors <- c(totalDoors, numDoorsOpened)
}

summary(totalDoors)

 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  236.0   429.0   496.0   517.6   582.0  1566.0
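If the simulation feels slow, here is a minimal sketch of the same idea that tracks which doors have been seen in a logical vector instead of growing an array on every draw. The helper name simulateVisits, the seed, and the reduced run count of 2,000 are my own assumptions for illustration, not part of the code above.

```r
# Hypothetical helper: number of random door visits needed to see every door.
simulateVisits <- function(numDoors) {
  seen <- logical(numDoors)          # seen[d] becomes TRUE once door d is opened
  visits <- 0
  while (!all(seen)) {
    door <- sample.int(numDoors, 1)  # pick one door uniformly at random
    seen[door] <- TRUE
    visits <- visits + 1
  }
  visits
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
totalVisits <- replicate(2000, simulateVisits(100))

summary(totalVisits)
mean(totalVisits <= 1000)  # share of runs that finished within 1000 visits
```

Unlike the loop above, this starts counting from zero visits rather than from 100, but since covering all 100 doors in the first 100 draws is astronomically unlikely, the summary should look much the same.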

On average, it will take you around 520 tries before you've visited each door at least once. This is close to the derived solution from the Coupon Collector's problem, which is E[X] = 100 × H(100) = 100 × 5.187 = 518.7, where H(100) is the 100th harmonic number. The median is pretty similar at around 500 visits, and the largest number of tries it took out of the 20,000 runs I simulated was 1,566. I'd shoot for taking 1,000 samples (door openings). At 1,000 samples, 99.53% of runs will have visited all 100 doors at least once, which is probably good enough from what I can tell from your comments.
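You can also check both numbers without simulating. The sketch below computes the Coupon Collector expectation directly and uses the standard inclusion-exclusion formula for the chance that every door has been seen after n visits. The names expectedVisits and probAllSeen are hypothetical, and the alternating sum is only numerically trustworthy when n is large relative to the number of doors (it is fine at n = 1000, but not near n = 100).

```r
numDoors <- 100

# Expected number of visits: N * H(N), with H(N) the N-th harmonic number.
harmonic <- sum(1 / (1:numDoors))
expectedVisits <- numDoors * harmonic

# Probability that all N doors have been seen after n visits (inclusion-exclusion):
# P(T <= n) = sum over k = 0..N of (-1)^k * choose(N, k) * (1 - k/N)^n
# Assumption: n is large enough that the alternating terms shrink quickly.
probAllSeen <- function(n, N = numDoors) {
  k <- 0:N
  sum((-1)^k * choose(N, k) * (1 - k / N)^n)
}

expectedVisits     # about 518.7
probAllSeen(1000)  # chance of having seen every door within 1000 visits
```

Under these assumptions the probability for 1,000 visits should come out at roughly 99.6%, which is in line with the simulated figure given the sampling noise in 20,000 runs.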

/r/AskStatistics Thread