P02: Sampling; Monty Hall Problem

Monty Hall

sample

function

script

Bayes Theorem

Author

Thomas Manke

The Game

A showmaster presents three closed doors (1, 2, 3) to a candidate. Behind exactly one randomly chosen door is a prize (e.g. a car). The candidate is asked to pick one of the doors. Afterwards, the showmaster opens one of the other two doors, always one that does not hide the prize. Two doors are now still closed, and the candidate can either stick to the original choice or switch to the other closed door.

Query: which strategy is more likely to win the prize?

Simulation in R

Define the three doors, only one of which hides a prize: the winning door “W”. The other two doors are losers (“L1” and “L2”).

Code

doors=c("W","L1","L2")
doors

[1] "W"  "L1" "L2"

Sampling

First assume that the doors are shuffled randomly

Code

doors=sample(doors) # Why does this amount to shuffling?
doors

[1] "W"  "L1" "L2"
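One way to answer the question in the comment: called without a size argument, sample(x) draws length(x) elements without replacement, which is a random permutation of x. The following is a small sketch to check this; the fixed seed and the name shuffled are illustrative assumptions, not part of the original session.

```r
set.seed(1)                        # assumption: fixed seed, only for reproducibility
doors = c("W", "L1", "L2")
shuffled = sample(doors)           # no size given: all elements, without replacement
length(shuffled) == length(doors)  # TRUE: same number of doors
setequal(shuffled, doors)          # TRUE: same doors, possibly in a different order
anyDuplicated(shuffled) == 0       # TRUE: no door appears twice
```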

Now the candidate picks a random door “pick” (hoping that it might be the winning door)

Code

pick=sample(doors,1) # Notice the difference between sample(doors) and sample(doors,1)
pick

[1] "L2"
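In contrast to sample(doors), the call sample(doors,1) returns a single door, and each door should be picked about one third of the time. A minimal sketch to check this empirically; the seed, the number of repetitions and the names picks and freq are assumptions made for illustration.

```r
set.seed(2)                                  # assumption: fixed seed for reproducibility
doors = c("W", "L1", "L2")
picks = replicate(10000, sample(doors, 1))   # 10000 independent single picks
freq  = table(picks) / length(picks)         # relative frequency of each door
freq                                         # each value should be close to 1/3
```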

A Function

Only the show master knows the winning door. Depending on the picked door, the show master will reveal one door that neither hides the prize nor is the picked door. This is a more complex task than simple sampling, so let’s write a function that depends on “pick” and the contents of the doors.

Code

open_door = function(pick, doors) {
  # the show master's choices are limited: do not reveal "W" and do not reveal "pick"
  choices = doors[ doors != "W" & doors != pick ]
  return (sample(choices,1))
}

Now let’s use the function to open a door

Code

open = open_door(pick,doors)
open

[1] "L1"

Notice that the showmaster does not really have a choice when the candidate has picked a losing door (2/3 of the time): only one door is then neither picked nor the winner.
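This can be made explicit by listing the doors the show master is allowed to open for each possible pick. The helper allowed() below is hypothetical; it only repeats the selection step from inside open_door, without the final sampling.

```r
doors = c("W", "L1", "L2")
# hypothetical helper: the doors the show master is allowed to open
allowed = function(pick, doors) doors[doors != "W" & doors != pick]
allowed("W",  doors)   # "L1" "L2": two options, one of them is chosen at random
allowed("L1", doors)   # "L2": forced
allowed("L2", doors)   # "L1": forced
```

Note that sample(choices,1) is safe here because choices is a character vector; for a numeric vector of length one, sample() would instead draw from 1 up to that number.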

Now the candidate has the choice to switch to the other remaining door (not open, not previously picked).

To repeat the question: Will the switching strategy be more successful?

Code

switch=doors[ doors != pick & doors != open]  
paste(switch)

[1] "W"

With three doors, the candidate has only two choices: to stay with “pick” or to switch to “switch”.

The above results from sampling will vary (for each candidate and each participant of the R course). Now let’s play this game N times. The goal is to count the number of successes for the switching strategy.

An R-script

We want to execute the commands above repeatedly, and the console is not well suited for this. We therefore leave the console for a moment and first collect the individual commands in a file (an R script). Thankfully, RStudio has an editor for this purpose.

Task: Open (a new and empty) Rscript file. Include the definition of the function “open_door” and save the lines below in a file called MontyHall.R.

Tip: Almost all command lines are already in your history, where you have tested that they work properly. In the history panel, you can select those lines and send them “To Source”, i.e. to the newly opened file. This avoids redundant typing and errors. All you need to do is wrap the commands in the for loop.

Code

#Comment Line: A small Rscript to simulate the Monty Hall Problem
doors=c("W","L1","L2")
N=10000
success=0
for (i in 1:N){
  doors = sample(doors)                           # shuffle doors
  pick  = sample(doors,1)                         # candidate picks one door at random
  open  = open_door(pick,doors)                   # show master opens another door (neither pick nor "W")
  switch= doors[ doors != pick & doors != open]   # candidate has choice to switch
  if (switch=="W") { success=success+1}           # count if switching strategy is successful (= "win")
}

cat("successes with switching= ",success, "success_rate: ", success/N, "\n")

Congratulations! You have just written your first “R script”. Save it as MontyHall.R, return to the console, and execute the script using the source command:

Code

source("MontyHall.R")

successes with switching=  6644 success_rate:  0.6644 
successes with switching=  6653 success_rate:  0.6653

You can share the script easily with all people who speak R – especially those who don’t like to switch.
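To compare both strategies directly, the same loop can also count the wins for staying. The following is a sketch under stated assumptions: the seed and the names stay_wins, switch_wins, opened and other are illustrative. The last two replace open and switch only to avoid shadowing the base R functions open() and switch().

```r
set.seed(3)                                  # assumption: fixed seed for reproducibility
doors = c("W", "L1", "L2")
open_door = function(pick, doors) {          # same logic as the function defined earlier
  choices = doors[doors != "W" & doors != pick]
  sample(choices, 1)
}
N = 10000
stay_wins = 0
switch_wins = 0
for (i in 1:N) {
  doors  = sample(doors)                            # shuffle doors
  pick   = sample(doors, 1)                         # candidate picks at random
  opened = open_door(pick, doors)                   # show master opens a losing door
  other  = doors[doors != pick & doors != opened]   # the door one could switch to
  if (pick  == "W") stay_wins   = stay_wins + 1     # staying wins
  if (other == "W") switch_wins = switch_wins + 1   # switching wins
}
cat("stay:", stay_wins / N, " switch:", switch_wins / N, "\n")
```

In every game exactly one of the two strategies wins, so the two counts add up to N; the rates should come out close to 1/3 and 2/3.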

Bayesian Treatment

If you prefer paper and pencil, this is how to solve this problem analytically.

Prior Probabilities:

\[ Pr(1=car)=Pr(2=car)=Pr(3=car)=\frac{1}{3} \]

Assume that the candidate picks door 1 and the showmaster opens door 2 (up to relabeling). Now the candidate would like to know

Posterior Probability: \(Pr(3=car|2=shown)= ?\)

Bayes' Theorem (for events \(A\) and \(B\)) follows from writing the joint probability in two ways: \[ Pr(A,B) = Pr(A|B)Pr(B) = Pr(B|A)Pr(A) \]

Updating Probabilities (given new data): \[ Pr(3=car|2=shown)=\frac{Pr(2=shown|3=car)Pr(3=car) }{Pr(2=shown)}\\ Pr(1=car|2=shown)=\frac{Pr(2=shown|1=car)Pr(1=car) }{Pr(2=shown)} \]

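Updating ratios: \[ \frac{Pr(3=car|2=shown)} {Pr(1=car|2=shown)} =\frac{Pr(2=shown|3=car) }{Pr(2=shown|1=car) } \frac{Pr(3=car)} {Pr(1=car)} \\ = \frac{1}{1/2} \times \frac{1/3}{1/3} = 2 \]

Here \(Pr(2=shown|3=car)=1\), because the showmaster may open neither the picked door 1 nor the winning door 3. In contrast, \(Pr(2=shown|1=car)=\frac{1}{2}\), because the showmaster then chooses at random between doors 2 and 3.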

Result

\[ Pr(3=car|2=shown) = 2 \times Pr(1=car|2=shown) \]
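Since door 2 has been opened and shown to be empty, the two remaining posteriors must sum to 1, which gives \(Pr(3=car|2=shown)=\frac{2}{3}\) and \(Pr(1=car|2=shown)=\frac{1}{3}\). The same update can be checked numerically in R. This is a sketch; the vector names prior, likelihood, evidence and posterior are chosen here for illustration, and the candidate is assumed to have picked door 1.

```r
prior      = c(1/3, 1/3, 1/3)               # Pr(k=car) for doors k = 1, 2, 3
likelihood = c(1/2, 0, 1)                   # Pr(2=shown | k=car), candidate picked door 1
evidence   = sum(likelihood * prior)        # Pr(2=shown) = 1/2
posterior  = likelihood * prior / evidence  # Pr(k=car | 2=shown)
posterior                                   # 1/3, 0, 2/3
```

The posterior for door 3 matches the success rate of the switching strategy seen in the simulation.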

Review

solving problems (in R) with simulations
writing functions to encapsulate more complex tasks
loops (for) and conditionals (if)
selection/exclusion in vector
sample()
writing and running R-scripts: source("Rscript.R")
(Bayes Theorem: updating probabilities)