Enjoy R: Simulations of the Monty Hall problem

The Monty Hall problem is a recurring and charming game in which the probability theory plays an essential role.

A very simple and clear explanation may be found here: http://www.youtube.com/watch?v=7u6kFlWZOWg.

Anyway, let’s just summarize how it goes.

There are three shut doors: behind two of them there is a goat, whereas the remaining door hides a car.

There is a player that obviously aims to win the car, and a host, called Monty Hall, who is typically aware of where the car is.

However, in a variant of the game, he could also ignore that.


MONTY HALL PROBLEM (standard version)

In the standard version of the game, Monty Hall knows which door hides the car.

He lets the player choose one door, then:

  • if the player guesses the right door, he randomly chooses to open one of the two remaining doors hiding goats;
  • if the player gets wrong, he opens the remaining door hiding a goat.

In both cases, Monty Hall finally asks the player to change the door initially chosen.

Should the player do that? And, above all, should switching be relevant from a probabilistic point of view?

The answer may appear very strange, but the truth is that the most convenient choice is to switch.

This is because only one out of three times the player was initially right and Monty Hall indicates him the “bad” door; instead, two out of three times the player chose a door which hides a goat and the host brings him on the right track.

The same reasoning can be applied to the more generic framework where there are more than three doors (say, n doors) and Monty Hall opens n-2 doors, where there are all goats. The player’s final choice is always between two doors.

I’m going to show this in R.

MontyHall <- function(n_times = 1000, n_doors = 3) {

  if(n_doors < 3) stop("There must be at least 3 doors")

  mean(replicate(n_times, {

    # door with the car
    U <- sample(1 : n_doors, 1)

    # first door chosen by the player
    X <- sample(1 : n_doors, 1)

    # doors opened by the host (there are all goats)
    if(X == U) {

      V <- sample((1 : n_doors)[- U], n_doors - 2)

    } else {

      V <- (1 : n_doors)[- c(U, X)]

    }

    # if the player switches, will he/she find the car?
    (1 : n_doors)[- c(V, X)] == U

  }))

}

set.seed(1)

MontyHall()
[1] 0.667

MontyHall(n_doors = 100)
[1] 0.994

By repeating the game 1000 times, using 3 doors, the car is 667 times behind the door Monty Hall indicates to the player.

Doing the same with 100 doors, it happens 994 out of 1000 times (99%).

Demonstrating with Bayes’ Theorem

The doors have all the same probability of hiding a car:

daum_equation_1412515489721

The player chooses door 1.

Let us denote with O the event where the host chooses to open all the doors from the third to the n-th one.

If door 1 has the car, the host must select n-2 out of n-1 available doors. The binomial coeffient says that the possible combinations are n-1.  So:
latform1

If door 2 has the car, the host has just combination O at disposal. So:
latform2

If door j (with j > 2) has the car, the host cannot select combination O, because one of its doors hides the car. Thus:
latform3

Using Bayes’ formula, we can calculate P(A2 | O), which is what we aim to find, since the player chose door 1 and O goes from door 3 to door n.
latform

A VARIANT OF THE MONTY HALL PROBLEM

What would happen if Monty Hall didn’t know the door hiding the car?

In the framework with n>3, he would open a set of n – 2 doors where there might be the car. In this case, the game would end.

In case there were not the car behind the doors randomly opened by the host, then the two remaining doors would share the same probability of hiding the car.

Let’s implement this variant in R:

MontyHallblind <- function(n_times = 1000, n_doors = 3) {

  if(n_doors < 3) stop("There must be at least 3 doors")

  mean(na.rm = T, replicate(n_times, {

    # door with the car
    U <- sample(1 : n_doors, 1)

    # first door chosen by the player
    X <- sample(1 : n_doors, 1)

    # doors opened by the host (there may be the car)
    V <- sample((1 : n_doors)[- X], n_doors - 2)

    # if the car appears the game ends
    if(U %in% V) NA else

    # if the player switches, will he/she find the car?
    (1 : n_doors)[- c(V, X)] == U

  }))

}

set.seed(1)

# trying the function with 3, 8, 20 and 100 doors
sapply(c(3, 8, 20, 100), MontyHallblind, n_times = 500000)
[1] 0.5009758 0.5013081 0.4997896 0.4969436

By switching door, there is always the same probability of winning and losing.

Demonstrating with Bayes’ Theorem

Suppose as before that the player chooses door 1 and that O is the event where the host chooses to open all the doors from the third to the n-th one.

This time, O never gets conditioned by anything: Monty Hall has no knowledge about where the car has been collocated, so he opens doors randomly.

Thus, from this viewpoint the implementation of Bayes’ formula gets easier; however, we have now to consider that the game continues only if the host opens doors which do not hide the car.

So for door 1, the formula becomes:
A1

Since in the numerator the conditioning event indicates that the car is behind door 1, we can rewrite:
A12a

which is finally equal to:
A13

Similarly, for door 2 we get:
A22

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

%d bloggers like this: