-
Title: Bayes’s Theorem and Total Probability
-
Series: Probability Theory
-
YouTube-Title: Probability Theory 8 | Bayes’s Theorem and Total Probability
-
Bright video: https://youtu.be/Y-MOGLJLlnY
-
Dark video: https://youtu.be/1nAbhTzpKb4
-
Ad-free video: Watch Vimeo video
-
Quiz: Test your knowledge
-
Dark-PDF: Download PDF version of the dark video
-
Print-PDF: Download printable PDF version
-
Exercise: Download PDF sheets
-
Thumbnail (bright): Download PNG
-
Thumbnail (dark): Download PNG
-
Subtitle on GitHub: pt08_sub_eng.srt
-
Timestamps
00:00 Intro
00:17 Bayes’s theorem
01:20 Law of total probability
04:51 Example: Monty Hall problem
09:35 Outro
-
Subtitle in English
1 00:00:00,329 –> 00:00:03,519 Hello and welcome back to probability theory.
2 00:00:03,719 –> 00:00:09,086 And, as you know, first I want to thank all the nice people who support this channel on Steady or PayPal.
3 00:00:09,400 –> 00:00:17,257 Now, in today's part 8 we will talk about two important formulas, namely the law of total probability and Bayes’s theorem.
4 00:00:17,457 –> 00:00:21,751 Bayes’s formula is so famous that we should start with it immediately.
5 00:00:21,951 –> 00:00:29,176 However, it’s not complicated at all, because we already know the conditional probability of an event A given B.
6 00:00:29,614 –> 00:00:34,768 It is simply given by the probability of the intersection divided by the probability of B.
7 00:00:34,814 –> 00:00:41,001 And now, of course, we can also flip the roles and look at the conditional probability of B given the event A.
8 00:00:41,429 –> 00:00:45,778 Then we also have the intersection, but we divide by the probability of A.
9 00:00:46,371 –> 00:00:50,335 However, the important part here is: the intersection is the same.
10 00:00:50,535 –> 00:00:53,731 Hence, via this common part, we can put both equations together.
11 00:00:53,931 –> 00:01:03,992 This leads us on the left-hand side to P(A|B) times P(B) and on the right-hand side to P(B|A) times P(A).
12 00:01:04,471 –> 00:01:08,144 And in fact, this is what we call Bayes’s theorem.
13 00:01:08,571 –> 00:01:16,001 In this order it’s easy to remember, because here we have the condition B and P(B) and here the condition A and P(A).
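In formulas, the derivation just described reads:
$$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}, \qquad \mathbb{P}(B \mid A) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(A)} \quad \Longrightarrow \quad \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) = \mathbb{P}(B \mid A) \cdot \mathbb{P}(A). $$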
14 00:01:16,429 –> 00:01:20,743 Ok, at the end of the video I can show you how we can use this formula.
15 00:01:20,943 –> 00:01:25,141 However, before we do this, let’s talk about the law of total probability.
16 00:01:25,529 –> 00:01:29,446 It tells us which possibilities we have for splitting up a probability.
17 00:01:29,814 –> 00:01:37,196 So, as always, we choose a probability space given by a sample space Omega, a sigma-algebra A, and a probability measure P.
18 00:01:37,557 –> 00:01:41,254 And now we want to calculate the probability of a subset A.
19 00:01:41,454 –> 00:01:44,543 So the question is: how can we split up P(A)?
20 00:01:45,271 –> 00:01:48,528 Now, one possibility would be to choose another set B.
21 00:01:48,929 –> 00:01:52,376 In the picture we can visualize this, maybe like this.
22 00:01:52,576 –> 00:01:57,578 So the set B is here, and you immediately see how the set A is divided now.
23 00:01:57,886 –> 00:02:01,813 We get this, because we have the set B and the complement of B.
24 00:02:02,457 –> 00:02:07,170 And of course, the union of both gives us the whole sample space Omega.
25 00:02:07,370 –> 00:02:10,866 It is important to note here that this is indeed a disjoint union.
26 00:02:11,543 –> 00:02:15,459 Therefore this division here works no matter which event A we choose.
27 00:02:15,659 –> 00:02:18,569 and this is now what we can put into a formula.
28 00:02:18,769 –> 00:02:23,058 So we have P(A), where we can write A as a disjoint union.
29 00:02:23,500 –> 00:02:29,123 Namely, the union of the part of A lying in B with the part lying in B^c.
30 00:02:29,514 –> 00:02:35,367 And because this is a disjoint union, we can use the additivity of the measure and write it as a sum.
31 00:02:35,743 –> 00:02:40,438 So we have the probability of this intersection plus the probability of that intersection.
32 00:02:40,638 –> 00:02:44,957 Now, as before, we can rewrite an intersection with a conditional probability.
33 00:02:45,557 –> 00:02:49,536 For the first one we write P(A|B) times P(B).
34 00:02:49,736 –> 00:02:54,087 And for the second one we use the same formula, but now with the complement of B.
35 00:02:54,287 –> 00:03:01,410 So you see, this is a nice formula we can use to calculate the probability of A when we know these four probabilities here.
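In formulas, this two-set version of the law of total probability reads:
$$ \mathbb{P}(A) = \mathbb{P}(A \cap B) + \mathbb{P}(A \cap B^c) = \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) + \mathbb{P}(A \mid B^c) \cdot \mathbb{P}(B^c). $$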
36 00:03:01,800 –> 00:03:05,717 However, the law of total probability goes even further.
37 00:03:05,917 –> 00:03:11,573 We can also deal with the case where we don’t have only one set B, but countably many.
38 00:03:11,957 –> 00:03:15,888 So we could have two, three, or even infinitely many.
39 00:03:16,088 –> 00:03:21,553 Hence we simply say we have sets B_i, where i comes from an index set that is a subset of the natural numbers.
40 00:03:21,943 –> 00:03:25,420 And of course we have to generalize this property here.
41 00:03:25,771 –> 00:03:29,788 This means that the union of all these sets is equal to Omega.
42 00:03:29,988 –> 00:03:33,829 And in addition, as before, it needs to be a disjoint union.
43 00:03:34,029 –> 00:03:37,293 Now let’s also visualize this in a picture.
44 00:03:37,614 –> 00:03:43,243 There is our sample space Omega again, and now we don’t just find one set B, but a lot of them.
45 00:03:43,714 –> 00:03:46,957 For example such a decomposition of Omega could look like this.
46 00:03:47,600 –> 00:03:51,884 And here, please don’t forget, it’s possible that we have infinitely many sets B_i.
47 00:03:52,084 –> 00:03:57,015 Therefore in this picture they would get thinner and thinner when we go in this direction, for example.
48 00:03:57,215 –> 00:04:00,070 Of course, there are a lot of possibilities to visualize this.
49 00:04:00,270 –> 00:04:06,138 However, the important part here is that we also have a set A whose probability we want to calculate.
50 00:04:06,600 –> 00:04:10,057 In fact, this now works with exactly the same steps as before.
51 00:04:10,686 –> 00:04:14,559 So first we can write the set A as a disjoint union.
52 00:04:14,759 –> 00:04:21,532 Of course, this is again the intersection with the B_i’s, which means in the picture we just put all these parts together.
53 00:04:21,732 –> 00:04:27,395 Ok, then because it’s a disjoint union, we can use the sigma-additivity of the measure.
54 00:04:27,829 –> 00:04:32,502 So we get out this sum for the probabilities, or a series when we have infinitely many.
55 00:04:32,702 –> 00:04:36,343 However, in both cases we can use the conditional probabilities again.
56 00:04:36,771 –> 00:04:42,029 So we have the sum of P(A|B_i) times the probability of B_i.
57 00:04:42,229 –> 00:04:46,685 Now, this formula here is indeed the general law of total probability.
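In formulas, the general law of total probability reads:
$$ \mathbb{P}(A) = \sum_{i \in I} \mathbb{P}(A \cap B_i) = \sum_{i \in I} \mathbb{P}(A \mid B_i) \cdot \mathbb{P}(B_i). $$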
58 00:04:46,885 –> 00:04:50,608 And how we can apply it, we will see in the next example.
59 00:04:51,443 –> 00:04:55,639 Actually, this is one of the most famous examples in probability theory.
60 00:04:56,043 –> 00:04:58,698 It’s the so-called Monty Hall problem.
61 00:04:58,898 –> 00:05:03,922 And because it is so well known, I don’t want to go into the whole history and the details.
62 00:05:04,122 –> 00:05:08,029 We just use it to compute a probability with the two laws above.
63 00:05:08,351 –> 00:05:11,635 However, I still have to explain how this whole puzzle works.
64 00:05:11,835 –> 00:05:18,436 So we have a game show with three doors, where behind one door there is a car and behind the other two doors there are goats.
65 00:05:18,729 –> 00:05:22,744 And now let’s assume that the car is the better prize to win.
66 00:05:23,043 –> 00:05:25,543 Ok, the game works in three steps.
67 00:05:26,057 –> 00:05:29,350 First, you pick a door. Let’s say you pick door 1.
68 00:05:29,550 –> 00:05:33,870 Afterwards, in the second step, the host always shows you a goat.
69 00:05:34,070 –> 00:05:36,638 So he opens one of the two remaining doors.
70 00:05:36,838 –> 00:05:39,771 And let’s say that here he opens door 3.
71 00:05:40,257 –> 00:05:43,760 And then, in the last step, you have to make your final pick.
72 00:05:44,043 –> 00:05:47,292 So you can either keep the original door or you can switch.
73 00:05:47,492 –> 00:05:52,373 And I can already tell you: switching has the higher probability of getting the car.
74 00:05:52,986 –> 00:05:56,688 Therefore, if you want the goat, you should stay with the original door.
75 00:05:57,000 –> 00:06:00,546 However, no matter what you want, we can calculate the probabilities now.
76 00:06:00,986 –> 00:06:04,489 Here please note, the names for the doors are arbitrary.
77 00:06:04,654 –> 00:06:09,542 So we can simply call the first door we pick door 1.
78 00:06:10,000 –> 00:06:13,486 Moreover, we need some names for the events we consider here.
79 00:06:13,670 –> 00:06:17,498 Here c_j should be the event that the car is behind door j.
80 00:06:17,769 –> 00:06:23,450 In addition, s_j should be the event that in the second step the host opens door j.
81 00:06:23,589 –> 00:06:26,823 Hence we already know some conditional probabilities.
82 00:06:27,099 –> 00:06:30,000 And in order to make our life a little bit easier,
83 00:06:30,000 –> 00:06:39,069 let’s only consider the case that the host opens door 3. The other interesting case with s_2 can be described in the same way.
84 00:06:39,069 –> 00:06:43,967 Okay, so in this case we already know some conditional probabilities.
85 00:06:44,864 –> 00:06:50,793 Namely, the probability of s_3 given c_3 has to be 0.
86 00:06:51,847 –> 00:06:53,752 The host will never show you the car in the second step.
87 00:06:54,622 –> 00:06:57,450 He always opens a door with a goat.
88 00:06:57,488 –> 00:07:02,222 Therefore we also know the probability of s_3 given c_2.
89 00:07:02,773 –> 00:07:06,511 Because, as I told you, he opens one of the two remaining doors.
90 00:07:06,727 –> 00:07:08,547 Never the door you picked.
91 00:07:08,747 –> 00:07:11,532 So in this scenario here, he does not have a choice.
92 00:07:11,970 –> 00:07:15,247 Hence the conditional probability here is 1.
93 00:07:15,447 –> 00:07:20,973 Then the last remaining case, c_1, is where he has a choice. Therefore we say the probability is 1/2.
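In formulas, these conditional probabilities are:
$$ \mathbb{P}(s_3 \mid c_3) = 0, \qquad \mathbb{P}(s_3 \mid c_2) = 1, \qquad \mathbb{P}(s_3 \mid c_1) = \frac{1}{2}. $$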
94 00:07:20,993 –> 00:07:25,707 So you see, just by knowing the problem we already get a lot of information.
95 00:07:26,346 –> 00:07:32,391 And please also note: here we didn’t define a sample space, a sigma-algebra, or even a probability measure yet.
96 00:07:32,736 –> 00:07:39,834 Simply because we don’t need them. We just want to know what happens in any probability space when we have these conditional probabilities.
97 00:07:40,121 –> 00:07:42,996 Indeed this will be our last step here.
98 00:07:43,643 –> 00:07:49,262 So we want to know: what is the probability of getting the car when I switch the door in the third step?
99 00:07:49,877 –> 00:07:52,422 And this is exactly given by the conditional probability of c_2 given s_3.
100 00:07:52,710 –> 00:07:56,854 Now, maybe not so surprisingly, we can apply Bayes’s theorem here.
101 00:07:58,462 –> 00:07:59,840 Please recall what it tells us.
102 00:08:00,000 –> 00:08:05,899 We can exchange the order in the conditional probability here when we multiply by the probability of the condition.
103 00:08:06,025 –> 00:08:11,462 However, we don’t have it on the left-hand side. Therefore we have to divide by the probability of s_3.
104 00:08:12,025 –> 00:08:15,460 Therefore you often see Bayes’s theorem in this formulation.
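Written out for our example, this formulation reads:
$$ \mathbb{P}(c_2 \mid s_3) = \frac{\mathbb{P}(s_3 \mid c_2) \cdot \mathbb{P}(c_2)}{\mathbb{P}(s_3)}. $$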
105 00:08:16,426 –> 00:08:21,712 Ok, now here on the right-hand side we have a problem, because we don’t know what P(s_3) is.
106 00:08:21,712 –> 00:08:28,132 However we have all the conditional probabilities here. Therefore we can use the law of total probability.
107 00:08:28,813 –> 00:08:32,324 Hence, in the denominator, we get a sum with three parts.
108 00:08:32,482 –> 00:08:38,701 Namely, we sum over P(s_3|c_j) times the probability of c_j.
109 00:08:39,244 –> 00:08:42,359 And there we can put in our conditional probabilities.
110 00:08:42,722 –> 00:08:47,571 Ok, then let’s start in the numerator. This probability here is 1.
111 00:08:47,946 –> 00:08:51,036 And we find the same one here in the denominator as well.
112 00:08:51,513 –> 00:08:55,894 Then on the right-hand side we find a conditional probability that is 0.
113 00:08:56,445 –> 00:08:59,778 And then the last remaining one on the left is 1/2.
114 00:09:00,000 –> 00:09:05,727 Ok, now you see the last ingredients we need are the probabilities of c_2 and c_1.
115 00:09:06,134 –> 00:09:10,346 And there, of course, we have the assumption that it’s the same probability.
116 00:09:10,458 –> 00:09:15,425 So we assume a fair game: each door has the same probability of hiding the prize.
117 00:09:15,976 –> 00:09:20,083 And with three doors this means each has probability 1/3.
118 00:09:20,808 –> 00:09:25,869 Ok, now we have substituted numbers for everything, so we can simply compute.
119 00:09:25,869 –> 00:09:27,643 And we get out 2/3.
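Spelled out, the computation is:
$$ \mathbb{P}(c_2 \mid s_3) = \frac{\mathbb{P}(s_3 \mid c_2) \cdot \mathbb{P}(c_2)}{\sum_{j=1}^{3} \mathbb{P}(s_3 \mid c_j) \cdot \mathbb{P}(c_j)} = \frac{1 \cdot \frac{1}{3}}{\frac{1}{2} \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 0 \cdot \frac{1}{3}} = \frac{\frac{1}{3}}{\frac{1}{2}} = \frac{2}{3}. $$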
120 00:09:28,170 –> 00:09:32,819 So indeed we get out the result that in this scenario switching is beneficial.
121 00:09:34,579 –> 00:09:42,339 However, of course, the goal here was not winning a car, but rather seeing an application of Bayes’s theorem and the law of total probability.
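As a numerical cross-check (not part of the video), here is a minimal Python simulation of the switching strategy; the function name monty_hall_trial and the uniform random choices are illustrative assumptions matching the fair-game setup above:

```python
import random

def monty_hall_trial(switch: bool) -> bool:
    """Play one round of the Monty Hall game and return True if the player gets the car."""
    doors = [1, 2, 3]
    car = random.choice(doors)   # fair game: the car is placed uniformly at random
    pick = 1                     # the door labels are arbitrary, so the player always picks door 1
    # the host opens one of the remaining doors hiding a goat (uniformly when he has a choice)
    host = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

trials = 100_000
wins = sum(monty_hall_trial(switch=True) for _ in range(trials))
print(f"Estimated probability of winning when switching: {wins / trials:.3f}")  # close to 2/3
```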
122 00:09:42,339 –> 00:09:48,837 Ok, with this I think it’s good enough for today, and I hope to see you in the next video. Have a nice day and bye! :)
-
Quiz Content
Q1: Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $A,B \in \mathcal{A}$. What is the correct formulation for Bayes’s theorem?
A1: $$ \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) = \mathbb{P}(A \mid B)\cdot \mathbb{P}(B) $$
A2: $$ \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) = \mathbb{P}(B \mid B)\cdot \mathbb{P}(B) $$
A3: $$ \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) = \mathbb{P}(B \mid A)\cdot \mathbb{P}(A) $$
A4: $$ \mathbb{P}(A \mid B) \cdot \mathbb{P}(B) = \mathbb{P}(A \mid B)\cdot \mathbb{P}(A) $$
Q2: Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space, $B_1, B_2, B_3 \in \mathcal{A}$ pairwise disjoint with $B_1 \cup B_2 \cup B_3 = \Omega$. What is the correct formulation for the law of total probability?
A1: $$\mathbb{P}(A) = \mathbb{P}(A \mid B_1) \cdot \mathbb{P}(B_1) +\mathbb{P}(A \mid B_2) \cdot \mathbb{P}(B_2) + \mathbb{P}(A \mid B_3) \cdot \mathbb{P}(B_3) $$
A2: $$\mathbb{P}(A) = \mathbb{P}(A \mid B_1) +\mathbb{P}(A \mid B_2) + \mathbb{P}(A \mid B_3) $$
A3: $$\mathbb{P}(A) = \mathbb{P}(B_1) +\mathbb{P}(B_2) + \mathbb{P}(B_3) $$
A4: $$\mathbb{P}(A) = \mathbb{P}(B_1 \mid A) \cdot \mathbb{P}(B_1) +\mathbb{P}(B_2 \mid A) \cdot \mathbb{P}(B_2) + \mathbb{P}(B_3 \mid A) \cdot \mathbb{P}(B_3) $$
Q3: Consider the Monty Hall problem. What is the probability of getting the car if you don’t switch doors in the last step?
A1: $1$
A2: $\frac{1}{2}$
A3: $\frac{1}{3}$
A4: $\frac{2}{3}$
A5: $0$
-
Last update: 2024-10