
Title: Independence for Events

Series: Probability Theory

Bright video: https://youtu.be/4GfcBpoibig

Dark video: https://youtu.be/Yqcz2e4m5A8

Quiz: Test your knowledge

Thumbnail (bright): Download PNG

Timestamps
00:00 Intro 00:19 Visualization (Independence for events) 03:48 Definition of independence 04:52: Example (discrete case) 07:38 Continuous case 10:52 Outro 
Subtitle in English
1 00:00:00,371 –> 00:00:03,455 Hello and welcome back to probability theory.
2 00:00:03,800 –> 00:00:09,807 and as always the first thing i want to do is to thank all the nice people that support this channel on Steady or Paypal.
3 00:00:10,007 –> 00:00:14,186 Today in part 9 we will talk about the important concept of independence.
4 00:00:14,729 –> 00:00:18,363 More precisely this video is about the independence for events.
5 00:00:19,000 –> 00:00:24,105 For this please recall we can visualize events as subsets in the sample space, Omega.
6 00:00:24,305 –> 00:00:27,835 So here you see an event A and an event B.
7 00:00:28,035 –> 00:00:33,040 Now, in some sense independence should mean, that the 2 events don’t affect each other.
8 00:00:33,240 –> 00:00:38,572 Whatever this means, in this example it’s not clear at all, because we have an intersection.
9 00:00:39,043 –> 00:00:42,174 Therefore lets first look at a simpler example.
10 00:00:42,614 –> 00:00:46,178 So here we have 2 sets that are clearly disjoint.
11 00:00:46,378 –> 00:00:51,457 Hence they don’t have to do anything with each other. So we call them mutually exclusive.
12 00:00:51,657 –> 00:00:55,701 However the question is: would we also call them independent?
13 00:00:56,200 –> 00:01:01,794 In general we won’t do that and in order to see this we have to think of probabilities.
14 00:01:02,214 –> 00:01:08,714 Namely, when we calculate the probability that A occurs under the condition that B occurs,
15 00:01:09,343 –> 00:01:12,421 then this should be the probability of A itself.
16 00:01:12,843 –> 00:01:17,349 In other words the event B should not effect the probability of A.
17 00:01:17,549 –> 00:01:23,914 So please remember: independence should mean that the knowledge of B does not make A less or more likely.
18 00:01:24,629 –> 00:01:28,093 and of course this should also holds the other way around.
19 00:01:28,757 –> 00:01:34,212 So the conditional probability of B under A should be the same as the probability of B.
20 00:01:34,886 –> 00:01:41,643 Now looking at the examples we see it’s not fulfilled for the second example and not clear at all for the first example.
21 00:01:42,029 –> 00:01:48,295 For this please recall, the conditional probability here means: what is the ratio of A inside B?
22 00:01:48,686 –> 00:01:53,756 And then the property tells us: it should be the same as the ratio of A inside Omega.
23 00:01:54,071 –> 00:01:57,951 and then of course, we also have to check this equality for B.
24 00:01:58,151 –> 00:02:03,376 Therefore i would say lets visualize this in an example, where we have some numbers to work with.
25 00:02:03,943 –> 00:02:07,567 So as before we have a sample space Omega and 2 events.
26 00:02:07,767 –> 00:02:11,293 and maybe lets choose 1/2 of Omega to be the set A.
27 00:02:11,493 –> 00:02:17,651 Of course, please recall this is only the visualization for the probability of A being 1/2.
28 00:02:17,851 –> 00:02:23,135 This means that we measure the sizes of the sets in the picture with the probability measure P.
29 00:02:23,335 –> 00:02:28,149 and now our idea is that we also split Omega in half with the set B.
30 00:02:28,557 –> 00:02:32,329 Hence the probability of B should also be 1/2.
31 00:02:32,529 –> 00:02:37,480 However, now the important question is: what can we say about the conditional probabilities?
32 00:02:37,829 –> 00:02:41,214 So first what is the ratio of A in B?
33 00:02:41,471 –> 00:02:45,384 and of course in the picture you immediately see it’s also 1/2.
34 00:02:45,584 –> 00:02:49,943 and on the other hand we ask: what is the ratio of B in A?
35 00:02:50,143 –> 00:02:53,680 and in this case we see it’s the same number, it’s 1/2.
36 00:02:54,171 –> 00:02:58,589 Hence indeed we have the equalities here, so the events are independent.
37 00:02:59,143 –> 00:03:04,070 Ok, i think we have seen enough visualizations and we can start formulating the definition.
38 00:03:04,571 –> 00:03:08,579 and for this it’s helpful to recall the definition of the conditional probability.
39 00:03:08,900 –> 00:03:15,729 So P(AB) was given as P of the intersection divided by P(B).
40 00:03:16,243 –> 00:03:21,552 and similarly P(BA) is given by the intersection divided by P(A).
41 00:03:22,043 –> 00:03:24,381 Now, for the independence we want 2 things.
42 00:03:24,581 –> 00:03:30,515 First this conditional probability should be equal to P(A) and this one should be equal to P(B).
43 00:03:30,715 –> 00:03:37,488 and there you should see, if we multiply this denominator to the other side, we have the same equality.
44 00:03:38,014 –> 00:03:41,304 Namely the intersection is given by the product.
45 00:03:41,504 –> 00:03:45,645 and this is now what we can choose as a defining property for independence.
46 00:03:45,845 –> 00:03:48,515 So it’s short and easy to remember.
47 00:03:48,715 –> 00:03:51,946 Ok, then lets put this into a definition.
48 00:03:52,429 –> 00:03:55,786 So the framework is the same as always. We have a probability space.
49 00:03:55,829 –> 00:04:00,314 Which means a sample space Omega, a sigma Algebra A and the probability measure P.
50 00:04:00,871 –> 00:04:07,343 and now we have learned, we call 2 events A, B from the sigma Algebra independent if this equality holds.
51 00:04:07,929 –> 00:04:12,815 Then with this it turns out, we don’t have to restrict ourself to 2 events.
52 00:04:13,015 –> 00:04:16,689 We can just consider a whole family of events A_i.
53 00:04:16,889 –> 00:04:21,543 and indeed we could have infinitely many when this index set “i” is infinite.
54 00:04:21,991 –> 00:04:28,139 and then also here the whole family is called independent if it sends the intersection to a product.
55 00:04:28,339 –> 00:04:34,466 More concretely we have P of this big intersection is equal to the product of the single probabilities.
56 00:04:34,786 –> 00:04:39,741 and this should hold no matter which finite subset of the index set “i” we choose.
57 00:04:40,371 –> 00:04:44,399 In other words in this formula here only finitely many sets A_j go in.
58 00:04:44,886 –> 00:04:48,667 However you also see we need to check all possibilities here.
59 00:04:48,971 –> 00:04:52,176 Maybe the only thing you can ignore would be the empty set.
60 00:04:52,586 –> 00:04:56,374 Ok, then lets look what this means for a real world example.
61 00:04:56,829 –> 00:05:00,470 So lets take an ordinary die and lets throw it 2 times.
62 00:05:01,043 –> 00:05:05,429 and then we look at the outcome with order. So we have a first throw and a second throw.
63 00:05:06,129 –> 00:05:09,571 Hence our whole probability space has to represent this.
64 00:05:09,824 –> 00:05:15,198 So for the sample space Omega we have the cartesian product of the set 1 to 6 with itself.
65 00:05:15,398 –> 00:05:21,801 Then in the next step for the sigma Algebra as always in the discrete case, we just have the power set of Omega.
66 00:05:22,001 –> 00:05:27,386 and finally for the probability measure P we have something we call the uniform distribution.
67 00:05:27,771 –> 00:05:33,718 This is not complicated at all, it just tells us that every element in Omega gets the same probability.
68 00:05:33,918 –> 00:05:37,900 Or to put it in other words. The probability mass function is constant.
69 00:05:38,443 –> 00:05:43,587 So in our case we have 36 elements in Omega. So we need 1 over 36.
70 00:05:43,787 –> 00:05:48,362 Ok so this is the probability space and now we can look at some events.
71 00:05:48,562 –> 00:05:52,815 In the first step i would suggest lets describe the events with words.
72 00:05:53,015 –> 00:05:58,262 “A” should be the event that the first throw shows a 6 and the second throw can show anything.
73 00:05:58,462 –> 00:06:04,251 On the other hand B should be the event where we look at both throws and the sum of the numbers is 7.
74 00:06:04,451 –> 00:06:08,985 Ok, then in the next step lets translate these descriptions into sets.
75 00:06:09,400 –> 00:06:15,131 For the first one we just have a sample omega_1, omega_2, where omega_1 is equal to 6.
76 00:06:15,629 –> 00:06:23,665 Then for the set B, we also have the set of all samples omega_1, omega_2. Where omega_1 + omega_2 should be 7.
77 00:06:24,229 –> 00:06:28,130 Ok since the sets are defined we can calculate the probabilities.
78 00:06:28,330 –> 00:06:33,419 Of course without much thinking we immediately get the probability for A, as 1 over 6.
79 00:06:33,619 –> 00:06:40,664 But you can also calculate it when you see we have 6 elements in this set. So we have 6 divided by 36.
80 00:06:40,864 –> 00:06:45,026 The same thing we can do for B, because we can list all the elements.
81 00:06:45,226 –> 00:06:50,011 So we could start with the element, where the first throw is 1 and the second 6.
82 00:06:50,211 –> 00:06:54,466 Then in the same way, if the first throw is 2 the second has to be 5.
83 00:06:54,666 –> 00:06:57,970 and in this order we get all the possibilities.
84 00:06:58,271 –> 00:07:02,323 and we count 6. So we have also 6 divided by 36.
85 00:07:03,257 –> 00:07:08,199 So both events have the same probability, but this does not mean that they are independent.
86 00:07:08,399 –> 00:07:12,891 However now using the definition we can just check if they are independent.
87 00:07:13,329 –> 00:07:17,007 So lets simply calculate the probability of the intersection.
88 00:07:17,207 –> 00:07:22,686 So from all the elements in B we have to take the ones that have a 6 at the first throw.
89 00:07:23,186 –> 00:07:25,536 and you see we only find 1.
90 00:07:25,900 –> 00:07:28,943 Hence the probability is 1 over 36.
91 00:07:29,343 –> 00:07:33,553 Which is in fact the same as the product P(A) times P(B).
92 00:07:33,753 –> 00:07:37,636 Therefore our 2 events A and B are independent.
93 00:07:38,257 –> 00:07:42,439 Ok maybe it’s a good idea to look also at a continuous example.
94 00:07:42,639 –> 00:07:46,977 So here we take the unit interval and throw a point into it.
95 00:07:47,243 –> 00:07:51,731 Indeed this is a typical and simple absolutely continuous example.
96 00:07:51,986 –> 00:07:55,477 and i think you already know how to choose the probability space.
97 00:07:55,677 –> 00:07:59,349 Of course the sample space Omega should be the unit interval.
98 00:07:59,714 –> 00:08:02,719 and the sigma algebra is the Borel sigma algebra.
99 00:08:03,243 –> 00:08:07,317 Finally the probability measure we also call the uniform distribution.
100 00:08:07,517 –> 00:08:15,277 However now we have it in the continuous case, which means instead of a probability mass function we have a probability density function.
101 00:08:15,477 –> 00:08:19,483 Hence this uniform now means, we have a constant density function.
102 00:08:19,683 –> 00:08:23,680 and here because we have the unit interval, we have the constant 1.
103 00:08:23,880 –> 00:08:32,961 As a reminder this means when we want to measure the probability of an interval [a,b], we have the integral over this interval of the density function.
104 00:08:33,343 –> 00:08:36,049 Which should be “ba”.
105 00:08:36,471 –> 00:08:42,654 However of course this only makes sense if we choose b greater than “a” and a,b in Omega.
106 00:08:42,854 –> 00:08:48,225 I say this because often you see that the whole real number line is taken as the sample space
107 00:08:48,425 –> 00:08:53,170 and then of course in order to have the same thing, we have to change the density function.
108 00:08:53,370 –> 00:08:58,339 Indeed this is what we then call the indicator function of the unit interval.
109 00:08:58,539 –> 00:09:02,194 It’s written as such a 1, where a set is in the index.
110 00:09:02,394 –> 00:09:06,954 The definition is very simple. We only have 2 values, either 1 or 0.
111 00:09:07,154 –> 00:09:13,248 and the important thing is: for all elements in this set here, so the unit interval we are at 1.
112 00:09:13,448 –> 00:09:15,733 and otherwise we get out 0.
113 00:09:16,200 –> 00:09:21,373 Now in the case that the function is defined on the whole real number line, the graph looks like this.
114 00:09:21,573 –> 00:09:24,614 So we have one jump at 0 and one at 1.
115 00:09:25,600 –> 00:09:31,220 So you see, taking this as the density function, the sample space R would also work.
116 00:09:31,257 –> 00:09:35,486 However this is just a side note, lets get back to our problem of independence.
117 00:09:35,686 –> 00:09:39,231 So lets take 2 independent events A and B.
118 00:09:39,431 –> 00:09:43,357 Then we now by definition that the probabilities fulfill this equality.
119 00:09:43,629 –> 00:09:47,257 Now lets use our probability measure and write the integral.
120 00:09:47,586 –> 00:09:53,829 So we have the intersection A,B as the region we integrate over and the indicator function inside.
121 00:09:54,029 –> 00:09:58,906 Here a nice property of the integral we can use is that we can exchange the sets.
122 00:09:59,106 –> 00:10:02,766 So we have the indicator function of the set A intersected with B now
123 00:10:02,786 –> 00:10:05,641 and [0,1] is the region we integrate over.
124 00:10:06,343 –> 00:10:10,259 If you know the definition of the integral, you immediately see that this is correct.
125 00:10:10,459 –> 00:10:14,007 For probability theory this is something we can use a lot.
126 00:10:14,207 –> 00:10:17,481 For example we can also use it here on the right hand side.
127 00:10:17,900 –> 00:10:21,369 So first lets fill in the definition of the probabilities here.
128 00:10:22,014 –> 00:10:25,152 and then as before we can simply exchange the roles.
129 00:10:25,643 –> 00:10:29,157 So what we get here is an equality of integrals.
130 00:10:29,473 –> 00:10:34,936 This integral here on the left hand side with the intersection goes to a product of 2 integrals.
131 00:10:35,136 –> 00:10:39,464 However, please recall we really need the independence to have this.
132 00:10:39,900 –> 00:10:44,125 So with this you see independence can be very useful in calculations.
133 00:10:44,325 –> 00:10:47,204 and of course we will use that in later videos.
134 00:10:47,543 –> 00:10:52,243 Therefore i really hope i see you in the next video about probability theory.
135 00:10:52,443 –> 00:10:54,614 Have a nice day and Bye!