# Information about Probability Theory - Part 9

• Title: Independence for Events

• Series: Probability Theory

• Bright video: https://youtu.be/4GfcBpoibig

• Dark video: https://youtu.be/Yqcz2e4m5A8

• Timestamps 00:00 Intro 00:19 Visualization (Independence for events) 03:48 Definition of independence 04:52: Example (discrete case) 07:38 Continuous case 10:52 Outro
• Subtitle in English

1 00:00:00,371 –> 00:00:03,455 Hello and welcome back to probability theory.

2 00:00:03,800 –> 00:00:09,807 and as always the first thing i want to do is to thank all the nice people that support this channel on Steady or Paypal.

3 00:00:10,007 –> 00:00:14,186 Today in part 9 we will talk about the important concept of independence.

4 00:00:14,729 –> 00:00:18,363 More precisely this video is about the independence for events.

5 00:00:19,000 –> 00:00:24,105 For this please recall we can visualize events as subsets in the sample space, Omega.

6 00:00:24,305 –> 00:00:27,835 So here you see an event A and an event B.

7 00:00:28,035 –> 00:00:33,040 Now, in some sense independence should mean, that the 2 events don’t affect each other.

8 00:00:33,240 –> 00:00:38,572 Whatever this means, in this example it’s not clear at all, because we have an intersection.

9 00:00:39,043 –> 00:00:42,174 Therefore lets first look at a simpler example.

10 00:00:42,614 –> 00:00:46,178 So here we have 2 sets that are clearly disjoint.

11 00:00:46,378 –> 00:00:51,457 Hence they don’t have to do anything with each other. So we call them mutually exclusive.

12 00:00:51,657 –> 00:00:55,701 However the question is: would we also call them independent?

13 00:00:56,200 –> 00:01:01,794 In general we won’t do that and in order to see this we have to think of probabilities.

14 00:01:02,214 –> 00:01:08,714 Namely, when we calculate the probability that A occurs under the condition that B occurs,

15 00:01:09,343 –> 00:01:12,421 then this should be the probability of A itself.

16 00:01:12,843 –> 00:01:17,349 In other words the event B should not effect the probability of A.

17 00:01:17,549 –> 00:01:23,914 So please remember: independence should mean that the knowledge of B does not make A less or more likely.

18 00:01:24,629 –> 00:01:28,093 and of course this should also holds the other way around.

19 00:01:28,757 –> 00:01:34,212 So the conditional probability of B under A should be the same as the probability of B.

20 00:01:34,886 –> 00:01:41,643 Now looking at the examples we see it’s not fulfilled for the second example and not clear at all for the first example.

21 00:01:42,029 –> 00:01:48,295 For this please recall, the conditional probability here means: what is the ratio of A inside B?

22 00:01:48,686 –> 00:01:53,756 And then the property tells us: it should be the same as the ratio of A inside Omega.

23 00:01:54,071 –> 00:01:57,951 and then of course, we also have to check this equality for B.

24 00:01:58,151 –> 00:02:03,376 Therefore i would say lets visualize this in an example, where we have some numbers to work with.

25 00:02:03,943 –> 00:02:07,567 So as before we have a sample space Omega and 2 events.

26 00:02:07,767 –> 00:02:11,293 and maybe lets choose 1/2 of Omega to be the set A.

27 00:02:11,493 –> 00:02:17,651 Of course, please recall this is only the visualization for the probability of A being 1/2.

28 00:02:17,851 –> 00:02:23,135 This means that we measure the sizes of the sets in the picture with the probability measure P.

29 00:02:23,335 –> 00:02:28,149 and now our idea is that we also split Omega in half with the set B.

30 00:02:28,557 –> 00:02:32,329 Hence the probability of B should also be 1/2.

31 00:02:32,529 –> 00:02:37,480 However, now the important question is: what can we say about the conditional probabilities?

32 00:02:37,829 –> 00:02:41,214 So first what is the ratio of A in B?

33 00:02:41,471 –> 00:02:45,384 and of course in the picture you immediately see it’s also 1/2.

34 00:02:45,584 –> 00:02:49,943 and on the other hand we ask: what is the ratio of B in A?

35 00:02:50,143 –> 00:02:53,680 and in this case we see it’s the same number, it’s 1/2.

36 00:02:54,171 –> 00:02:58,589 Hence indeed we have the equalities here, so the events are independent.

37 00:02:59,143 –> 00:03:04,070 Ok, i think we have seen enough visualizations and we can start formulating the definition.

38 00:03:04,571 –> 00:03:08,579 and for this it’s helpful to recall the definition of the conditional probability.

39 00:03:08,900 –> 00:03:15,729 So P(A|B) was given as P of the intersection divided by P(B).

40 00:03:16,243 –> 00:03:21,552 and similarly P(B|A) is given by the intersection divided by P(A).

41 00:03:22,043 –> 00:03:24,381 Now, for the independence we want 2 things.

42 00:03:24,581 –> 00:03:30,515 First this conditional probability should be equal to P(A) and this one should be equal to P(B).

43 00:03:30,715 –> 00:03:37,488 and there you should see, if we multiply this denominator to the other side, we have the same equality.

44 00:03:38,014 –> 00:03:41,304 Namely the intersection is given by the product.

45 00:03:41,504 –> 00:03:45,645 and this is now what we can choose as a defining property for independence.

46 00:03:45,845 –> 00:03:48,515 So it’s short and easy to remember.

47 00:03:48,715 –> 00:03:51,946 Ok, then lets put this into a definition.

48 00:03:52,429 –> 00:03:55,786 So the framework is the same as always. We have a probability space.

49 00:03:55,829 –> 00:04:00,314 Which means a sample space Omega, a sigma Algebra A and the probability measure P.

50 00:04:00,871 –> 00:04:07,343 and now we have learned, we call 2 events A, B from the sigma Algebra independent if this equality holds.

51 00:04:07,929 –> 00:04:12,815 Then with this it turns out, we don’t have to restrict ourself to 2 events.

52 00:04:13,015 –> 00:04:16,689 We can just consider a whole family of events A_i.

53 00:04:16,889 –> 00:04:21,543 and indeed we could have infinitely many when this index set “i” is infinite.

54 00:04:21,991 –> 00:04:28,139 and then also here the whole family is called independent if it sends the intersection to a product.

55 00:04:28,339 –> 00:04:34,466 More concretely we have P of this big intersection is equal to the product of the single probabilities.

56 00:04:34,786 –> 00:04:39,741 and this should hold no matter which finite subset of the index set “i” we choose.

57 00:04:40,371 –> 00:04:44,399 In other words in this formula here only finitely many sets A_j go in.

58 00:04:44,886 –> 00:04:48,667 However you also see we need to check all possibilities here.

59 00:04:48,971 –> 00:04:52,176 Maybe the only thing you can ignore would be the empty set.

60 00:04:52,586 –> 00:04:56,374 Ok, then lets look what this means for a real world example.

61 00:04:56,829 –> 00:05:00,470 So lets take an ordinary die and lets throw it 2 times.

62 00:05:01,043 –> 00:05:05,429 and then we look at the outcome with order. So we have a first throw and a second throw.

63 00:05:06,129 –> 00:05:09,571 Hence our whole probability space has to represent this.

64 00:05:09,824 –> 00:05:15,198 So for the sample space Omega we have the cartesian product of the set 1 to 6 with itself.

65 00:05:15,398 –> 00:05:21,801 Then in the next step for the sigma Algebra as always in the discrete case, we just have the power set of Omega.

66 00:05:22,001 –> 00:05:27,386 and finally for the probability measure P we have something we call the uniform distribution.

67 00:05:27,771 –> 00:05:33,718 This is not complicated at all, it just tells us that every element in Omega gets the same probability.

68 00:05:33,918 –> 00:05:37,900 Or to put it in other words. The probability mass function is constant.

69 00:05:38,443 –> 00:05:43,587 So in our case we have 36 elements in Omega. So we need 1 over 36.

70 00:05:43,787 –> 00:05:48,362 Ok so this is the probability space and now we can look at some events.

71 00:05:48,562 –> 00:05:52,815 In the first step i would suggest lets describe the events with words.

72 00:05:53,015 –> 00:05:58,262 “A” should be the event that the first throw shows a 6 and the second throw can show anything.

73 00:05:58,462 –> 00:06:04,251 On the other hand B should be the event where we look at both throws and the sum of the numbers is 7.

74 00:06:04,451 –> 00:06:08,985 Ok, then in the next step lets translate these descriptions into sets.

75 00:06:09,400 –> 00:06:15,131 For the first one we just have a sample omega_1, omega_2, where omega_1 is equal to 6.

76 00:06:15,629 –> 00:06:23,665 Then for the set B, we also have the set of all samples omega_1, omega_2. Where omega_1 + omega_2 should be 7.

77 00:06:24,229 –> 00:06:28,130 Ok since the sets are defined we can calculate the probabilities.

78 00:06:28,330 –> 00:06:33,419 Of course without much thinking we immediately get the probability for A, as 1 over 6.

79 00:06:33,619 –> 00:06:40,664 But you can also calculate it when you see we have 6 elements in this set. So we have 6 divided by 36.

80 00:06:40,864 –> 00:06:45,026 The same thing we can do for B, because we can list all the elements.

81 00:06:45,226 –> 00:06:50,011 So we could start with the element, where the first throw is 1 and the second 6.

82 00:06:50,211 –> 00:06:54,466 Then in the same way, if the first throw is 2 the second has to be 5.

83 00:06:54,666 –> 00:06:57,970 and in this order we get all the possibilities.

84 00:06:58,271 –> 00:07:02,323 and we count 6. So we have also 6 divided by 36.

85 00:07:03,257 –> 00:07:08,199 So both events have the same probability, but this does not mean that they are independent.

86 00:07:08,399 –> 00:07:12,891 However now using the definition we can just check if they are independent.

87 00:07:13,329 –> 00:07:17,007 So lets simply calculate the probability of the intersection.

88 00:07:17,207 –> 00:07:22,686 So from all the elements in B we have to take the ones that have a 6 at the first throw.

89 00:07:23,186 –> 00:07:25,536 and you see we only find 1.

90 00:07:25,900 –> 00:07:28,943 Hence the probability is 1 over 36.

91 00:07:29,343 –> 00:07:33,553 Which is in fact the same as the product P(A) times P(B).

92 00:07:33,753 –> 00:07:37,636 Therefore our 2 events A and B are independent.

93 00:07:38,257 –> 00:07:42,439 Ok maybe it’s a good idea to look also at a continuous example.

94 00:07:42,639 –> 00:07:46,977 So here we take the unit interval and throw a point into it.

95 00:07:47,243 –> 00:07:51,731 Indeed this is a typical and simple absolutely continuous example.

96 00:07:51,986 –> 00:07:55,477 and i think you already know how to choose the probability space.

97 00:07:55,677 –> 00:07:59,349 Of course the sample space Omega should be the unit interval.

98 00:07:59,714 –> 00:08:02,719 and the sigma algebra is the Borel sigma algebra.

99 00:08:03,243 –> 00:08:07,317 Finally the probability measure we also call the uniform distribution.

100 00:08:07,517 –> 00:08:15,277 However now we have it in the continuous case, which means instead of a probability mass function we have a probability density function.

101 00:08:15,477 –> 00:08:19,483 Hence this uniform now means, we have a constant density function.

102 00:08:19,683 –> 00:08:23,680 and here because we have the unit interval, we have the constant 1.

103 00:08:23,880 –> 00:08:32,961 As a reminder this means when we want to measure the probability of an interval [a,b], we have the integral over this interval of the density function.

104 00:08:33,343 –> 00:08:36,049 Which should be “b-a”.

105 00:08:36,471 –> 00:08:42,654 However of course this only makes sense if we choose b greater than “a” and a,b in Omega.

106 00:08:42,854 –> 00:08:48,225 I say this because often you see that the whole real number line is taken as the sample space

107 00:08:48,425 –> 00:08:53,170 and then of course in order to have the same thing, we have to change the density function.

108 00:08:53,370 –> 00:08:58,339 Indeed this is what we then call the indicator function of the unit interval.

109 00:08:58,539 –> 00:09:02,194 It’s written as such a 1, where a set is in the index.

110 00:09:02,394 –> 00:09:06,954 The definition is very simple. We only have 2 values, either 1 or 0.

111 00:09:07,154 –> 00:09:13,248 and the important thing is: for all elements in this set here, so the unit interval we are at 1.

112 00:09:13,448 –> 00:09:15,733 and otherwise we get out 0.

113 00:09:16,200 –> 00:09:21,373 Now in the case that the function is defined on the whole real number line, the graph looks like this.

114 00:09:21,573 –> 00:09:24,614 So we have one jump at 0 and one at 1.

115 00:09:25,600 –> 00:09:31,220 So you see, taking this as the density function, the sample space R would also work.

116 00:09:31,257 –> 00:09:35,486 However this is just a side note, lets get back to our problem of independence.

117 00:09:35,686 –> 00:09:39,231 So lets take 2 independent events A and B.

118 00:09:39,431 –> 00:09:43,357 Then we now by definition that the probabilities fulfill this equality.

119 00:09:43,629 –> 00:09:47,257 Now lets use our probability measure and write the integral.

120 00:09:47,586 –> 00:09:53,829 So we have the intersection A,B as the region we integrate over and the indicator function inside.

121 00:09:54,029 –> 00:09:58,906 Here a nice property of the integral we can use is that we can exchange the sets.

122 00:09:59,106 –> 00:10:02,766 So we have the indicator function of the set A intersected with B now

123 00:10:02,786 –> 00:10:05,641 and [0,1] is the region we integrate over.

124 00:10:06,343 –> 00:10:10,259 If you know the definition of the integral, you immediately see that this is correct.

125 00:10:10,459 –> 00:10:14,007 For probability theory this is something we can use a lot.

126 00:10:14,207 –> 00:10:17,481 For example we can also use it here on the right hand side.

127 00:10:17,900 –> 00:10:21,369 So first lets fill in the definition of the probabilities here.

128 00:10:22,014 –> 00:10:25,152 and then as before we can simply exchange the roles.

129 00:10:25,643 –> 00:10:29,157 So what we get here is an equality of integrals.

130 00:10:29,473 –> 00:10:34,936 This integral here on the left hand side with the intersection goes to a product of 2 integrals.

131 00:10:35,136 –> 00:10:39,464 However, please recall we really need the independence to have this.

132 00:10:39,900 –> 00:10:44,125 So with this you see independence can be very useful in calculations.

133 00:10:44,325 –> 00:10:47,204 and of course we will use that in later videos.

134 00:10:47,543 –> 00:10:52,243 Therefore i really hope i see you in the next video about probability theory.

135 00:10:52,443 –> 00:10:54,614 Have a nice day and Bye!

• Back to overview page