• Title: Independence for Events

  • Series: Probability Theory

  • YouTube-Title: Probability Theory 9 | Independence for Events

  • Bright video: https://youtu.be/4GfcBpoibig

  • Dark video: https://youtu.be/Yqcz2e4m5A8

  • Ad-free video: Watch Vimeo video

  • Quiz: Test your knowledge

  • PDF: Download PDF version of the bright video

  • Dark-PDF: Download PDF version of the dark video

  • Print-PDF: Download printable PDF version

  • Exercise: Download PDF sheets

  • Thumbnail (bright): Download PNG

  • Thumbnail (dark): Download PNG

  • Subtitle on GitHub: pt09_sub_eng.srt

  • Timestamps

    00:00 Intro

    00:19 Visualization (Independence for events)

    03:48 Definition of independence

    04:52 Example (discrete case)

    07:38 Continuous case

    10:52 Outro

  • Subtitle in English

    1 00:00:00,371 –> 00:00:03,455 Hello and welcome back to probability theory.

    2 00:00:03,800 –> 00:00:09,807 And as always, the first thing I want to do is to thank all the nice people who support this channel on Steady or PayPal.

    3 00:00:10,007 –> 00:00:14,186 Today, in part 9, we will talk about the important concept of independence.

    4 00:00:14,729 –> 00:00:18,363 More precisely, this video is about independence for events.

    5 00:00:19,000 –> 00:00:24,105 For this, please recall that we can visualize events as subsets of the sample space Omega.

    6 00:00:24,305 –> 00:00:27,835 So here you see an event A and an event B.

    7 00:00:28,035 –> 00:00:33,040 Now, in some sense, independence should mean that the two events don't affect each other.

    8 00:00:33,240 –> 00:00:38,572 Whatever this means, in this example it's not clear at all, because we have an intersection.

    9 00:00:39,043 –> 00:00:42,174 Therefore, let's first look at a simpler example.

    10 00:00:42,614 –> 00:00:46,178 So here we have two sets that are clearly disjoint.

    11 00:00:46,378 –> 00:00:51,457 Hence they don't have anything to do with each other. So we call them mutually exclusive.

    12 00:00:51,657 –> 00:00:55,701 However, the question is: would we also call them independent?

    13 00:00:56,200 –> 00:01:01,794 In general, we won't do that, and in order to see this, we have to think of probabilities.

    14 00:01:02,214 –> 00:01:08,714 Namely, when we calculate the probability that A occurs under the condition that B occurs,

    15 00:01:09,343 –> 00:01:12,421 then this should be the probability of A itself.

    16 00:01:12,843 –> 00:01:17,349 In other words, the event B should not affect the probability of A.

    17 00:01:17,549 –> 00:01:23,914 So please remember: independence should mean that the knowledge of B does not make A less or more likely.

    18 00:01:24,629 –> 00:01:28,093 And of course, this should also hold the other way around.

    19 00:01:28,757 –> 00:01:34,212 So the conditional probability of B under A should be the same as the probability of B.

    20 00:01:34,886 –> 00:01:41,643 Now looking at the examples we see it’s not fulfilled for the second example and not clear at all for the first example.

    21 00:01:42,029 –> 00:01:48,295 For this, please recall that the conditional probability here means: what is the ratio of A inside B?

    22 00:01:48,686 –> 00:01:53,756 And then the property tells us: it should be the same as the ratio of A inside Omega.

    23 00:01:54,071 –> 00:01:57,951 And then, of course, we also have to check this equality for B.

    24 00:01:58,151 –> 00:02:03,376 Therefore, I would say let's visualize this in an example where we have some numbers to work with.

    25 00:02:03,943 –> 00:02:07,567 So as before, we have a sample space Omega and two events.

    26 00:02:07,767 –> 00:02:11,293 And maybe let's choose one half of Omega to be the set A.

    27 00:02:11,493 –> 00:02:17,651 Of course, please recall this is only the visualization for the probability of A being 1/2.

    28 00:02:17,851 –> 00:02:23,135 This means that we measure the sizes of the sets in the picture with the probability measure P.

    29 00:02:23,335 –> 00:02:28,149 And now our idea is that we also split Omega in half with the set B.

    30 00:02:28,557 –> 00:02:32,329 Hence the probability of B should also be 1/2.

    31 00:02:32,529 –> 00:02:37,480 However, now the important question is: what can we say about the conditional probabilities?

    32 00:02:37,829 –> 00:02:41,214 So first what is the ratio of A in B?

    33 00:02:41,471 –> 00:02:45,384 And of course, in the picture you immediately see it's also 1/2.

    34 00:02:45,584 –> 00:02:49,943 And on the other hand, we ask: what is the ratio of B in A?

    35 00:02:50,143 –> 00:02:53,680 And in this case, we see it's the same number, 1/2.

    36 00:02:54,171 –> 00:02:58,589 Hence, indeed, we have the equalities here, so the events are independent.

    37 00:02:59,143 –> 00:03:04,070 Ok, I think we have seen enough visualizations and we can start formulating the definition.

    38 00:03:04,571 –> 00:03:08,579 And for this, it's helpful to recall the definition of the conditional probability.

    39 00:03:08,900 –> 00:03:15,729 So P(A|B) was given as P of the intersection divided by P(B).

    40 00:03:16,243 –> 00:03:21,552 And similarly, P(B|A) is given by P of the intersection divided by P(A).

    41 00:03:22,043 –> 00:03:24,381 Now, for independence we want two things.

    42 00:03:24,581 –> 00:03:30,515 First, this conditional probability should be equal to P(A), and this one should be equal to P(B).

    43 00:03:30,715 –> 00:03:37,488 And there you should see: if we multiply the denominator over to the other side, we get the same equality in both cases.

    44 00:03:38,014 –> 00:03:41,304 Namely, the probability of the intersection is given by the product.

    45 00:03:41,504 –> 00:03:45,645 And this is now what we can choose as the defining property for independence.

    46 00:03:45,845 –> 00:03:48,515 So it's short and easy to remember.
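Written out, the rearrangement described in the cues above is the following short calculation (assuming $\mathbb{P}(A) > 0$ and $\mathbb{P}(B) > 0$ so that both conditional probabilities are defined):

$$ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} = \mathbb{P}(A) \quad \Longleftrightarrow \quad \mathbb{P}(A \cap B) = \mathbb{P}(A) \cdot \mathbb{P}(B) $$

The same product formula falls out of $\mathbb{P}(B \mid A) = \mathbb{P}(B)$, which is why the single symmetric condition on the right can replace both requirements.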

    47 00:03:48,715 –> 00:03:51,946 Ok, then let's put this into a definition.

    48 00:03:52,429 –> 00:03:55,786 So the framework is the same as always. We have a probability space.

    49 00:03:55,829 –> 00:04:00,314 Which means: a sample space Omega, a sigma-algebra A, and a probability measure P.

    50 00:04:00,871 –> 00:04:07,343 And now we have learned: we call two events A, B from the sigma-algebra independent if this equality holds.

    51 00:04:07,929 –> 00:04:12,815 Then with this, it turns out we don't have to restrict ourselves to two events.

    52 00:04:13,015 –> 00:04:16,689 We can just consider a whole family of events A_i.

    53 00:04:16,889 –> 00:04:21,543 And indeed, we could have infinitely many when the index set I is infinite.

    54 00:04:21,991 –> 00:04:28,139 And then, also here, the whole family is called independent if the probability measure sends intersections to products.

    55 00:04:28,339 –> 00:04:34,466 More concretely, P of this big intersection is equal to the product of the single probabilities.

    56 00:04:34,786 –> 00:04:39,741 And this should hold no matter which finite subset of the index set I we choose.

    57 00:04:40,371 –> 00:04:44,399 In other words, only finitely many sets A_j go into this formula here.

    58 00:04:44,886 –> 00:04:48,667 However, you also see that we need to check all possibilities here.

    59 00:04:48,971 –> 00:04:52,176 Maybe the only thing you can ignore would be the empty set.
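As a sketch of what "check all possibilities" means in practice: for a finite sample space with the uniform distribution, a family of events can be tested by going through every non-empty sub-family. This Python helper is an illustration and not part of the video:

```python
from fractions import Fraction
from itertools import combinations

def is_independent_family(omega, events):
    """Check independence of a family of events over a finite sample space
    omega with the uniform distribution: P of every non-empty sub-intersection
    must equal the product of the single probabilities."""
    prob = lambda e: Fraction(len(e), len(omega))  # uniform measure: |E| / |Omega|
    for k in range(1, len(events) + 1):
        for family in combinations(events, k):
            inter = set(omega)
            for e in family:
                inter &= set(e)          # build the intersection of the sub-family
            prod = Fraction(1)
            for e in family:
                prod *= prob(e)          # product of the single probabilities
            if prob(inter) != prod:
                return False
    return True
```

For example, with two fair coin flips and the events A = "first flip heads", B = "second flip heads", C = "both flips equal", every pair passes the test but the whole triple does not, which is exactly why all finite sub-families have to be checked, not just pairs.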

    60 00:04:52,586 –> 00:04:56,374 Ok, then let's look at what this means in a real-world example.

    61 00:04:56,829 –> 00:05:00,470 So let's take an ordinary die and throw it two times.

    62 00:05:01,043 –> 00:05:05,429 And then we look at the outcome in order. So we have a first throw and a second throw.

    63 00:05:06,129 –> 00:05:09,571 Hence our whole probability space has to represent this.

    64 00:05:09,824 –> 00:05:15,198 So for the sample space Omega, we have the Cartesian product of the set {1, ..., 6} with itself.

    65 00:05:15,398 –> 00:05:21,801 Then, in the next step, for the sigma-algebra, as always in the discrete case, we just take the power set of Omega.

    66 00:05:22,001 –> 00:05:27,386 And finally, for the probability measure P, we have something we call the uniform distribution.

    67 00:05:27,771 –> 00:05:33,718 This is not complicated at all; it just tells us that every element in Omega gets the same probability.

    68 00:05:33,918 –> 00:05:37,900 Or to put it in other words: the probability mass function is constant.

    69 00:05:38,443 –> 00:05:43,587 So in our case, we have 36 elements in Omega, so we need 1 over 36.

    70 00:05:43,787 –> 00:05:48,362 Ok, so this is the probability space, and now we can look at some events.

    71 00:05:48,562 –> 00:05:52,815 In the first step, I would suggest we describe the events in words.

    72 00:05:53,015 –> 00:05:58,262 “A” should be the event that the first throw shows a 6 and the second throw can show anything.

    73 00:05:58,462 –> 00:06:04,251 On the other hand B should be the event where we look at both throws and the sum of the numbers is 7.

    74 00:06:04,451 –> 00:06:08,985 Ok, then in the next step, let's translate these descriptions into sets.

    75 00:06:09,400 –> 00:06:15,131 For the first one, we just have all samples (omega_1, omega_2) where omega_1 is equal to 6.

    76 00:06:15,629 –> 00:06:23,665 Then for the set B, we have the set of all samples (omega_1, omega_2) where omega_1 + omega_2 is equal to 7.

    77 00:06:24,229 –> 00:06:28,130 Ok, since the sets are defined, we can calculate the probabilities.

    78 00:06:28,330 –> 00:06:33,419 Of course, without much thinking, we immediately get the probability of A as 1 over 6.

    79 00:06:33,619 –> 00:06:40,664 But you can also calculate it by noting that we have 6 elements in this set, so we have 6 divided by 36.

    80 00:06:40,864 –> 00:06:45,026 We can do the same thing for B, because we can list all the elements.

    81 00:06:45,226 –> 00:06:50,011 So we could start with the element where the first throw is 1 and the second is 6.

    82 00:06:50,211 –> 00:06:54,466 Then in the same way, if the first throw is 2 the second has to be 5.

    83 00:06:54,666 –> 00:06:57,970 And in this order, we get all the possibilities.

    84 00:06:58,271 –> 00:07:02,323 And we count 6, so we also have 6 divided by 36.

    85 00:07:03,257 –> 00:07:08,199 So both events have the same probability, but this does not mean that they are independent.

    86 00:07:08,399 –> 00:07:12,891 However, using the definition, we can now simply check whether they are independent.

    87 00:07:13,329 –> 00:07:17,007 So let's simply calculate the probability of the intersection.

    88 00:07:17,207 –> 00:07:22,686 So from all the elements in B we have to take the ones that have a 6 at the first throw.

    89 00:07:23,186 –> 00:07:25,536 And you see we only find one.

    90 00:07:25,900 –> 00:07:28,943 Hence the probability is 1 over 36.

    91 00:07:29,343 –> 00:07:33,553 Which is in fact the same as the product P(A) times P(B).

    92 00:07:33,753 –> 00:07:37,636 Therefore, our two events A and B are independent.
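The whole calculation from this discrete example can be replayed in a few lines of Python (a sketch using exact fractions, mirroring the sets A and B defined in the video):

```python
from fractions import Fraction

# Sample space: ordered pairs (first throw, second throw), uniform distribution.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

A = {(i, j) for (i, j) in omega if i == 6}      # first throw shows a 6
B = {(i, j) for (i, j) in omega if i + j == 7}  # sum of both throws is 7

def prob(event):
    # Uniform distribution: |event| / |Omega| = |event| / 36
    return Fraction(len(event), len(omega))

assert prob(A) == Fraction(1, 6)
assert prob(B) == Fraction(1, 6)
assert prob(A & B) == Fraction(1, 36)                # intersection: only (6, 1)
assert prob(A & B) == prob(A) * prob(B)              # A and B are independent
```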

    93 00:07:38,257 –> 00:07:42,439 Ok, maybe it's a good idea to also look at a continuous example.

    94 00:07:42,639 –> 00:07:46,977 So here we take the unit interval and throw a point into it.

    95 00:07:47,243 –> 00:07:51,731 Indeed, this is a typical and simple absolutely continuous example.

    96 00:07:51,986 –> 00:07:55,477 And I think you already know how to choose the probability space.

    97 00:07:55,677 –> 00:07:59,349 Of course the sample space Omega should be the unit interval.

    98 00:07:59,714 –> 00:08:02,719 And the sigma-algebra is the Borel sigma-algebra.

    99 00:08:03,243 –> 00:08:07,317 Finally, the probability measure is again what we call the uniform distribution.

    100 00:08:07,517 –> 00:08:15,277 However, now we have it in the continuous case, which means instead of a probability mass function we have a probability density function.

    101 00:08:15,477 –> 00:08:19,483 Hence, uniform now means we have a constant density function.

    102 00:08:19,683 –> 00:08:23,680 And here, because we have the unit interval, the constant is 1.

    103 00:08:23,880 –> 00:08:32,961 As a reminder, this means that when we want to measure the probability of an interval [a,b], we take the integral of the density function over this interval.

    104 00:08:33,343 –> 00:08:36,049 Which should be b - a.

    105 00:08:36,471 –> 00:08:42,654 However, of course, this only makes sense if we choose b greater than a, with a and b in Omega.

    106 00:08:42,854 –> 00:08:48,225 I say this because often you see that the whole real number line is taken as the sample space

    107 00:08:48,425 –> 00:08:53,170 And then, of course, in order to have the same thing, we have to change the density function.

    108 00:08:53,370 –> 00:08:58,339 Indeed this is what we then call the indicator function of the unit interval.

    109 00:08:58,539 –> 00:09:02,194 It's written as a 1 with the set in the index.

    110 00:09:02,394 –> 00:09:06,954 The definition is very simple: we only have two values, either 1 or 0.

    111 00:09:07,154 –> 00:09:13,248 And the important thing is: for all elements of this set here, the unit interval, we are at 1.

    112 00:09:13,448 –> 00:09:15,733 And otherwise, we get 0.

    113 00:09:16,200 –> 00:09:21,373 Now in the case that the function is defined on the whole real number line, the graph looks like this.

    114 00:09:21,573 –> 00:09:24,614 So we have one jump at 0 and one at 1.

    115 00:09:25,600 –> 00:09:31,220 So you see, taking this as the density function, the sample space R would also work.
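The indicator function described in the cues above is easy to write down directly; here is a minimal sketch (assuming, as in the side note, that the sample space is taken to be all of R):

```python
def indicator_unit_interval(x):
    """Indicator function 1_[0,1]: the density of the uniform distribution
    on [0,1] when the sample space is all of R. Its graph jumps from 0 to 1
    at x = 0 and back down to 0 after x = 1."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0
```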

    116 00:09:31,257 –> 00:09:35,486 However, this is just a side note; let's get back to our topic of independence.

    117 00:09:35,686 –> 00:09:39,231 So let's take two independent events A and B.

    118 00:09:39,431 –> 00:09:43,357 Then we know by definition that the probabilities fulfill this equality.

    119 00:09:43,629 –> 00:09:47,257 Now let's use our probability measure and write the integral.

    120 00:09:47,586 –> 00:09:53,829 So we have the intersection of A and B as the region we integrate over, and the indicator function inside.

    121 00:09:54,029 –> 00:09:58,906 Here, a nice property of the integral that we can use is that we can exchange the sets.

    122 00:09:59,106 –> 00:10:02,766 So we have the indicator function of the set A intersected with B now

    123 00:10:02,786 –> 00:10:05,641 and [0,1] is the region we integrate over.

    124 00:10:06,343 –> 00:10:10,259 If you know the definition of the integral, you immediately see that this is correct.

    125 00:10:10,459 –> 00:10:14,007 For probability theory this is something we can use a lot.

    126 00:10:14,207 –> 00:10:17,481 For example, we can also use it here on the right-hand side.

    127 00:10:17,900 –> 00:10:21,369 So first, let's fill in the definition of the probabilities here.

    128 00:10:22,014 –> 00:10:25,152 And then, as before, we can simply exchange the roles.

    129 00:10:25,643 –> 00:10:29,157 So what we get here is an equality of integrals.

    130 00:10:29,473 –> 00:10:34,936 This integral here on the left-hand side with the intersection becomes a product of two integrals.

    131 00:10:35,136 –> 00:10:39,464 However, please recall we really need the independence to have this.

    132 00:10:39,900 –> 00:10:44,125 So with this you see independence can be very useful in calculations.
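For concreteness, here is a hypothetical pair of events in this model, A = [0, 1/2] and B = [1/4, 3/4] (these intervals are chosen for illustration and are not from the video): under the uniform distribution on [0,1], the probability of an interval is just its length, so the product rule can be checked directly.

```python
def interval_prob(lo, hi):
    """P([lo, hi]) under the uniform distribution on [0, 1]:
    the integral of the constant density 1 over [lo, hi], i.e. hi - lo."""
    assert 0.0 <= lo <= hi <= 1.0
    return hi - lo

def intersect(i1, i2):
    """Intersection of two closed intervals; an empty intersection
    is returned as a zero-length interval."""
    lo, hi = max(i1[0], i2[0]), min(i1[1], i2[1])
    return (lo, hi) if lo <= hi else (lo, lo)

A = (0.0, 0.5)    # hypothetical event A
B = (0.25, 0.75)  # hypothetical event B

p_A = interval_prob(*A)                   # 0.5
p_B = interval_prob(*B)                   # 0.5
p_AB = interval_prob(*intersect(A, B))    # length of [0.25, 0.5] = 0.25

# P(A ∩ B) = P(A) * P(B), so these two events happen to be independent.
assert p_AB == p_A * p_B
```

Note that this is a special feature of this particular pair: most pairs of intervals in [0,1] are not independent, exactly as the video emphasizes for the definition.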

    133 00:10:44,325 –> 00:10:47,204 And of course, we will use that in later videos.

    134 00:10:47,543 –> 00:10:52,243 Therefore, I really hope to see you in the next video about probability theory.

    135 00:10:52,443 –> 00:10:54,614 Have a nice day and bye!

  • Quiz Content

    Q1: Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $A,B \in \mathcal{A}$. What is the correct definition for independence of $A$ and $B$?

    A1: $$ \mathbb{P}(A \mid B) = \mathbb{P}(A) \cdot \mathbb{P}(B) $$

    A2: $$ \mathbb{P}(A \cup B) = \mathbb{P}(A) \cdot \mathbb{P}(B) $$

    A3: $$ \mathbb{P}(A \cap B) = \mathbb{P}(A) \cdot \mathbb{P}(B) $$

    A4: $$ \mathbb{P}(A \setminus B) = \mathbb{P}(A) \cdot \mathbb{P}(B) $$

    Q2: Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $B_1, B_2, B_3 \in \mathcal{A}$. What is the correct definition for independence of the family $(B_1, B_2, B_3)$?

    A1: $ \mathbb{P}(B_1 \cap B_2) = \mathbb{P}(B_1) \mathbb{P}(B_2) $ and $ \mathbb{P}(B_2 \cap B_3) = \mathbb{P}(B_2) \mathbb{P}(B_3) $ and $ \mathbb{P}(B_1 \cap B_3) = \mathbb{P}(B_1) \mathbb{P}(B_3) $

    A2: $\mathbb{P}(\bigcap_{j \in J} B_j) = \prod_{j \in J} \mathbb{P}(B_j)$ for all non-empty $J \subseteq \{1, 2, 3\}$.

    A3: $ \mathbb{P}(B_1 \cap B_2 \cap B_3) = \mathbb{P}(B_1) \mathbb{P}(B_2) \mathbb{P}(B_3) $

    Q3: Consider throwing a die two times in a row. Let $A$ be the event that the first throw shows a $\textbf{2}$ and let $B$ be the event that the sum of both throws is $7$. Are $A$ and $B$ independent?

    A1: Yes, they are independent.

    A2: No, they are not independent.

  • Last update: 2024-10

  • Back to overview page


Are you looking for another mathematical topic?