
Title: Discrete vs. Continuous Case

Series: Probability Theory

Bright video: https://youtu.be/XZHi51fB9A

Dark video: https://youtu.be/ogqDWFrk6hg

Timestamps
00:00 Intro
00:48 Introduction of cases
02:17 Sample Space (discrete case)
02:33 Sample Space (continuous case)
03:14 Sigma algebra (discrete case)
03:36 Sigma algebra (continuous case)
03:59 Probability measure (discrete case)
05:41 Probability measure (continuous case)
07:46 Example (discrete case)
08:44 Example (continuous case)
10:38 Outro
10:57 Endcard
Subtitle in English
1 00:00:00,314 –> 00:00:03,586 Hello and welcome back to probability theory
2 00:00:04,500 –> 00:00:08,114 and first I want to thank all the nice supporters on Steady and PayPal.
3 00:00:08,714 –> 00:00:14,543 And in today’s part 3 we will talk about two important cases that can occur in stochastic problems.
4 00:00:15,200 –> 00:00:21,740 For this please recall we have already introduced a lot of notions and we started with a general sample space omega.
5 00:00:22,571 –> 00:00:27,772 Then we can look at chosen subsets in this omega which form a sigma algebra
6 00:00:28,443 –> 00:00:31,474 and exactly these subsets we call events.
7 00:00:32,186 –> 00:00:36,266 Then in the next step we want to measure the probability of such an event
8 00:00:36,757 –> 00:00:40,800 and this leads us to a general probability measure we call “P”.
9 00:00:41,600 –> 00:00:48,371 Now I can tell you it’s possible to deal with these objects in this abstract sense, and then we get a general theory.
10 00:00:48,900 –> 00:00:54,628 However in a lot of applications we find that two special cases are very important here.
11 00:00:55,214 –> 00:00:59,886 So now we distinguish between the discrete case and the continuous case
12 00:01:00,600 –> 00:01:05,400 To be more precise, I would speak of the absolutely continuous case.
13 00:01:06,143 –> 00:01:12,814 All the other possibilities that don’t fit into these two boxes we will ignore, at least for this video.
14 00:01:13,329 –> 00:01:18,671 Indeed, at the start of probability theory one often focuses on discrete problems.
15 00:01:19,300 –> 00:01:23,029 They are easy to explain when we only have finitely many outcomes.
16 00:01:23,743 –> 00:01:29,057 However, I would also say that we have a discrete problem when we have infinitely many outcomes,
17 00:01:29,257 –> 00:01:32,700 if they are still countable, like the natural numbers.
18 00:01:33,471 –> 00:01:41,471 For example, we could throw a die again and again and count how many throws we need to get the first six.
19 00:01:42,057 –> 00:01:46,943 Then we are still in the discrete case, because we can count all the possible outcomes.
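
The "first six" example above can be sketched in code. This is only an illustration under one assumption that the video does not state explicitly: the die is fair. In that case the outcome k (the first six appears on throw k) has probability (5/6)^(k-1) * (1/6). The function names `first_six_pmf` and `partial_sum` are made up for this sketch.

```python
from fractions import Fraction


def first_six_pmf(k: int) -> Fraction:
    """Probability that the first six appears on throw k (fair die assumed)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # k - 1 throws without a six, followed by one throw with a six.
    return Fraction(5, 6) ** (k - 1) * Fraction(1, 6)


def partial_sum(n: int) -> Fraction:
    """Sum of the probabilities of the outcomes 1, 2, ..., n."""
    return sum((first_six_pmf(k) for k in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    # The outcomes can be listed one after another (they are countable),
    # and the partial sums of their probabilities approach 1.
    for n in (1, 5, 20, 50):
        print(n, float(partial_sum(n)))
```

Every possible outcome appears at some finite position in the list 1, 2, 3, ..., which is exactly what "countable" means here.
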
20 00:01:47,567 –> 00:01:54,200 On the other hand in the absolutely continuous case we have infinitely many outcomes, but they are uncountable.
21 00:01:54,667 –> 00:01:58,733 The typical example here would be a dart board where you throw a dart.
22 00:01:59,500 –> 00:02:03,083 There all the points in the disk are possible outcomes.
23 00:02:03,533 –> 00:02:09,999 Okay, so that’s the rough idea for the two special cases here, and now I would say let’s go into the details.
24 00:02:10,950 –> 00:02:16,567 For this let’s do a table such that we can compare the discrete case and the absolutely continuous case.
25 00:02:17,133 –> 00:02:19,617 First let’s start with the sample space omega,
26 00:02:19,650 –> 00:02:23,017 which is a finite or countably infinite set in the discrete case.
27 00:02:23,500 –> 00:02:27,650 For example, if you flip a coin, omega would be a set with two elements.
28 00:02:28,267 –> 00:02:32,300 Or, to give an infinite example, omega could be the natural numbers.
29 00:02:33,050 –> 00:02:38,150 On the other hand, for our continuous case the sample space omega should be an uncountable set,
30 00:02:38,717 –> 00:02:41,983 and usually we can choose it as a subset of “R^n”.
31 00:02:43,100 –> 00:02:46,850 To be even more precise, omega should be a so-called Borel set.
32 00:02:47,833 –> 00:02:51,467 So it’s an element of the Borel sigma algebra of “R^n”.
33 00:02:52,033 –> 00:02:56,717 If you don’t know what the Borel sigma algebra is, don’t worry. I have a whole video about it.
34 00:02:57,317 –> 00:03:01,617 However at least for the sake of this video it’s not the most important thing to know here.
35 00:03:02,450 –> 00:03:06,233 Just think of a common example. Omega could be the unit interval.
36 00:03:06,900 –> 00:03:09,317 I choose it as closed here, but it could also be open.
37 00:03:10,017 –> 00:03:13,417 Ok, that is what you should know about possible sample spaces.
38 00:03:14,583 –> 00:03:17,650 Then in the next step let’s talk about the sigma algebras.
39 00:03:18,583 –> 00:03:22,883 In the discrete case it’s very simple. You can just take the whole power set of omega.
40 00:03:23,333 –> 00:03:30,200 Of course, depending on the problem you could choose a smaller one, but nothing prevents us from choosing the largest one, the power set.
41 00:03:30,850 –> 00:03:35,467 Therefore in the discrete case we don’t have to care about the sigma algebra at all.
42 00:03:36,217 –> 00:03:41,683 However, we really need the notion of a sigma algebra on the right-hand side, in the continuous case.
43 00:03:42,233 –> 00:03:48,533 There in general it’s not possible to choose the power set, but it’s always possible to take the Borel sigma algebra.
44 00:03:49,400 –> 00:03:53,967 This means that we don’t have all the subsets of omega, but still a lot of them.
45 00:03:54,550 –> 00:03:59,267 Therefore our probability measure can still give probabilities to a lot of events.
46 00:03:59,850 –> 00:04:04,517 Speaking of probability measures this is the next thing we want to compare in both cases.
47 00:04:05,183 –> 00:04:10,783 In the discrete case measuring a singleton, so a set with only one element, is very useful,
48 00:04:11,433 –> 00:04:18,750 because if you know these numbers for all lowercase omega in the sample space omega you know the whole probability measure.
49 00:04:19,733 –> 00:04:23,583 This follows immediately from the sigma additivity of the probability measure.
50 00:04:24,550 –> 00:04:28,667 As a reminder it’s this property here we discussed in the last video.
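
The formula shown on screen at this point is not part of the subtitles. As a reminder, the standard statement of sigma additivity, and the consequence for singletons used above, can be written as follows (the index j and the sets A_j are notation chosen for this sketch):

```latex
% Sigma additivity: for pairwise disjoint events A_1, A_2, ...
P\Big( \bigcup_{j=1}^{\infty} A_j \Big) = \sum_{j=1}^{\infty} P(A_j),
\qquad A_i \cap A_j = \emptyset \ \text{ for } i \neq j .

% Consequence in the discrete case: every event A is a countable
% disjoint union of its singletons, so
P(A) = \sum_{\omega \in A} P(\{\omega\}) .
```
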
51 00:04:29,450 –> 00:04:38,117 Now because of this property in the discrete case instead of the probability measure we can equivalently write down a probability mass function.
52 00:04:38,868 –> 00:04:43,750 Usually one uses a lowercase “p” for this, with omega written as an index.
53 00:04:44,417 –> 00:04:50,500 Of course in the end it should have the same meaning as this probability, but now this function is our starting point.
54 00:04:51,200 –> 00:04:54,467 I say function, but usually we write it as a sequence.
55 00:04:54,900 –> 00:04:59,322 Depending on the sample space omega, it is a finite sequence or a countably infinite one.
56 00:05:00,044 –> 00:05:05,383 Now because we want to use this for probabilities, we require that each of these numbers is nonnegative,
57 00:05:06,250 –> 00:05:10,867 and also that the sum or series over all omegas is exactly one.
58 00:05:11,567 –> 00:05:16,899 If we have such a sequence that fulfills these two properties we call it a probability mass function
59 00:05:17,517 –> 00:05:20,500 and with this we can define the probability measure.
60 00:05:21,367 –> 00:05:29,917 For any event “A” we can set P(A) to be the sum or series over “p_omega”, where omega runs through all the elements in “A”.
61 00:05:30,650 –> 00:05:37,383 There you see the big advantage in the discrete case. We just have countably many numbers involved and also only sums.
62 00:05:38,367 –> 00:05:41,100 So each probability can be written as such a sum.
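
Written out as formulas, the discrete case described above reads as follows. This is a transcription of the spoken definitions into LaTeX, not a copy of what is shown in the video:

```latex
% Probability mass function (p_\omega)_{\omega \in \Omega} on a countable \Omega:
p_\omega \ge 0 \ \text{ for all } \omega \in \Omega ,
\qquad
\sum_{\omega \in \Omega} p_\omega = 1 .

% Probability measure defined from it, for any event A \subseteq \Omega:
P(A) := \sum_{\omega \in A} p_\omega .
```
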
63 00:05:41,817 –> 00:05:45,617 On the other hand, that is not possible in the absolutely continuous case.
64 00:05:46,133 –> 00:05:49,609 However for a probability measure there we have something similar,
65 00:05:50,500 –> 00:05:56,383 because like in the example with the dartboard the probability of a single point is just zero.
66 00:05:57,083 –> 00:06:01,383 There are just too many points to get a nonzero probability for a single one.
67 00:06:02,217 –> 00:06:05,633 However we can say something about the density of the probabilities
68 00:06:06,450 –> 00:06:12,667 or to put it in other words it’s no problem at all to measure the probability of a whole region instead of a single point.
69 00:06:13,617 –> 00:06:18,767 Now this density function we simply call “f” and it’s defined on the whole sample space omega
70 00:06:19,383 –> 00:06:24,517 and because we want to measure probabilities we have the same two properties as on the left hand side.
71 00:06:25,350 –> 00:06:29,067 The first one is at each point we have a nonnegative number
72 00:06:29,533 –> 00:06:35,133 and you see, because in this case omega is a subset of “R^n”, we usually use the letter “x”,
73 00:06:36,017 –> 00:06:38,767 but still, here “x” is an element of omega.
74 00:06:39,550 –> 00:06:44,633 Ok, now for the second property we have to translate this sum to the continuous case.
75 00:06:45,150 –> 00:06:49,900 This means that here we want the integral of the function “f” to be equal to 1.
76 00:06:50,733 –> 00:06:54,167 It’s the integral where we integrate over the whole domain omega.
77 00:06:55,083 –> 00:07:01,400 Please note that in our simple example it would be a one-dimensional integral, but in general we have an n-dimensional integral,
78 00:07:02,100 –> 00:07:05,950 but then you see it’s completely analogous to what we required in the discrete case.
79 00:07:06,683 –> 00:07:12,883 Therefore the probability measure can be defined in the same way: P(A) is the integral of “f” over the domain “A”.
80 00:07:13,950 –> 00:07:17,760 Ok and with this you see this is our translation between both cases.
81 00:07:18,633 –> 00:07:25,533 However, to be honest, I omitted one technical detail here on the right-hand side, because here we have to deal with sigma algebras.
82 00:07:26,150 –> 00:07:30,550 For this reason this density function “f” here needs to be measurable.
83 00:07:31,533 –> 00:07:35,050 It’s a property we need such that all the integrals here make sense.
84 00:07:35,783 –> 00:07:40,217 If you don’t know this term measurable yet, don’t worry, we will talk about it later.
85 00:07:41,083 –> 00:07:45,467 At the moment I would say it’s sufficient that you know that we need some technical detail here.
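
For comparison with the discrete formulas, the absolutely continuous case described above can be summarized like this. It is a transcription of the spoken definitions into LaTeX; the symbol \mathcal{B}(\Omega) for the Borel sigma algebra on omega is notation assumed for this sketch:

```latex
% Probability density function f : \Omega \to \mathbb{R}, with \Omega \subseteq \mathbb{R}^n a Borel set
% and f measurable (the technical detail mentioned above):
f(x) \ge 0 \ \text{ for all } x \in \Omega ,
\qquad
\int_{\Omega} f(x) \, dx = 1 .

% Probability measure defined from it, for any event A \in \mathcal{B}(\Omega):
P(A) := \int_{A} f(x) \, dx .
```
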
86 00:07:46,233 –> 00:07:51,333 Ok, I think that’s enough for the theory. Let’s talk about some practical examples.
87 00:07:52,317 –> 00:07:56,333 In the discrete case let’s look again at the example of throwing one die.
88 00:07:57,100 –> 00:08:00,133 However maybe this time let’s take an unfair die.
89 00:08:00,983 –> 00:08:05,083 This means that we have different probabilities in the probability mass function.
90 00:08:05,683 –> 00:08:10,917 So we can bring different nonnegative probabilities in, but they have to sum up to 1.
91 00:08:11,800 –> 00:08:18,950 For example, we could set the probability of each of the numbers one to five to one tenth, and then the probability of getting a six would be one half.
92 00:08:19,700 –> 00:08:25,550 So maybe as a test let’s calculate the probability of the event that we don’t get a 6.
93 00:08:26,367 –> 00:08:34,683 Now by definition this would be the sum over omega going from 1 to 5 of our “p_omega”,
94 00:08:35,400 –> 00:08:43,883 which in our case is exactly 1 over 10 for all omegas involved here, and there you see we get one half, as expected.
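
The unfair die calculation above can be reproduced in a few lines. This is a minimal sketch using exact fractions; the names `pmf` and `probability` are chosen for illustration only.

```python
from fractions import Fraction

# Probability mass function of the unfair die from the example:
# faces 1 to 5 get 1/10 each, face 6 gets 1/2.
pmf = {face: Fraction(1, 10) for face in range(1, 6)}
pmf[6] = Fraction(1, 2)


def probability(event, pmf):
    """P(A) as the sum of p_omega over all omega in the event A."""
    return sum((pmf[omega] for omega in event), Fraction(0))


# The two defining properties of a probability mass function.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

# Event "we don't get a 6".
not_six = {1, 2, 3, 4, 5}
print(probability(not_six, pmf))  # prints 1/2
```
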
95 00:08:44,950 –> 00:08:48,450 Ok, then on the other side let’s look at a continuous example.
96 00:08:49,133 –> 00:08:52,633 Ok, maybe let’s take the interval from 0 to 2 here.
97 00:08:53,350 –> 00:08:56,850 So you see our dart board from before is one-dimensional now.
98 00:08:57,533 –> 00:09:00,850 So we randomly throw one point into this interval.
99 00:09:01,567 –> 00:09:05,617 Then the question would be what is now our probability density function “f”?
100 00:09:06,583 –> 00:09:11,817 Now because we want to have a uniform probability here, we need to take a constant function
101 00:09:12,483 –> 00:09:17,333 and because we want to fulfill these two properties there is only one reasonable choice.
102 00:09:18,233 –> 00:09:21,150 We take the constant function with value one half.
103 00:09:21,967 –> 00:09:29,167 So maybe let’s check that the integral property is indeed fulfilled. So we have the integral from 0 to 2 of f(x).
104 00:09:30,083 –> 00:09:35,583 Now we can pull the constant one half outside and then only the length of the interval remains,
105 00:09:36,217 –> 00:09:39,100 which is of course 2, so that we get 1.
106 00:09:39,917 –> 00:09:45,233 Ok, then maybe the last question here is can you calculate the probability of a subset “A” here?
107 00:09:45,950 –> 00:09:50,967 As we know the probability is defined as the integral where we integrate over the set “A” here.
108 00:09:51,833 –> 00:09:56,883 Doing the same as before, we can pull out the constant, and only the integral of 1 over “A” remains.
109 00:09:58,017 –> 00:10:05,667 Now this one is what we call in general the volume of “A” or, in this case, because it’s only one-dimensional, the length of “A”,
110 00:10:06,700 –> 00:10:09,842 or we could simply say it’s the Lebesgue measure of “A”.
111 00:10:11,383 –> 00:10:16,805 This sounds now more complicated than it really is, because for intervals we can immediately calculate it.
112 00:10:17,633 –> 00:10:24,763 For example, if “A” is an interval [a, b] inside our sample space, we can calculate that the probability is 0.5*(b - a).
113 00:10:25,317 –> 00:10:30,784 So you could say in this example we just calculate lengths, but then we have to normalize them,
114 00:10:31,400 –> 00:10:34,309 which means we divide by the full length, which is 2,
115 00:10:35,133 –> 00:10:37,142 and indeed then we get a probability.
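
The uniform example on the interval from 0 to 2 can also be checked numerically. The sketch below compares the closed formula for intervals with a simple midpoint-rule approximation of the integral. The function names and the midpoint rule are choices made for this illustration, not something taken from the video.

```python
def density(x: float) -> float:
    """Uniform density on the sample space [0, 2]: constant value one half."""
    return 0.5 if 0.0 <= x <= 2.0 else 0.0


def interval_probability(a: float, b: float) -> float:
    """P([a, b]) = 0.5 * (b - a) for an interval with 0 <= a <= b <= 2."""
    if not (0.0 <= a <= b <= 2.0):
        raise ValueError("expected 0 <= a <= b <= 2")
    return 0.5 * (b - a)


def midpoint_integral(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    # Integral property: the density integrates to (approximately) 1.
    print(midpoint_integral(density, 0.0, 2.0))
    # Probability of the interval [0.5, 1.5], closed formula vs. numeric integral.
    print(interval_probability(0.5, 1.5))
    print(midpoint_integral(density, 0.5, 1.5))
```

For a constant density the numeric integral is not really needed, but the same `midpoint_integral` helper would also work for a non-constant density on an interval.
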
116 00:10:38,083 –> 00:10:43,340 Ok, with this I hope you now know how to distinguish between discrete and continuous cases.
117 00:10:44,250 –> 00:10:48,185 They are indeed the typical examples that occur. Therefore it’s good to know them.
118 00:10:48,800 –> 00:10:52,479 Then in the next videos we will look at more complicated examples.
119 00:10:52,983 –> 00:10:56,333 Therefore I hope I see you there and have a nice day. Bye!