
Section 5.3 Probability Functions

In the formulas below, we will presume that we have a random variable X (Definition 5.2.1) which maps the sample space S onto some range of real numbers R. On this set, we can then define a probability function f(x) which acts on the numerical values in R and returns another real number. We do so in order to obtain (for discrete values) P(X = x) = f(x), where the event X = x collects every outcome s in S with X(s) = x. When X is one-to-one this reduces to P(s) = f(X(s)): the probability of a given outcome s is obtained by the composition which takes s to a numerical value x and then plugs x into f.

For example, consider a random variable which assigns a 1 when you roll a 1 on a six-sided die and 0 otherwise. Presuming each side is equally likely, f(1) = 1/6 and f(0) = 5/6.
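The die example can be checked by brute-force counting. The following is a minimal sketch, assuming equally likely outcomes; the names `sample_space`, `X`, and `f` are chosen here for illustration only.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed equally likely).
sample_space = [1, 2, 3, 4, 5, 6]

def X(s):
    """Random variable: 1 if the roll is a 1, otherwise 0."""
    return 1 if s == 1 else 0

def f(x):
    """Probability that X takes the value x, found by counting outcomes."""
    favorable = sum(1 for s in sample_space if X(s) == x)
    return Fraction(favorable, len(sample_space))

print(f(1), f(0))  # 1/6 5/6
```

Notice that f(0) collects the five outcomes 2 through 6, so f acts on the values of X rather than on individual rolls.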

Definition 5.3.1. Probability "Mass" Function.

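Given a discrete random variable X (Definition 5.2.1) on a space R, a probability mass function on X is given by a function f: R → ℝ such that:

\begin{align*} & \forall x \in R, \ f(x) \gt 0\\ & \sum_{x \in R} f(x) = 1\\ & A \subset R \Rightarrow P(X \in A) = \sum_{x \in A} f(x) \end{align*}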

For x∉R, you can use the convention f(x)=0.

Definition 5.3.2. Probability "Density" Function.

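Given a continuous random variable X on a space R, a probability density function on X is given by a function f: R → ℝ such that:

\begin{align*} & \forall x \in R, \ f(x) \ge 0\\ & \int_R f(x) dx = 1\\ & A \subset R \Rightarrow P(X \in A) = \int_A f(x) dx \end{align*}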

For x∉R, you can use the convention f(x)=0.

For the purposes of this book, we will use the term "Probability Function" to refer to either of these options.

Example 5.3.3. Discrete Probability Function.

Consider f(x)=x/10 over R = {1,2,3,4}. Then, f(x) is obviously positive for each of the values in R and certainly

\begin{equation*} \sum_{x \in R} f(x) = f(1) + f(2) + f(3) + f(4) = \frac{1}{10} + \frac{2}{10} + \frac{3}{10} + \frac{4}{10} = 1. \end{equation*}

Therefore, f(x) is a probability mass function over the space R.
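The two conditions checked in this example, along with the rule for P(X ∈ A), can be sketched in a few lines. This is only an illustration using exact fractions; the helper names `is_pmf` and `prob` are hypothetical and not part of the text.

```python
from fractions import Fraction

R = [1, 2, 3, 4]

def f(x):
    """Candidate probability mass function f(x) = x/10."""
    return Fraction(x, 10)

def is_pmf(f, R):
    """Check that f is positive on R and that its values sum to 1."""
    return all(f(x) > 0 for x in R) and sum(f(x) for x in R) == 1

def prob(f, A):
    """P(X in A) for a subset A of R, found by summing f over A."""
    return sum(f(x) for x in A)

print(is_pmf(f, R))     # True
print(prob(f, [2, 4]))  # 3/5
```

Exact fractions are used so that the sum equals 1 exactly rather than up to rounding.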

Example 5.3.4. Continuous Probability Function.

Consider f(x) = x^2/c for some positive real number c and presume R = [-1,2]. Then f(x) is nonnegative (and only equals zero at one point). To make f(x) a probability density function (Definition 5.3.2), we must have

\begin{equation*} \int_R f(x) dx = 1. \end{equation*}

In this instance you get

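\begin{equation*} 1 = \int_{-1}^{2} \frac{x^2}{c} dx = \left. \frac{x^3}{3c} \right|_{-1}^{2} = \frac{8}{3c} - \frac{-1}{3c} = \frac{3}{c} \end{equation*}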

Therefore, f(x) is a probability density function over R provided c=3.
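As a numerical cross-check of c = 3, you can approximate the integral of x^2 over [-1,2]. The sketch below uses a simple midpoint rule; the helper `integrate` and the number of subintervals are choices made here, not something prescribed by the text.

```python
def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# The normalizing constant c must equal the integral of x^2 over [-1, 2].
c = integrate(lambda x: x**2, -1.0, 2.0)

# With that c, the density x^2/c should integrate to approximately 1.
total = integrate(lambda x: x**2 / c, -1.0, 2.0)

print(round(c, 6), round(total, 6))
```

The same approach applies to densities whose antiderivative is not available in closed form.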

Definition 5.3.5. Distribution Function.

Given a random variable X on a space R, a probability distribution function on X is given by a function

\begin{equation*} F: \mathbb{R} \rightarrow \mathbb{R} \text{ such that } F(x) = P(X \le x). \end{equation*}

Example 5.3.6. Discrete Distribution Function.

Using f(x)=x/10 over R = {1,2,3,4} again, note that F(x) will only change at these four domain values. We get

Table 5.3.7. Discrete Distribution Function Example
x            F(x)
x < 1        0
1 ≤ x < 2    1/10
2 ≤ x < 3    3/10
3 ≤ x < 4    6/10
4 ≤ x        1
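The table can be reproduced by accumulating f over the values of R that do not exceed x. This is a minimal sketch with exact fractions; the test points in the loop are arbitrary choices, one from each row of the table plus a repeat.

```python
from fractions import Fraction

R = [1, 2, 3, 4]

def f(x):
    """Probability mass function f(x) = x/10 on R."""
    return Fraction(x, 10)

def F(x):
    """F(x) = P(X <= x): sum f over the values of R that are at most x."""
    return sum((f(u) for u in R if u <= x), Fraction(0))

# One sample point from each row of the table.
for x in [0.5, 1, 2.5, 3, 4, 10]:
    print(x, F(x))
```

Because F only changes at the four values in R, it is a step function. Note that Python prints 6/10 in lowest terms as 3/5.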

Example 5.3.8. Continuous Distribution Function.

Consider f(x) = x^2/3 over R = [-1,2]. Then, for −1 ≤ x ≤ 2,

\begin{equation*} F(x) = \int_{-1}^{x} \frac{u^2}{3} du = \frac{x^3}{9} + \frac{1}{9}. \end{equation*}

Notice, F(−1)=0 since nothing has yet been accumulated over values smaller than -1 and F(2)=1 since by that time everything has been accumulated. In summary:

Table 5.3.9. Continuous Distribution Function Example
x             F(x)
x < −1        0
−1 ≤ x < 2    x^3/9 + 1/9
2 ≤ x         1
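The piecewise description in the table translates directly into a function. The sketch below is one way to write it; the grid used to spot-check that F never decreases is an arbitrary choice.

```python
def F(x):
    """Distribution function for f(x) = x^2/3 on [-1, 2]."""
    if x < -1:
        return 0.0                 # nothing accumulated yet
    if x < 2:
        return x**3 / 9 + 1 / 9    # accumulated area from -1 to x
    return 1.0                     # everything accumulated

# Spot-check on a grid from -2 to 3 that F is nondecreasing.
grid = [-2 + 0.01 * i for i in range(501)]
values = [F(x) for x in grid]
nondecreasing = all(u <= v for u, v in zip(values, values[1:]))

print(F(-1), F(0), F(2), nondecreasing)
```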

To see that \(F(x) = 0\) below the range of \(X\text{,}\) let \(a = \inf(R)\text{.}\) Then, for \(x \lt a,\)

\begin{equation*} F(x) = P(X \le x) \le P(X \lt a) = 0 \end{equation*}

since none of the x-values in this range are in \(R\text{.}\)

To see that \(F(x) = 1\) at and above the top of the range of \(X\text{,}\) let \(b = \sup(R)\text{.}\) Then, for \(x \ge b,\)

\begin{equation*} F(x) = P(X \le x) = P(X \le b) + P( b \lt X \le x) = P(X \le b) = 1 \end{equation*}

since no values of \(X\) lie above \(b\) and therefore \(P(X \le b)\) will either sum over or integrate over all of \(R\text{.}\)

To see that \(F\) is nondecreasing, consider two cases.

Case 1: \(R\) discrete

\begin{align*} \forall x_1,x_2 \in \mathbb{Z} \ni x_1 \lt x_2\\ F(x_2) & = \sum_{x \le x_2} f(x) \\ & = \sum_{x \le x_1} f(x) + \sum_{x_1 \lt x \le x_2} f(x)\\ & \ge \sum_{x \le x_1} f(x) = F(x_1) \end{align*}

Case 2: \(R\) continuous

\begin{align*} \forall x_1,x_2 \in \mathbb{R} \ni x_1 \lt x_2\\ F(x_2) & = \int_{-\infty}^{x_2} f(x) dx \\ & = \int_{-\infty}^{x_1} f(x) dx + \int_{x_1}^{x_2} f(x) dx\\ & \ge \int_{-\infty}^{x_1} f(x) dx\\ & = F(x_1) \end{align*}

Assume \(x \in R\) for some discrete \(R\) whose values are consecutive integers. Then,

\begin{equation*} F(x) - F(x-1) = \sum_{u \le x} f(u) - \sum_{u \le x-1} f(u) = f(x) \end{equation*}
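For the integer-valued example f(x) = x/10 on R = {1,2,3,4}, this difference can be checked directly. The following is only a sketch under that assumption; the dictionary `jumps` is a name introduced here for illustration.

```python
from fractions import Fraction

R = [1, 2, 3, 4]

def f(x):
    """f(x) = x/10 on R, with the convention f(x) = 0 off R."""
    return Fraction(x, 10) if x in R else Fraction(0)

def F(x):
    """F(x) = P(X <= x)."""
    return sum((f(u) for u in R if u <= x), Fraction(0))

# The jump of F at each x in R should recover f(x).
jumps = {x: F(x) - F(x - 1) for x in R}

print(jumps)
```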

For a continuous random variable and any real numbers \(a \lt b\text{,}\) consider

\begin{align*} F(b) - F(a) & = \int_{-\infty}^b f(x) dx - \int_{-\infty}^a f(x) dx\\ & = \int_a^b f(x) dx \\ & = P(a \lt X \le b) \end{align*}
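This identity can be tested numerically on the earlier density f(x) = x^2/3 over [-1,2]. The sketch below compares F(b) − F(a) with a midpoint-rule integral; the endpoints a = 0 and b = 1 and the helper `integrate` are illustrative assumptions.

```python
def f(x):
    """Density x^2/3 on [-1, 2], zero elsewhere."""
    return x**2 / 3 if -1 <= x <= 2 else 0.0

def F(x):
    """Distribution function for f."""
    if x < -1:
        return 0.0
    if x < 2:
        return x**3 / 9 + 1 / 9
    return 1.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 1.0                    # illustrative endpoints
by_cdf = F(b) - F(a)               # F(1) - F(0) = 2/9 - 1/9
by_integral = integrate(f, a, b)   # integral of x^2/3 from 0 to 1

print(by_cdf, by_integral)         # both close to 1/9
```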

We will assume that \(F(x)\) is a continuous function. With that assumption, note

\begin{equation*} P(a-\epsilon \lt X \le a) = \int_{a-\epsilon}^a f(x) dx = F(a) - F(a-\epsilon) \end{equation*}

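Take the limit as \(\epsilon \rightarrow 0^+\) to get the result, noting that continuity of \(F\) gives \(F(a) - F(a-\epsilon) \rightarrow 0\) and therefore \(P(X = a) = 0\text{.}\)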

Assume \(X\) is continuous and \(f\) and \(F\) as above. Notice that, provided the limits exist, \(\lim_{x \rightarrow \pm \infty} f(x) = 0\) since otherwise the integral over the entire space could not be finite.

Now, let \(A(x)\) be any antiderivative of \(f(x)\text{.}\) Then, by the Fundamental Theorem of Calculus,

\begin{align*} F(x) & = \int_{-\infty}^x f(u) du\\ & = A(x) - \lim_{u \rightarrow -\infty} A(u) \end{align*}

The limit term is a constant that does not depend on \(x\text{,}\) so its derivative is zero. Hence, \(F'(x) = A'(x) = f(x)\) as desired.
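The relationship F'(x) = f(x) can be checked numerically for the earlier example with f(x) = x^2/3 and F(x) = x^3/9 + 1/9 on [-1,2]. The sketch below uses a central difference quotient; the step size `h` and the sample points are arbitrary choices inside the interval.

```python
def f(x):
    """Density x^2/3, used here only for points inside (-1, 2)."""
    return x**2 / 3

def F(x):
    """Distribution function x^3/9 + 1/9, valid for -1 <= x <= 2."""
    return x**3 / 9 + 1 / 9

def derivative(g, x, h=1e-5):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

points = [-0.5, 0.0, 0.75, 1.5]
errors = [abs(derivative(F, x) - f(x)) for x in points]

print(max(errors))  # should be very small
```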

Definition 5.3.17. Percentiles for Random Variables.

For 0<p<1, the 100pth percentile is the largest random variable value c that satisfies

F(c)=p.

For continuous random variables over an interval R=[a,b], you will solve for c in the equation

\begin{equation*} \int_a^c f(x) dx = p. \end{equation*}

For discrete random variables, it is unlikely that F(c) = p will hold exactly for any element c of R. In that case, take the smallest value c in R such that F(c) ≥ p.

The 50th percentile (as before) is also known as the median.
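For a discrete random variable, the rule "smallest c in R with F(c) ≥ p" amounts to a short search. The following is a minimal sketch using f(x) = x/10 on R = {1,2,3,4}; the function name `percentile` is hypothetical, and p is passed as an exact fraction to avoid floating-point comparisons.

```python
from fractions import Fraction

R = [1, 2, 3, 4]

def F(x):
    """F(x) = P(X <= x) for f(x) = x/10 on R."""
    return sum((Fraction(u, 10) for u in R if u <= x), Fraction(0))

def percentile(p):
    """Smallest c in R with F(c) >= p, for 0 < p < 1."""
    for c in sorted(R):
        if F(c) >= p:
            return c
    raise ValueError("p must be less than 1")

print(percentile(Fraction(1, 2)))   # 3, since F(2) = 3/10 and F(3) = 6/10
print(percentile(Fraction(1, 4)))   # 2
```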

Example 5.3.18. Continuous Percentile.

For our earlier example with f(x) = x^2/3 on R = [-1,2], the 50th percentile (i.e. the median) is found by starting with p = 0.5 and then solving

F(c)=0.5

or

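\begin{equation*} \frac{c^3}{9} + \frac{1}{9} = \frac{1}{2} \end{equation*}

or

\begin{equation*} c^3 + 1 = \frac{9}{2}. \end{equation*}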

After solving for c, you find

\begin{equation*} \text{median} = \sqrt[3]{7/2} \approx 1.518. \end{equation*}
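When F(c) = p cannot be solved by hand, the same percentile can be found numerically. The sketch below uses bisection, which works here because F is increasing on [-1,2]; the function name `percentile` and the iteration count are choices made for illustration.

```python
def F(x):
    """Distribution function for f(x) = x^2/3 on [-1, 2]."""
    if x < -1:
        return 0.0
    if x < 2:
        return x**3 / 9 + 1 / 9
    return 1.0

def percentile(p, lo=-1.0, hi=2.0, iterations=60):
    """Solve F(c) = p on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid    # the solution lies to the right of mid
        else:
            hi = mid    # the solution lies at or to the left of mid
    return (lo + hi) / 2

median = percentile(0.5)
print(round(median, 3))  # 1.518
```

The result agrees with the closed form, the cube root of 7/2.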

Example 5.3.19. Discrete Percentile.

TBA, using one of the table examples from above.