Section 5.3 Probability Functions

In the formulas below, we will presume that we have a random variable (see 5.2.1) $X$ which maps the sample space $S$ onto some range of real numbers $R\text{.}$ On this set, we can then define a probability function $f(x)$ which acts on the numerical values in $R$ and returns another real number. We do so in order to obtain (for discrete values) $P(\text{sample space value } s) = f(X(s))\text{.}$ That is, the probability of a given outcome $s$ is equal to the composition which takes $s$ to a numerical value $x = X(s)$ and then plugs $x$ into $f$ to get the same probability.

For example, consider a random variable which assigns a 1 when you roll a 1 on a six-sided die and 0 otherwise. Presuming each side is equally likely, $f(1) = \frac{1}{6}$ and $f(0) = \frac{5}{6}\text{.}$

Definition 5.3.1. Probability "Mass" Function.

Given a discrete random variable (see 5.2.1) $X$ on a space $R\text{,}$ a probability mass function on $X$ is given by a function $f:R \rightarrow \mathbb{R}$ such that:

$\begin{align*} & \forall x \in R , f(x) \gt 0\\ & \sum_{x \in R} f(x) = 1\\ & A \subset R \Rightarrow P(X \in A) = \sum_{x \in A}f(x) \end{align*}$

For $x \not\in R\text{,}$ you can use the convention $f(x)=0\text{.}$
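
For a finite space, the three conditions in this definition can be checked mechanically. The following is a minimal Python sketch rather than anything from the original text: the helper names `is_pmf` and `prob` are hypothetical, the pmf is assumed to be stored as a dictionary from values in $R$ to probabilities, and exact fractions are used so the sum can be compared with 1 without rounding error. It reuses the die example from above.

```python
from fractions import Fraction

def is_pmf(f):
    """Check the first two conditions: f(x) > 0 on R and the values sum to 1.

    f is a dict mapping each x in R to f(x) (a hypothetical representation).
    """
    return all(p > 0 for p in f.values()) and sum(f.values()) == 1

def prob(f, A):
    """Third condition: P(X in A) is the sum of f(x) over x in A.

    Values outside R contribute 0, matching the convention f(x) = 0 there.
    """
    return sum(f.get(x, 0) for x in A)

# Die example: X = 1 when a 1 is rolled and X = 0 otherwise.
die = {1: Fraction(1, 6), 0: Fraction(5, 6)}

print(is_pmf(die))        # True
print(prob(die, {1}))     # 1/6
print(prob(die, {0, 1}))  # 1
```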

Definition 5.3.2. Probability "Density" Function.

Given a continuous random variable $X$ on a space $R\text{,}$ a probability density function on $X$ is given by a function $f:R \rightarrow \mathbb{R}$ such that:

$\begin{align*} & \forall x \in R , f(x) \ge 0\\ & \int_{R} f(x) dx = 1\\ & A \subset R \Rightarrow P(X \in A) = \int_{A} f(x) dx \end{align*}$

For $x \not\in R\text{,}$ you can use the convention $f(x)=0\text{.}$

For the purposes of this book, we will use the term "Probability Function" to refer to either of these options.

Example 5.3.3. Discrete Probability Function.

Consider $f(x) = x/10$ over $R = \{1,2,3,4\}\text{.}$ Then $f(x)$ is positive for each of the values in $R$ and

$\begin{equation*} \sum_{x \in R} f(x) = f(1) + f(2) + f(3) + f(4) = 1/10 + 2/10 + 3/10 + 4/10 = 1. \end{equation*}$

Therefore, f(x) is a probability mass function over the space $R\text{.}$

# Interactive cell: plot a user-supplied pmf with its mean and standard
# deviation, then compare it against a random sample drawn from it.
@interact
def _(D = input_box([1,2,3,5,6,8,9,11,12,14],
                         label="[Domain] :",width=60), 
       Probs = input_box([1/20,1/20,1/20,3/20,1/20,4/20,4/20,1/20,1/20,3/20],
                         label=" $$[f(x)] :$$",width=60),
       n_samples=slider(100,10000,100,100,label="$$ \\text{# of samples:}$$")):
    n = len(D)

    # Validate the input before plotting or sampling anything
    if n != len(Probs):
        print("Domain and Probabilities Probs must be lists of the same size")
    elif sum(Probs) != 1:
        print("f(x) values do not sum to 1")
    else:
        # Theoretical pmf: stems, points, mean, and one standard deviation
        G = Graphics()
        meanf = 0
        variancef = 0
        for k in range(n):
            meanf += D[k]*Probs[k]
            variancef += D[k]^2*Probs[k]
            G += line([(D[k],0),(D[k],Probs[k])],color='green')
        variancef = variancef - meanf^2    # Var(X) = E[X^2] - (E[X])^2
        sd = sqrt(variancef)
        G += points(list(zip(D,Probs)),color='blue',size=50)
        G += point((meanf,0),color='yellow',size=60,zorder=3)
        G += line([(meanf-sd,0),(meanf+sd,0)],color='red',thickness=5)

        pretty_print("     mean = %s"%str(meanf))
        pretty_print(" variance = %s"%str(variancef))

        # Sample from the distribution and see how a random sampling matches up
        counts = [0] * n
        X = GeneralDiscreteDistribution(Probs)
        sample = []

        for _ in range(n_samples):
            elem = X.get_random_element()    # index into D
            sample.append(D[elem])
            counts[elem] += 1
        Empirical = [1.0*x/n_samples for x in counts]    # relative frequencies

        samplemean = mean(sample)
        samplevariance = variance(sample)
        sampdev = sqrt(samplevariance)

        E = points(list(zip(D,Empirical)),color='orange',size=40)
        E += point((samplemean,0.005),color='brown',size=60,zorder=3)
        E += line([(samplemean-sampdev,0.005),(samplemean+sampdev,0.005)],
                  color='orange',thickness=5)
        (G+E).show(ymin=0,figsize=(5,4))

Example 5.3.4. Continuous Probability Function.

Consider $f(x) = x^2/c$ for some positive real number $c$ and presume $R = [-1,2]\text{.}$ Then $f(x)$ is nonnegative (and only equals zero at one point). To make $f(x)$ a probability density function (Definition 5.3.2), we must have

$\begin{equation*} \int_{R} f(x) dx = 1. \end{equation*}$

In this instance you get

$\begin{equation*} 1 = \int_{-1}^2 x^2/c \, dx = x^3/(3c) |_{-1}^2 = \frac{8}{3c} - \frac{-1}{3c} = \frac{3}{c}. \end{equation*}$

Therefore, $f(x)$ is a probability density function over $R$ provided $c = 3\text{.}$
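
The value $c = 3$ can also be checked numerically. The sketch below is an illustration only: the helper `integrate` is a hypothetical name for a plain midpoint-rule approximation, not a routine taken from the text, and its step count is an arbitrary choice.

```python
def integrate(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# The integral of x^2 over R = [-1, 2] is the value c must take.
unnormalized = integrate(lambda x: x**2, -1, 2)
print(round(unnormalized, 6))  # 3.0

# With c = 3 the density integrates to 1 over R.
c = 3
total = integrate(lambda x: x**2 / c, -1, 2)
print(round(total, 6))  # 1.0
```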

Definition 5.3.5. Distribution Function.

Given a random variable $X$ on a space $R\text{,}$ a probability distribution function on $X$ is given by a function

$\begin{equation*} F:\mathbb{R} \rightarrow \mathbb{R} \ni \displaystyle F(x)=P(X \le x). \end{equation*}$

Example 5.3.6. Discrete Distribution Function.

Using $f(x) = x/10$ over $R = \{1,2,3,4\}$ again, note that $F(x)$ will only change at these four domain values. We get

Table 5.3.7. Discrete Distribution Function Example

$x$	$F(x)$
$x \lt 1$	0
$1 \le x \lt 2$	1/10
$2 \le x \lt 3$	3/10
$3 \le x \lt 4$	6/10
$4 \le x$	1
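
The table can be reproduced by accumulating the pmf values. This is a small Python sketch under the assumption that $R$ is stored as a sorted list; the variable `cumulative` and the lookup via `bisect_right` are illustrative choices, not part of the original example.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

R = [1, 2, 3, 4]                     # sorted values of the space
f = [Fraction(x, 10) for x in R]     # f(x) = x/10
cumulative = list(accumulate(f))     # running totals F(1), F(2), F(3), F(4)

def F(x):
    """F(x) = P(X <= x): total mass at values of R not exceeding x."""
    k = bisect_right(R, x)           # number of values in R that are <= x
    return cumulative[k - 1] if k > 0 else Fraction(0)

# F is a step function: it only changes at the four values in R.
for x in [0.5, 1, 2.7, 3, 4, 10]:
    print(x, F(x))
```

Note that Python prints the fraction 6/10 in lowest terms, as 3/5.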

Example 5.3.8. Continuous Distribution Function.

Consider $f(x) = x^2/3$ over $R = [-1,2]\text{.}$ Then, for $-1 \le x \le 2\text{,}$

$\begin{equation*} F(x) = \int_{-1}^x u^2/3 du = x^3/9 + 1/9. \end{equation*}$

Notice, $F(-1) = 0$ since nothing has yet been accumulated over values smaller than -1 and $F(2) = 1$ since by that time everything has been accumulated. In summary:

Table 5.3.9. Continuous Distribution Function Example

$x$	$F(x)$
$x \lt -1$	0
$-1 \le x \lt 2$	$x^3/9 + 1/9$
$2 \le x$	1
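
As a sanity check on this table, the piecewise formula can be compared with a direct numerical accumulation of the density. The sketch below is illustrative only: `integrate` is a hypothetical midpoint-rule helper and the test points are arbitrary.

```python
def f(x):
    """Density x^2/3 on R = [-1, 2]; zero outside R by convention."""
    return x**2 / 3 if -1 <= x <= 2 else 0.0

def F(x):
    """Piecewise distribution function from the table."""
    if x < -1:
        return 0.0
    if x < 2:
        return x**3 / 9 + 1 / 9
    return 1.0

def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

for x in [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]:
    # Accumulate f from the left edge of R; clamp the upper limit to R
    # because f is zero outside of it.
    upper = min(max(x, -1.0), 2.0)
    print(x, F(x), round(integrate(f, -1.0, upper), 6))
```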

Theorem 5.3.10.

$F(x)=0, \forall x \lt \inf(R)\text{,}$ where $\inf(R)$ is the infimum of $R\text{:}$ its greatest lower bound, that is, the "minimum" in a limiting sense.

Proof.

Let a = inf($R$). Then, for $x \lt a,$

\begin{equation*} F(x) = P(X \le x) \le P(X \lt a) = 0 \end{equation*}

since none of the x-values in this range are in $R\text{.}$

Theorem 5.3.11.

$F(x)=1, \forall x \ge \sup(R)\text{,}$ where $\sup(R)$ is the supremum of $R\text{:}$ its least upper bound, that is, the "maximum" in a limiting sense.

Proof.

Let b = sup($R$). Then, for $x \ge b,$

\begin{equation*} F(x) = P(X \le x) = P(X \le b) + P( b \lt X \le x) = P(X \le b) = 1 \end{equation*}

since no values of $X$ lie above $b\text{,}$ so that $P(b \lt X \le x) = 0\text{,}$ and $P(X \le b)$ sums or integrates $f$ over all of $R\text{.}$

Theorem 5.3.12.

$F$ is non-decreasing.

Proof.

Case 1: $R$ discrete

\begin{align*} \forall x_1,x_2 \in \mathbb{Z} \ni x_1 \lt x_2\\ F(x_2) & = \sum_{x \le x_2} f(x) \\ & = \sum_{x \le x_1} f(x) + \sum_{x_1 \lt x \le x_2} f(x)\\ & \ge \sum_{x \le x_1} f(x) = F(x_1) \end{align*}

Case 2: $R$ continuous

\begin{align*} \forall x_1,x_2 \in \mathbb{R} \ni x_1 \lt x_2\\ F(x_2) & = \int_{-\infty}^{x_2} f(x) dx \\ & = \int_{-\infty}^{x_1} f(x) dx + \int_{x_1}^{x_2} f(x) dx\\ & \ge \int_{-\infty}^{x_1} f(x) dx\\ & = F(x_1) \end{align*}

Theorem 5.3.13. Using Discrete Distribution Function to compute probabilities.

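For $x \in R\text{,}$ where $R$ is a discrete space of integers, $f(x) = F(x) - F(x-1)\text{.}$

Proof.

Assume $x \in R$ for some discrete $R$ consisting of integers, so that no value of $R$ lies strictly between $x-1$ and $x\text{.}$ Then,

\begin{equation*} F(x) - F(x-1) = \sum_{u \le x} f(u) - \sum_{u \le x-1} f(u) = f(x) \end{equation*}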
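
The difference formula can be tried on the running example $f(x) = x/10$ over $R = \{1,2,3,4\}\text{.}$ This is a minimal Python sketch; the dictionary `recovered` is a hypothetical name, and $F$ is computed directly from its definition as a sum.

```python
from fractions import Fraction

R = [1, 2, 3, 4]

def F(x):
    """Distribution function of f(x) = x/10 on R = {1, 2, 3, 4}."""
    return sum(Fraction(u, 10) for u in R if u <= x)

# Recover the pmf from successive differences of F.
recovered = {x: F(x) - F(x - 1) for x in R}

for x, p in recovered.items():
    print(x, p)
```

Each recovered value agrees with $x/10\text{,}$ printed in lowest terms.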

Theorem 5.3.14. Using Continuous Distribution function to compute probabilities.

For $a, b \in R$ with $a \lt b\text{,}$ $P(a \lt X \le b) = F(b) - F(a)\text{.}$

Proof.

For a and b as noted, consider

\begin{align*} F(b) - F(a) & = \int_{-\infty}^b f(x) dx - \int_{-\infty}^a f(x) dx\\ & = \int_a^b f(x) dx \\ & = P(a \lt X \le b) \end{align*}
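
To see the theorem in use, take the distribution function $F(x) = x^3/9 + 1/9$ on $R = [-1,2]$ from the earlier continuous example. The sketch below uses exact fractions; `prob_between` is a hypothetical helper name introduced only for this illustration.

```python
from fractions import Fraction

def F(x):
    """Distribution function for f(x) = x^2/3 on R = [-1, 2]."""
    x = Fraction(x)
    if x < -1:
        return Fraction(0)
    if x >= 2:
        return Fraction(1)
    return (x**3 + 1) / 9

def prob_between(a, b):
    """P(a < X <= b) computed as F(b) - F(a)."""
    return F(b) - F(a)

print(prob_between(0, 1))    # 1/9
print(prob_between(1, 2))    # 7/9
print(prob_between(-1, 2))   # 1
```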

Corollary 5.3.15.

For continuous distributions, $P(X = a) = 0\text{.}$

Proof.

We will assume that $F(x)$ is a continuous function. With that assumption, note

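\begin{equation*} P(a-\epsilon \lt X \le a) = \int_{a-\epsilon}^a f(x) dx = F(a) - F(a-\epsilon) \end{equation*}

Take the limit as $\epsilon \rightarrow 0^+$ to get the result, noting that $0 \le P(X = a) \le P(a-\epsilon \lt X \le a)$ for every $\epsilon \gt 0$ and that $F(a) - F(a-\epsilon) \rightarrow 0$ because $F$ is continuous.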

Theorem 5.3.16. $F(x)$ vs $f(x)\text{,}$ for continuous distributions.

If $X$ is a continuous random variable, $f$ the corresponding probability function, and $F$ the associated distribution function, then

$\begin{equation*} f(x) = F'(x) \end{equation*}$

Proof.

Assume $X$ is continuous and $f$ and $F$ as above. Notice, by the definition of $f\text{,}$ $\lim_{x \rightarrow \pm \infty} f(x) = 0$ since otherwise the integral over the entire space could not be finite.

Now, let $A(x)$ be any antiderivative of $f(x)\text{.}$ Then, by the Fundamental Theorem of Calculus,

\begin{align*} F(x) & = \int_{-\infty}^x f(u) du\\ & = A(x) - \lim_{u \rightarrow -\infty} A(u) \end{align*}

Since $\lim_{u \rightarrow -\infty} A(u)$ is a constant that does not depend on $x\text{,}$ its derivative is zero. Hence, $F'(x) = A'(x) - 0 = f(x)$ as desired.
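
The relationship $f(x) = F'(x)$ can be observed numerically for the running example on $R = [-1,2]\text{.}$ The following Python sketch approximates $F'$ with a central difference; the helper `derivative` and its step size `h` are assumptions made for illustration, and the sample points are chosen inside $R\text{.}$

```python
def F(x):
    """Distribution function for the density x^2/3 on R = [-1, 2]."""
    if x < -1:
        return 0.0
    if x >= 2:
        return 1.0
    return x**3 / 9 + 1 / 9

def f(x):
    """Density x^2/3 on R = [-1, 2]; zero elsewhere."""
    return x**2 / 3 if -1 <= x <= 2 else 0.0

def derivative(G, x, h=1e-5):
    """Central-difference approximation to G'(x)."""
    return (G(x + h) - G(x - h)) / (2 * h)

# The two right-hand columns should agree up to small numerical error.
for x in [-0.5, 0.0, 1.0, 1.5]:
    print(x, derivative(F, x), f(x))
```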

Definition 5.3.17. Percentiles for Random Variables.

For $0 \lt p \lt 1\text{,}$ the $100p^{th}$ percentile is the largest random variable value c that satisfies

$\begin{equation*} F(c) = p. \end{equation*}$

For continuous random variables over an interval $R = [a,b]\text{,}$ you will solve for c in the equation

$\begin{equation*} \int_a^c f(x) dx = p. \end{equation*}$

For discrete random variables, it is unlikely that a particular percentile will land exactly on one of the elements of $R$ but you will want to take the smallest value in $R$ so that $F(c) \ge p\text{.}$

The 50th percentile (as before) is also known as the median.
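
For a discrete random variable, the rule "smallest value in $R$ with $F(c) \ge p$" translates directly into a loop over the sorted values. The sketch below is one possible reading of that rule, applied to $f(x) = x/10$ over $R = \{1,2,3,4\}\text{;}$ the function name `percentile` is hypothetical.

```python
from fractions import Fraction

def percentile(R, f, p):
    """Smallest c in R with F(c) >= p, for 0 < p < 1."""
    total = Fraction(0)
    for c in sorted(R):
        total += f(c)          # total is now F(c)
        if total >= p:
            return c
    raise ValueError("f does not accumulate to p over R")

R = [1, 2, 3, 4]
f = lambda x: Fraction(x, 10)

print(percentile(R, f, Fraction(1, 2)))    # 3
print(percentile(R, f, Fraction(1, 10)))   # 1
print(percentile(R, f, Fraction(9, 10)))   # 4
```

Here the median is 3 because $F(2) = 3/10 \lt 1/2$ while $F(3) = 6/10 \ge 1/2\text{.}$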

Example 5.3.18. Continuous Percentile.

For our earlier example with $f(x) = x^2/3$ on $R = [-1,2]\text{,}$ the 50th percentile (i.e. the median) is found by starting with $p = 0.5$ and then solving

$\begin{equation*} F(c) = 0.5 \end{equation*}$

$\begin{equation*} c^3/9 + 1/9 = 1/2 \end{equation*}$

$\begin{equation*} c^3 + 1 = 9/2. \end{equation*}$

After solving for c, you find

$\begin{equation*} \text{median} = \sqrt[3]{7/2} \approx 1.518. \end{equation*}$
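
When $F(c) = p$ cannot be solved by hand, a numerical root finder gives the same answer. The sketch below uses bisection, which works here because $F$ is non-decreasing on $R\text{;}$ the function `solve_percentile` and its tolerance are illustrative assumptions, not part of the original example.

```python
def F(x):
    """Distribution function x^3/9 + 1/9, valid for -1 <= x <= 2."""
    return x**3 / 9 + 1 / 9

def solve_percentile(p, lo=-1.0, hi=2.0, tol=1e-10):
    """Find c in [lo, hi] with F(c) = p by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid       # the percentile lies to the right of mid
        else:
            hi = mid       # the percentile lies at or to the left of mid
    return (lo + hi) / 2

median = solve_percentile(0.5)
print(round(median, 3))  # 1.518
```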

Example 5.3.19. Discrete Percentile.

TBA, using one of the table examples from above.
