Monday, July 7, 2014

Law of Iterated Expectation

Let $X$ and $Y$ denote continuous, real-valued random variables with joint probability density function $f(x, y)$. The marginal density functions of $X$ and $Y$ are $f_X(x) := \int_{y \in \mathbb R} f(x, y) \ dy$ and $f_Y(y) := \int_{x \in \mathbb R} f(x, y) \ dx$. The expectation $E(Y)$ of $Y$ can be computed by integrating against the marginal density function of $Y$. In particular,
\begin{equation}\label{eq:E(Y)} E(Y)=\int_{y \in \mathbb R} y f_Y(y) \ dy = \int_{y \in \mathbb R} \int_{x \in \mathbb R} y f(x, y) \ dx \ dy. \end{equation}
The conditional probability density function of $Y$ given that $X$ is equal to some value $x$ with $f_X(x) > 0$ is defined by
\begin{equation}\label{eq:cdf} f_{Y \mid X} (y \mid X = x):= f_{Y \mid X} (y \mid x)= \frac{f(x, y)}{f_X(x)} = \frac{f(x,y)}{\int_{y \in \mathbb R} f(x, y) \ dy}. \end{equation}
The conditional expectation $E(Y \mid X = x)$ of $Y$ given that $X$ has value $x$ is given by
\begin{equation}\label{eq:cond exp} E(Y \mid X = x) = \int_{y \in \mathbb R} y f_{Y \mid X} (y \mid x) \ dy. \end{equation}
But $E(Y \mid X = x)$ depends on $x$, so evaluating it at $X$ yields a random variable, denoted $E(Y \mid X)$, whose expectation we can compute in turn. Assuming $E(|Y|) < \infty$, so that Fubini's theorem lets us exchange the order of integration, we have
\begin{align*} E (E (Y \mid X)) &= \int_{x \in \mathbb R} E(Y \mid X = x) f_X(x) \ dx \\ & = \int_{x \in \mathbb R} f_X(x) \left( \int_{y \in \mathbb R} y f_{Y \mid X}(y \mid x) \ dy \right) \ dx \quad \text{(by \ref{eq:cond exp})} \\ & = \int_{x \in \mathbb R} \int_{y \in \mathbb R} f_X(x) \cdot y \frac{f(x,y)}{f_X(x)} \ dy \ dx \quad \text{(by \ref{eq:cdf})}\\ & = \int_{y \in \mathbb R} \int_{x \in \mathbb R} y f(x, y) \ dx \ dy \quad \text{(by Fubini's theorem)} \\ & = E(Y). \quad \text{(by \ref{eq:E(Y)})} \end{align*}
This result is sometimes called the law of iterated expectation.
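
As a numerical sanity check of the identity $E(E(Y \mid X)) = E(Y)$, here is a minimal sketch in Python. It assumes a hypothetical joint density $f(x, y) = x + y$ on the unit square (zero elsewhere), which is not taken from the derivation above, and approximates each integral with a simple midpoint rule. The function names are illustrative only. For this particular density both quantities work out by hand to $7/12$, so the two computed values should agree closely with each other and with that number.

```python
def joint_density(x, y):
    """Hypothetical joint density f(x, y) = x + y on [0, 1]^2, zero elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def midpoint_integral(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def marginal_x(x):
    """f_X(x): integrate the joint density over y."""
    return midpoint_integral(lambda y: joint_density(x, y), 0.0, 1.0)


def conditional_expectation(x):
    """E(Y | X = x): integrate y * f(x, y) / f_X(x) over y."""
    numerator = midpoint_integral(lambda y: y * joint_density(x, y), 0.0, 1.0)
    return numerator / marginal_x(x)


def expectation_y():
    """E(Y): integrate y * f(x, y) over x, then over y."""
    inner = lambda y: midpoint_integral(lambda x: joint_density(x, y), 0.0, 1.0)
    return midpoint_integral(lambda y: y * inner(y), 0.0, 1.0)


def iterated_expectation():
    """E(E(Y | X)): integrate E(Y | X = x) * f_X(x) over x."""
    return midpoint_integral(
        lambda x: conditional_expectation(x) * marginal_x(x), 0.0, 1.0
    )


if __name__ == "__main__":
    print("E(Y)        =", expectation_y())
    print("E(E(Y | X)) =", iterated_expectation())
```

The support is restricted to the unit square so that $f_X(x) = x + 1/2 > 0$ wherever the conditional density is evaluated. A density with unbounded support would need a different quadrature scheme.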

Tuesday, July 1, 2014

Distributions, Densities, and Mass Functions

One unfortunate aspect of probability theory is that common measure-theoretic constructions are given different names that are often conflated. The technical definitions of the probability distribution, probability density function, and probability mass function of a random variable are all closely related, as this post hopes to make clear.

We begin with some general measure-theoretic constructions. Let $(A, \mathcal A)$ and $(B, \mathcal B)$ be measurable spaces and suppose that $\mu$ is a measure on $(A, \mathcal A)$. Any $(\mathcal A, \mathcal B)$-measurable function $X \colon A \to B$ induces a push-forward measure $X_*(\mu)$ on $(B, \mathcal B)$, defined for each $S \in \mathcal B$ by \[ [X_*(\mu)](S):= \mu(X^{-1} (S)). \]
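
To make the push-forward concrete, here is a minimal sketch in Python for the simplest possible setting: a finite set $A$ whose $\sigma$-algebra is assumed to be the full power set, so that a measure is determined by its point masses. The names `pushforward` and `measure_of`, and the die-and-parity example, are illustrative assumptions rather than part of the construction above.

```python
from fractions import Fraction


def pushforward(mu, X):
    """Point masses of X_*(mu), given point masses `mu` on a finite set A.

    `mu` maps each point a in A to its mass; `X` is a function from A to B.
    The mass of b in B is the total mass of the preimage X^{-1}({b}).
    """
    result = {}
    for a, mass in mu.items():
        b = X(a)
        result[b] = result.get(b, Fraction(0)) + mass
    return result


def measure_of(point_masses, S):
    """Measure of a set S: the sum of the point masses of its elements."""
    return sum((m for p, m in point_masses.items() if p in S), Fraction(0))


# Hypothetical example: a fair six-sided die pushed forward along parity.
die = {k: Fraction(1, 6) for k in range(1, 7)}
parity = lambda a: a % 2  # 0 for even faces, 1 for odd faces

pushed = pushforward(die, parity)

# [X_*(mu)](S) agrees with mu(X^{-1}(S)) for S = {1}, the odd outcomes.
preimage_of_odd = {a for a in die if parity(a) in {1}}
print(measure_of(pushed, {1}), measure_of(die, preimage_of_odd))
```

Exact `Fraction` arithmetic is used so that the two sides of the defining identity can be compared for equality, without floating-point tolerance.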