Distributions related to the normal
The normal distribution is used to define other distributions by taking ‘squares’.
In this post I will try to:
- make clear how the normal distribution can be used to define the chi-square and the Wishart distributions;
- explain how the chi-square and the Wishart distributions can be used to ‘solve’ some common integrals.
The chi-square distribution
Definition 1. Let \(x\) be a \(k\)-dimensional vector of independent standard normal random variables and let \[q = x'x = \sum\limits_{i = 1}^k {x_i^2} ;\] then I say that \(q\) has a chi-square distribution with \(k\) degrees of freedom and I write \[q \sim \chi^2(k).\]
An alternative definition is the following.
Definition 2. The random variable \(q\) has a chi-square distribution with \(k\) degrees of freedom (written \(q\sim {\chi ^2}\left( k \right)\)) when its density is \[pdf\left( q \right) = \frac{{\exp \left\{ { - q/2} \right\}{q^{\left( {k/2} \right) - 1}}}}{{{2^{k/2}}\Gamma \left( {k/2} \right)}},\] for \(q>0\).
The quantity \(\Gamma \left( \alpha \right)\) is called the Gamma function and it is defined by the integral \[\Gamma \left( \alpha \right) = \int\limits_{t > 0} {\exp \left\{ { - t} \right\}{t^{\alpha - 1}}dt} .\] Suppose I change the variable of integration to \(t = aw\), with \(a > 0\); then
\[\begin{array}{c} \Gamma \left( \alpha \right) = \int\limits_{w > 0} {\exp \left\{ { - aw} \right\}} {\left( {aw} \right)^{\alpha - 1}}adw\\ = {a^\alpha }\int\limits_{w > 0} {\exp \left\{ { - aw} \right\}{w^{\alpha - 1}}dw} . \end{array}\]
By rearranging the last display I have
Theorem 3. Let \(a>0\) and \(\alpha > 0\), then
\[\int\limits_{w > 0} {\exp \left\{ { - aw} \right\}{w^{\alpha - 1}}dw} = \frac{{\Gamma \left( \alpha \right)}}{{{a^\alpha }}}. \tag{1}\]
This integral is very important and often comes up in applications. Notice that it can be written as
\[\int\limits_{w > 0} {\frac{{{a^\alpha }\exp \left\{ { - aw} \right\}{w^{\alpha - 1}}}}{{\Gamma \left( \alpha \right)}}dw} = 1,\]
so that by choosing \(a = 1/2\) and \(\alpha = k/2\) I can see that the density of a chi-square distribution does indeed integrate to 1.
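As a quick numerical sanity check, here is a short Python sketch (using NumPy and SciPy, which the post does not otherwise assume) that evaluates the integral in (1) by quadrature and confirms that the chi-square density integrates to 1; all parameter values below are arbitrary illustrative choices.

```python
# Numerical check of the Gamma integral (1) with SciPy quadrature.
# The values of a, alpha and k are arbitrary choices, not from the text.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

a, alpha = 0.7, 2.3  # any a > 0, alpha > 0 should work
lhs, _ = quad(lambda w: np.exp(-a * w) * w ** (alpha - 1), 0, np.inf)
rhs = gamma(alpha) / a ** alpha
print(lhs, rhs)  # the two numbers agree to quadrature precision

# With a = 1/2 and alpha = k/2 the integrand is the chi-square density,
# which therefore integrates to 1.
k = 5
total, _ = quad(lambda q: np.exp(-q / 2) * q ** (k / 2 - 1)
                / (2 ** (k / 2) * gamma(k / 2)), 0, np.inf)
print(total)  # ~ 1.0
```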
Here is a result that can easily be established using the integral above.
Exercise 4. If \(q\sim {\chi ^2}\left( k \right)\) then \(E\left( q \right) = k\) and \({\mathop{\rm var}} \left( q \right) = 2k\).
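A minimal Monte Carlo sketch of Exercise 4, assuming NumPy; the value of \(k\) and the number of simulations are arbitrary.

```python
# Monte Carlo sanity check of Exercise 4: E(q) = k and var(q) = 2k.
import numpy as np

rng = np.random.default_rng(0)
k, n_sims = 7, 200_000
x = rng.standard_normal((n_sims, k))   # rows are independent N(0, I_k) draws
q = (x ** 2).sum(axis=1)               # q = x'x, one chi-square(k) draw per row
print(q.mean(), k)       # sample mean close to k
print(q.var(), 2 * k)    # sample variance close to 2k
```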
The link between Definitions 1 and 2 is:
Theorem 5. If \(x\sim N\left( {0,{\sigma ^2}{I_k}} \right)\) then \(q = x'x/{\sigma ^2}\sim {\chi ^2}\left( k \right)\).
I will return to this point later on, after a review of techniques for calculating exact distributions.
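Before moving on, here is a small simulation of Theorem 5, assuming NumPy and SciPy: a sample of \(q = x'x/\sigma^2\) is compared to the \({\chi ^2}\left( k \right)\) cdf with a Kolmogorov–Smirnov test. The values of \(k\) and \(\sigma\) are arbitrary.

```python
# Numerical illustration of Theorem 5: if x ~ N(0, sigma^2 I_k),
# then q = x'x / sigma^2 ~ chi^2(k).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k, sigma, n_sims = 4, 2.5, 100_000
x = sigma * rng.standard_normal((n_sims, k))
q = (x ** 2).sum(axis=1) / sigma ** 2
# A large KS p-value is consistent with q having the claimed distribution.
print(stats.kstest(q, stats.chi2(k).cdf))
```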
Quadratic forms and the chi-square distribution
It is important to know when a quadratic form in normal random variables has a chi-square distribution. The following result is useful in many situations.
Theorem 6. If \(q = x'Ax\) and \(x\sim N\left( {0,\Sigma } \right)\), then \(q\sim {\chi ^2}\left( k \right)\) if and only if \(A\Sigma\) is idempotent (i.e. \(\left( {A\Sigma } \right)\left( {A\Sigma } \right) = A\Sigma\)), in this case \(k = tr\left\{ {A\Sigma } \right\} = rank\left\{ {A\Sigma } \right\}\).
Now consider the case where \(A\) is a random matrix independent of \(x\) and \(x\sim N\left( {0,\Sigma } \right)\). If \(A\Sigma\) is idempotent with \(tr\left\{ {A\Sigma } \right\} = k\), and I let \[q = x'Ax,\] then \[q|A\sim {\chi ^2}\left( k \right).\]
Since the density of \(q|A\) does not depend on A, it follows that \(q\sim {\chi ^2}\left( k \right)\) unconditionally.
Example 7. Consider a linear regression model \(y = X\beta + u\), where \(u|X\sim N\left( {0,{\sigma ^2}{I_n}} \right)\) and \(X\) is an (\(n \times k\)) random matrix of rank \(k\) with probability one. Let \(q = \frac{{y'\left[ {{I_n} - X{{\left( {X'X} \right)}^{ - 1}}X'} \right]y}}{{{\sigma ^2}}}\) be the sum of squared residuals divided by the error variance. It is easily shown that \(q = \frac{{u'\left[ {{I_n} - X{{\left( {X'X} \right)}^{ - 1}}X'} \right]u}}{{{\sigma ^2}}}.\)
Now define \(A = {I_n} - X{\left( {X'X} \right)^{ - 1}}X'\) and \(x = u/\sigma \sim N\left( {0,{I_n}} \right)\), so that \(q = x'Ax\) and \(x|A\sim N\left( {0,{I_n}} \right).\) Note that \(A{I_n}\) is idempotent and that \(tr\left\{ {A{I_n}} \right\} = tr\left\{ A \right\} = n - k\) so I can conclude that \[q|A\sim {\chi ^2}\left( {n - k} \right).\] Since the density of \(q|A\) does not depend on \(A\) I can say that \(q\sim {\chi ^2}\left( {n - k} \right)\) unconditionally.
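The following sketch simulates Example 7, assuming NumPy and SciPy; the dimensions, \(\beta\), and \(\sigma\) are arbitrary choices.

```python
# Simulation of Example 7: in the regression y = X beta + u with normal
# errors, the scaled residual sum of squares is chi^2(n - k) even when X
# is random.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k, sigma, n_sims = 30, 3, 1.7, 20_000
beta = np.ones(k)
q = np.empty(n_sims)
for s in range(n_sims):
    X = rng.standard_normal((n, k))        # random design, full rank a.s.
    u = sigma * rng.standard_normal(n)
    y = X @ beta + u
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    q[s] = resid @ resid / sigma ** 2      # q = u'[I - X(X'X)^{-1}X']u / sigma^2
print(q.mean(), n - k)                      # E[chi^2(n - k)] = n - k
print(stats.kstest(q, stats.chi2(n - k).cdf))
```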
Definition 8. I say that \(q>0\) has a non-central chi-square distribution with \(k\) degrees of freedom and non-centrality parameter \(\lambda\) (and write \(q\sim {\chi ^2}\left( {k,\lambda } \right)\)) if its density is
\[pdf\left( q \right) = \frac{{\exp \left\{ { - q/2} \right\}{q^{\left( {k/2} \right) - 1}}}}{{{2^{k/2}}\Gamma \left( {k/2} \right)}}\left\{ {\exp \left\{ { - \lambda /2} \right\}\sum\limits_{j = 0}^\infty {\frac{{{{\left( {\lambda q/4} \right)}^j}}}{{j!{{\left( {k/2} \right)}_j}}}} } \right\} \tag{2}\]
where \[\begin{array}{l} {\left( a \right)_0} = 1\\ {\left( a \right)_j} = a\left( {a + 1} \right)\left( {a + 2} \right) \cdots \left( {a + j - 1} \right) = \frac{{\Gamma \left( {a + j} \right)}}{{\Gamma \left( a \right)}}, \end{array}\] \(j = 1,2,3,...\), is the rising factorial (or Pochhammer symbol).
Theorem 9. If \(x\sim N\left( {\mu ,\Sigma } \right)\) and \(q = x'Ax\), then \(q\sim {\chi ^2}\left( {k,\lambda } \right)\) if and only if \(A\Sigma\) is idempotent, in which case \(k = tr\left\{ {A\Sigma } \right\}\) and \(\lambda = \mu 'A\mu\).
Corollary 10. If \(x\sim N\left( {\mu ,{\sigma ^2}{I_k}} \right)\) then \(q = x'x/{\sigma ^2}\sim {\chi ^2}\left( {k,\mu '\mu /{\sigma ^2}} \right)\).
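Corollary 10 can be checked against SciPy's non-central chi-square (scipy.stats.ncx2, which implements the density (2) with parameters \(k\) and \(\lambda\)); the values of \(\mu\), \(\sigma\), and \(k\) below are arbitrary.

```python
# Monte Carlo check of Corollary 10 against scipy.stats.ncx2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
k, sigma = 3, 1.4
mu = np.array([1.0, -0.5, 2.0])
lam = mu @ mu / sigma ** 2                 # non-centrality parameter
x = mu + sigma * rng.standard_normal((100_000, k))
q = (x ** 2).sum(axis=1) / sigma ** 2
print(q.mean(), k + lam)                   # E[chi^2(k, lambda)] = k + lambda
print(stats.kstest(q, stats.ncx2(k, lam).cdf))
```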
The Wishart distribution
The Wishart distribution is a multivariate extension of the chi-square distribution. To define it, I first need to extend the multivariate normal distribution to a matrix-variate normal distribution.
Definition 11. Let \({x_i}\sim N\left( {{\mu _i},\Sigma } \right)\), where \(\Sigma\) is an (\(m \times m\)) positive definite matrix, for \(i = 1,2,...,n\), and suppose they are independent. Then the (\(n \times m\)) random matrix
\[X = \left( \begin{array}{l} {x_1}'\\ {x_2}'\\ \vdots \\ {x_n}' \end{array} \right) \tag{3}\]
has a matrix-variate normal distribution with density
\[pdf\left( X \right) = {\left( {2\pi } \right)^{ - nm/2}}{\left| \Sigma \right|^{ - n/2}}etr\left\{ { - \frac{1}{2}{\Sigma ^{ - 1}}\left( {X - M} \right)'\left( {X - M} \right)} \right\} \tag{4}\]
where
\[M = \left( \begin{array}{l} {\mu _1}'\\ {\mu _2}'\\ \vdots \\ {\mu _n}' \end{array} \right),\]
and \(etr\left\{ A \right\}\) means \(\exp \left\{ {tr\left[ A \right]} \right\}\) for any square matrix A. I write \(X\sim N\left( {M,{I_n} \otimes \Sigma } \right)\).
Definition 12. If the (\(n \times m\)) random matrix \(X\sim N\left( {0,{I_n} \otimes \Sigma } \right)\), then the (\(m \times m\)) random matrix \(S = X'X\) is said to have a Wishart distribution with \(n\) degrees of freedom and covariance matrix \(\Sigma\), and I write \(S\sim {W_m}\left( {n,\Sigma } \right)\).
Notice that the matrix \(S\) is symmetric and positive semi-definite (positive definite with probability one when \(n \ge m\)).
Theorem 13. If \(n > m\) the density function of \(S\sim {W_m}\left( {n,\Sigma } \right)\) is
\[pdf\left( S \right) = \frac{{{2^{ - nm/2}}{{\left| \Sigma \right|}^{ - n/2}}}}{{{\Gamma _m}\left( {n/2} \right)}}etr\left\{ { - \frac{1}{2}{\Sigma ^{ - 1}}S} \right\}{\left| S \right|^{\left( {n - m - 1} \right)/2}},\]
where \({\Gamma _m}\left( a \right)\) denotes the multivariate gamma function
\[{\Gamma _m}\left( a \right) = {\pi ^{m\left( {m - 1} \right)/4}}\prod\limits_{i = 1}^m {\Gamma \left[ {a - \frac{1}{2}\left( {i - 1} \right)} \right]} . \tag{5}\]
If \(m = 1\), \(\Sigma = {\sigma ^2}\), \(S = x'x\) and \(S\sim {W_1}\left( {n,{\sigma ^2}} \right)\) is a scalar random variable. The density of S simplifies to
\[pdf\left( S \right) = \frac{{{{\left( {2{\sigma ^2}} \right)}^{ - n/2}}}}{{\Gamma \left( {n/2} \right)}}\exp \left\{ { - \frac{1}{{2{\sigma ^2}}}S} \right\}{S^{n/2 - 1}}.\]
One can see that this is very similar to a \({\chi ^2}\left( n \right)\). Precisely, I have that
\[\frac{S}{{{\sigma ^2}}}\sim {\chi ^2}\left( n \right),\] so the Wishart can be seen as a generalization of the chi-square distribution.
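A sketch of this construction in Python, assuming SciPy's scipy.stats.wishart (which implements the density of Theorem 13); the values of \(\Sigma\), \(n\), and \(m\) are arbitrary.

```python
# Sampling the Wishart of Definition 12 as S = X'X, with the rows of X
# iid N(0, Sigma), and comparing against scipy.stats.wishart.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, m = 10, 3
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, m)) @ L.T      # rows ~ N(0, Sigma)
S = X.T @ X                                # S ~ W_m(n, Sigma)

# scipy.stats.wishart implements the density of Theorem 13
print(stats.wishart(df=n, scale=Sigma).logpdf(S))

# E[S] = n * Sigma, checked by averaging many draws
draws = [rng.standard_normal((n, m)) @ L.T for _ in range(20_000)]
S_bar = sum(x.T @ x for x in draws) / len(draws)
print(np.round(S_bar / n, 2))              # close to Sigma
```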
The density of \(S\) must integrate to \(1\), that is
\[\int\limits_{S > 0} {\frac{{{2^{ - nm/2}}{{\left| \Sigma \right|}^{ - n/2}}}}{{{\Gamma _m}\left( {n/2} \right)}}etr\left\{ { - \frac{1}{2}{\Sigma ^{ - 1}}S} \right\}{{\left| S \right|}^{\left( {n - m - 1} \right)/2}}dS} = 1.\]
Rearranging the expression in the last display, I obtain
\[\int\limits_{S > 0} {etr\left\{ { - \frac{1}{2}{\Sigma ^{ - 1}}S} \right\}{{\left| S \right|}^{\left( {n - m - 1} \right)/2}}dS} = {2^{nm/2}}{\left| \Sigma \right|^{n/2}}{\Gamma _m}\left( {n/2} \right).\] If I set \(\left( {1/2} \right){\Sigma ^{ - 1}} = W\), the above integral becomes
Theorem 14. If \(n > m\), then for any positive definite matrix \(W\)
\[\int\limits_{S > 0} {etr\left\{ { - WS} \right\}{{\left| S \right|}^{\left( {n - m - 1} \right)/2}}dS} = {\left| W \right|^{ - n/2}}{\Gamma _m}\left( {n/2} \right). \tag{6}\]
This very useful integral is called the multivariate Gamma integral.
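For a numerical check of (6), the scalar case \(m = 1\) reduces to the Gamma integral (1) and can be verified by quadrature; this sketch assumes SciPy, and the values of \(n\) and \(W\) are arbitrary.

```python
# Check of the multivariate Gamma integral (6) in the scalar case m = 1,
# where |S| = S, etr{-WS} = exp(-W*S) and Gamma_1 is the ordinary Gamma.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

n, W = 6, 0.8
lhs, _ = quad(lambda s: np.exp(-W * s) * s ** ((n - 1 - 1) / 2), 0, np.inf)
rhs = W ** (-n / 2) * gamma(n / 2)
print(lhs, rhs)  # agree to quadrature precision
```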
Example 15. In a linear regression model I have that
\[\hat \beta |X\sim N\left( {\beta ,{\sigma ^2}{{\left( {X'X} \right)}^{ - 1}}} \right).\] Set \(S = X'X\) so that
\[\hat \beta |S\sim N\left( {\beta ,{\sigma ^2}{S^{ - 1}}} \right),\]
and suppose that \(S\sim {W_k}\left( {v,{I_k}} \right)\). An assumption of this kind is sometimes made in analyses of the linear regression model. Then I know that \[pdf\left( {\hat \beta |S} \right) = {\left( {2\pi {\sigma ^2}} \right)^{ - k/2}}{\left| S \right|^{1/2}}\exp \left\{ { - \frac{1}{{2{\sigma ^2}}}\left( {\hat \beta - \beta } \right)'S\left( {\hat \beta - \beta } \right)} \right\},\] and \[pdf\left( S \right) = \frac{{{2^{ - vk/2}}}}{{{\Gamma _k}\left( {v/2} \right)}}etr\left\{ { - \frac{1}{2}S} \right\}{\left| S \right|^{\left( {v - k - 1} \right)/2}}.\]
The joint density of \(\hat \beta\) and \(S\) is \[pdf\left( {\hat \beta ,S} \right) = \frac{{{2^{ - vk/2}}{{\left( {2\pi {\sigma ^2}} \right)}^{ - k/2}}}}{{{\Gamma _k}\left( {v/2} \right)}}etr\left\{ { - \frac{1}{2}\left[ {{I_k} + \frac{1}{{{\sigma ^2}}}\left( {\hat \beta - \beta } \right)\left( {\hat \beta - \beta } \right)'} \right]S} \right\}{\left| S \right|^{\left( {v + 1 - k - 1} \right)/2}}.\]
So the marginal density of \(\hat \beta\) can be obtained by integrating out \(S>0\), \[pdf\left( {\hat \beta } \right) = \frac{{{2^{ - vk/2}}{{\left( {2\pi {\sigma ^2}} \right)}^{ - k/2}}}}{{{\Gamma _k}\left( {v/2} \right)}}\int\limits_{S > 0} {etr\left\{ { - \frac{1}{2}\left[ {{I_k} + \frac{1}{{{\sigma ^2}}}\left( {\hat \beta - \beta } \right)\left( {\hat \beta - \beta } \right)'} \right]S} \right\}{{\left| S \right|}^{\left( {v + 1 - k - 1} \right)/2}}dS} .\] This integral is of the form (6) with \(n = v + 1\) and \(W = \frac{1}{2}\left[ {{I_k} + \frac{1}{{{\sigma ^2}}}\left( {\hat \beta - \beta } \right)\left( {\hat \beta - \beta } \right)'} \right]\), so it can be evaluated and \[\begin{array}{c} pdf\left( {\hat \beta } \right) = \frac{{{2^{k/2}}{{\left( {2\pi {\sigma ^2}} \right)}^{ - k/2}}{\Gamma _k}\left( {\left[ {v + 1} \right]/2} \right)}}{{{\Gamma _k}\left( {v/2} \right)}}{\left| {{I_k} + \frac{1}{{{\sigma ^2}}}\left( {\hat \beta - \beta } \right)\left( {\hat \beta - \beta } \right)'} \right|^{ - \left( {v + 1} \right)/2}}\\ = \frac{{{2^{k/2}}{{\left( {2\pi {\sigma ^2}} \right)}^{ - k/2}}{\Gamma _k}\left( {\left[ {v + 1} \right]/2} \right)}}{{{\Gamma _k}\left( {v/2} \right)}}{\left( {1 + \frac{1}{{{\sigma ^2}}}\left( {\hat \beta - \beta } \right)'\left( {\hat \beta - \beta } \right)} \right)^{ - \left( {v + 1} \right)/2}}, \end{array}\] where the second line uses \(\left| {{I_k} + uu'} \right| = 1 + u'u\) (the factor \({2^{k/2}}\) collects \({2^{ - vk/2}}\) and the \({2^{k\left( {v + 1} \right)/2}}\) coming from \({\left| W \right|^{ - \left( {v + 1} \right)/2}}\)). This is a multivariate t distribution with \(v + 1 - k\) degrees of freedom, location parameter \(\beta\) and scale parameter \({\sigma ^2}\).
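A Monte Carlo check of this marginal in the scalar case \(k = 1\), where \(S\sim {W_1}\left( {v,1} \right) = {\chi ^2}\left( v \right)\) and the marginal of \(\hat \beta\) should be a Student t with \(v + 1 - k = v\) degrees of freedom; NumPy and SciPy assumed, parameter values arbitrary.

```python
# With S ~ chi^2(v) and betahat | S ~ N(beta, sigma^2 / S), the marginal
# of betahat is t with v degrees of freedom, location beta and scale
# sigma / sqrt(v), since (betahat - beta) = sigma * z / sqrt(S).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
v, beta, sigma, n_sims = 8, 1.5, 0.9, 200_000
S = rng.chisquare(v, size=n_sims)
betahat = beta + (sigma / np.sqrt(S)) * rng.standard_normal(n_sims)
print(stats.kstest(betahat,
                   stats.t(df=v, loc=beta, scale=sigma / np.sqrt(v)).cdf))
```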
Theorem 16. Assume that \(S\sim {W_m}\left( {n,\Sigma } \right)\) and \(A\) is an \(m \times k\) matrix of rank \(k \le m\). Then:
1. \(A'SA\sim {W_k}\left( {n,A'\Sigma A} \right)\);
2. \({\left( {A'{S^{ - 1}}A} \right)^{ - 1}}\sim {W_k}\left( {n - m + k,{{\left( {A'{\Sigma ^{ - 1}}A} \right)}^{ - 1}}} \right)\).
If \(A\) is a random matrix independent of \(S\), I can interpret the result above as conditional on \(A\).
Example 17. Suppose that \(x\) has any distribution independent of \(S\sim {W_m}\left( {n,\Sigma } \right)\). I want to find the distribution of two statistics: \[{t_1} = \frac{{x'Sx}}{{x'\Sigma x}}\] and \[{t_2} = \frac{{x'{\Sigma ^{ - 1}}x}}{{x'{S^{ - 1}}x}}.\] Using the first part of Theorem 16 with \(A = x\), I know that \(x'Sx|x\sim {W_1}\left( {n,x'\Sigma x} \right)\), so that \[{t_1} = \frac{{x'Sx}}{{x'\Sigma x}}\Big|x\sim {\chi ^2}\left( n \right).\] Since this conditional distribution does not depend on \(x\), it must be true that \({t_1}\sim {\chi ^2}\left( n \right)\) unconditionally. Likewise, by the second part of Theorem 16 with \(A = x\) and \(k = 1\), \[\frac{1}{{x'{S^{ - 1}}x}}\Big|x\sim {W_1}\left( {n - m + 1,\frac{1}{{x'{\Sigma ^{ - 1}}x}}} \right),\] so that \({t_2} = \frac{{x'{\Sigma ^{ - 1}}x}}{{x'{S^{ - 1}}x}}\big|x\sim {\chi ^2}\left( {n - m + 1} \right)\). Once again the conditional distribution does not depend on \(x\), so the result must be true unconditionally, i.e. \({t_2}\sim {\chi ^2}\left( {n - m + 1} \right)\).
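A simulation of Example 17, assuming NumPy and SciPy. The dimensions, \(\Sigma\) and the fixed vector x are arbitrary choices; fixing x is enough, because the conditional argument above shows that the distributions of \(t_1\) and \(t_2\) do not depend on it.

```python
# Monte Carlo illustration of Example 17.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, m, n_sims = 12, 3, 20_000
Sigma = np.array([[1.5, 0.4, 0.2],
                  [0.4, 1.0, 0.1],
                  [0.2, 0.1, 2.0]])
L = np.linalg.cholesky(Sigma)
Sig_inv = np.linalg.inv(Sigma)
x = np.array([0.3, -1.2, 0.8])          # arbitrary fixed vector

t1, t2 = np.empty(n_sims), np.empty(n_sims)
for s in range(n_sims):
    X = rng.standard_normal((n, m)) @ L.T   # rows ~ N(0, Sigma)
    S = X.T @ X                             # S ~ W_m(n, Sigma)
    t1[s] = (x @ S @ x) / (x @ Sigma @ x)
    t2[s] = (x @ Sig_inv @ x) / (x @ np.linalg.solve(S, x))

print(stats.kstest(t1, stats.chi2(n).cdf))          # t1 ~ chi^2(n)
print(stats.kstest(t2, stats.chi2(n - m + 1).cdf))  # t2 ~ chi^2(n - m + 1)
```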