Wednesday, April 29, 2009

Sampling Dirichlet Distributions

[I just realized that this post from last year was only half the story. See this post about using the gamma distribution directly to sample Dirichlet distributions.]

I just commented on a post by Andrew Gelman about methods for sampling Dirichlet distributions. Those comments were pretty nonspecific and deserve a bit of amplification.

First off, a Dirichlet distribution is a distribution of real-valued tuples,
\[(x_1 \ldots x_n) \sim \mathrm {Dir}(\pi_1 \ldots \pi_n) \]
such that \(x_i \ge 0\) and \(\sum_i x_i = 1\).

The parameters \(\pi_i\) are all positive.

The original question had to do with sampling the Dirichlet parameters, especially from a conjugate distribution. The one true answer in mathematical terms is that there is, indeed, a continuous distribution that is conjugate to the Dirichlet. In practical terms, however, that isn't the answer that you really want.

A much more practical answer is that the Dirichlet parameters can be sampled from a prior that is characterized by \(n+1\) positive real parameters \(\beta_0 \ldots \beta_n\) using the following procedure:
\[\begin{aligned}
\left(m_1 \ldots m_n \right) & \sim \mathrm {Dir} (\beta_1 \ldots \beta_n ) \\
\alpha & \sim \mathrm {Exp} (\beta_0) \\
\pi_i & = \alpha m_i
\end{aligned}
\]
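To make the procedure concrete, here is a minimal sketch in Python using only the standard library. The function names are mine, chosen for illustration. It assumes that \(\beta_0\) is the rate of the exponential distribution (the notation leaves open whether it is a rate or a mean), and it draws the Dirichlet sample by normalizing independent gamma variates, which is one standard way to do it rather than the only one.

```python
import random


def sample_dirichlet(params, rng=random):
    """Draw one Dirichlet sample by normalizing independent gamma draws.

    Each params[i] must be strictly positive.
    """
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]


def sample_dirichlet_parameters(beta0, betas, rng=random):
    """Sample (pi_1 ... pi_n) from the (n+1)-parameter prior.

    m     ~ Dir(beta_1 ... beta_n)   direction of the parameter vector
    alpha ~ Exp(beta0)               overall scale (ASSUMPTION: beta0 is a rate)
    pi_i  = alpha * m_i
    """
    m = sample_dirichlet(betas, rng)
    alpha = rng.expovariate(beta0)
    return [alpha * m_i for m_i in m]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    pi = sample_dirichlet_parameters(1.0, [1.0, 1.0, 1.0], rng)
    print("sampled Dirichlet parameters:", pi)
    print("one draw from Dir(pi):", sample_dirichlet(pi, rng))
```

The split into \(m\) and \(\alpha\) is what makes this prior convenient: \(m\) says where the mass of the Dirichlet sits on average, and \(\alpha\) says how tightly samples concentrate around that point.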
Alternative distributions for \(\alpha\) include the gamma distribution and the log-normal distribution, for example
\[ \log \alpha \sim \mathcal N (0, 2)\]
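As a rough sketch of these alternatives, the standard library covers both. The helper names and the gamma shape and scale defaults below are hypothetical, picked only for illustration. I am also assuming that the 2 in \(\mathcal N (0, 2)\) is a variance, so the standard deviation passed to the sampler is \(\sqrt 2\); if it was meant as a standard deviation, pass 2 directly.

```python
import math
import random


def sample_alpha_lognormal(rng, mean=0.0, variance=2.0):
    """Draw alpha such that log(alpha) ~ N(mean, variance).

    ASSUMPTION: the second parameter is a variance, so we take its
    square root to get the standard deviation lognormvariate expects.
    """
    return rng.lognormvariate(mean, math.sqrt(variance))


def sample_alpha_gamma(rng, shape=2.0, scale=1.0):
    """Draw alpha from a gamma distribution (illustrative shape and scale)."""
    return rng.gammavariate(shape, scale)


if __name__ == "__main__":
    rng = random.Random(7)
    print("log-normal alpha:", sample_alpha_lognormal(rng))
    print("gamma alpha:", sample_alpha_gamma(rng))
```

Either draw can replace the exponential draw for \(\alpha\) without changing the rest of the procedure, since all that is required is a positive scale.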