Wednesday, April 29, 2009

Sampling Dirichlet Distributions

[I just realized that this post from last year was only half the story. See this post about using the gamma distribution directly to sample Dirichlet distributions]

I just commented on a post by Andrew Gelman about methods for sampling Dirichlet distributions. Those comments were pretty nonspecific and deserve a bit of amplification.

First off, a Dirichlet distribution is a distribution of real-valued tuples,
\[(x_1 \ldots x_n) \sim \mathrm {Dir}(\pi_1 \ldots \pi_n) \]
such that \(x_i \ge 0\) and \(\sum_i x_i = 1\).

The parameters \(\pi_i\) are all positive.
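As a concrete illustration of the definition, here is a minimal sketch that draws one sample from a Dirichlet by normalizing independent gamma draws, which is the gamma-based construction the note at the top of this post alludes to. The function name `sample_dirichlet` and the example parameter values are made up for illustration, and the code uses only the Python standard library.

```python
import random

def sample_dirichlet(params, rng=random):
    """Draw one sample from Dir(params) by normalizing gamma draws.

    Each component is drawn as Gamma(shape=params[i], scale=1) and the
    vector is divided by its sum, so the result lies on the simplex.
    """
    if any(p <= 0 for p in params):
        raise ValueError("all Dirichlet parameters must be positive")
    draws = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(draws)
    return [d / total for d in draws]

# Illustrative parameter values only.
x = sample_dirichlet([2.0, 3.0, 5.0], random.Random(42))
print(x, sum(x))
```

Every component of the result is non-negative and the components sum to 1 up to floating-point rounding.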

The original question was about sampling the Dirichlet parameters, particularly from a conjugate distribution. In strict mathematical terms, there is indeed a continuous distribution that is conjugate to the Dirichlet. In practical terms, however, that isn't the answer that you really want.

A much more practical answer is that the Dirichlet parameters can be sampled from a prior that is characterized by \(n+1\) positive real parameters \(\beta_0 \ldots \beta_n\) using the following procedure:
\[\begin{aligned}
\left(m_1 \ldots m_n \right) & \sim \mathrm {Dir} (\beta_1 \ldots \beta_n ) \\
\alpha & \sim \mathrm {Exp} (\beta_0) \\
\pi_i & = \alpha m_i
\end{aligned}
\]
Alternative distributions for \(\alpha\) include the gamma distribution and the log-normal distribution, such as
\[ \log \alpha \sim \mathcal N (0, 2)\]
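The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say whether \(\beta_0\) is the rate or the mean of the exponential, so the sketch treats it as the rate; the function names and the example values are invented; and the Dirichlet draw uses the standard normalized-gamma construction.

```python
import random

def sample_dirichlet(params, rng):
    """Draw one sample from Dir(params) by normalizing gamma draws."""
    draws = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(draws)
    return [d / total for d in draws]

def sample_dirichlet_params(beta, beta0, rng):
    """Sample pi from the prior described above.

    m     ~ Dir(beta_1 ... beta_n)
    alpha ~ Exp(beta0)   # ASSUMPTION: beta0 is the rate parameter
    pi_i  = alpha * m_i
    Returns (pi, alpha, m) so the pieces can be inspected.
    """
    m = sample_dirichlet(beta, rng)
    alpha = rng.expovariate(beta0)
    pi = [alpha * m_i for m_i in m]
    return pi, alpha, m

rng = random.Random(0)
# Illustrative hyperparameters only.
pi, alpha, m = sample_dirichlet_params([1.0, 1.0, 1.0], 0.5, rng)
print(pi, alpha)
```

By construction the sampled \(\pi_i\) sum to \(\alpha\), so \(m\) sets the relative sizes of the Dirichlet parameters and \(\alpha\) sets their overall magnitude. To use the log-normal alternative instead, one could replace the exponential draw with `rng.lognormvariate(mu, sigma)`, keeping in mind that `sigma` there is a standard deviation.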

6 comments:

Unknown said...

This seems like a promising approach, but how do you compute the MAP Dirichlet parameters, given observed data (probability vectors)? Even better, how do you sample from the posterior?

thanks

Andy said...

Related to the post on Andrew Gelman's blog - Is there a way to take data into account? Can one derive a posterior sampling scheme for the Dirichlet parameter based on the prior sampling scheme you propose here?
Do you know any references for this material besides 'The Bayesian Choice'?

Thank you

Ted Dunning ... apparently Bayesian said...

Andrei,

Can you be a bit more specific? It is pretty straightforward if you are sampling from a multinomial whose parameters are Dirichlet distributed, but I think you have in mind something more interesting.

Andy said...

The problem I am trying to solve is the following:

Assume x ~ Dir(\pi). From what I understand, you present a sampling scheme for a prior on \pi:
m ~ Dir(\beta)
\alpha ~ exp(\beta_0)
\pi = \alpha m

Can one derive the posterior distribution of \pi | x in terms of \beta, \beta_0 and x?

Where: x, \pi, \beta, m are vectors and \beta_0, \alpha are scalars.

I hope this makes my question clear.
Thank you

Ted Dunning ... apparently Bayesian said...

I just started a series of postings that will answer these questions.

Jeroen Janssens said...

Dear Ted,

Thank you for posting this. My question may be related to Andrei's and Scott's.

I'm implementing a Gibbs sampler for a model which has N Dirichlet distributions (\pi) of which the parameters are like you describe in your post: another Dirichlet (m) and a multiplier which is exponential or gamma (\alpha). I have noticed that the posterior distribution of this multiplier (\alpha) is gamma, and I was wondering whether both its shape and scale parameters could be derived given m and the N \pi's (and the prior of \alpha). Do you have any idea if this is possible? Thank you.