[link]
This paper compares methods to calculate the marginal likelihood, $p(D | \tau)$, when you have a tree topology $\tau$ and some data $D$ and you need to marginalise over the possible branch lengths $\mathbf{\theta}$ in the process of Bayesian inference. In other words, solving the following integral: $$ \int_{ [ 0, \infty ]^{2S - 3} } p(D | \mathbf{\theta}, \tau ) p( \mathbf{\theta} | \tau) d \mathbf{\theta} $$ There are some details about this problem that are common to phylogenetic problems, such as an exponential prior on the branch lengths, but otherwise this is the common problem of approximate Bayesian inference. This paper compares the following methods: * ELBO (appears to be [BBVI][]) * Gamma Laplus Importance Sampling * Variational Bayes Importance Sampling * Beta' Laplus * Gamma Laplus * Maximum un-normalized posterior probability * Maximum likelihood * Naive Monte Carlo * Bridge Sampling * Conditional Predictive Ordinates * Harmonic Mean * Stabilized Harmonic Mean * Nested Sampling * Pointwise Predictive Density * Path Sampling * Modified Path Sampling * Stepping Stone * Generalized Stepping Stone I leave the in depth description of each algorithm to the paper and appendices, although it's worth mentioning that Laplus is a Laplace approximation where the approximating distribution is constrained to be positive. Some takeaways from the empirical results: * If runtime is not a concern power posterior methods are preferred: > The power posterior methods remain the best general-purpose tools for phylogenetic modelcomparisons, though they are certainly too slow to explore the tree space produced by PT. * Bridge sampling is the next choice, if you need something faster. * Harmonic Mean is a bad estimator for phylogenetic tree problems. * Gamma Laplus is a good fast option. * Naive Monte Carlo is a poor estimator, which is probably to be expected. * Gamma Laplus is the best option for very fast algorithms: > Empirical posterior distributions on branch lengths are clearly not point-masses, and yet simply normalizing the unnormalized posterior at the maximum outperforms 6 of the 19 tested methods. All methods were compared on metrics important to phylogenetic inference, such as *average standard deviation of split frequencies" (ASDSF), which is typically used to confirm whether parallel MCMC chains are sampling from the same distribution over tree topologies. Methods were also compared on KL divergence to the true posterior and RMSD (appears to be the mean squared error between CDFs?). [bbvi]: https://arxiv.org/abs/1401.0118
Your comment:
|