MCMC sampling under an Epidemic Birth-Death model
| Param | True | Post. Mean |
|---|---|---|
| β | — | — |
| δ | — | — |
A phylogenetic tree is simulated under a stochastic birth-death process. At each event a lineage either speciates (rate β) or goes extinct (rate δ). The process runs until 20 extant lineages are produced. Branch lengths are in units of expected substitutions per site once scaled by the mutation rate μ.
Sequences evolve along branches under the Jukes–Cantor model: all nucleotide substitutions occur at equal rate μ, giving a closed-form transition probability used both in simulation and in the Felsenstein likelihood.
The log-likelihood of the observed alignment given tree topology, branch lengths, and μ is computed exactly via dynamic programming on the tree. For each site, partial likelihoods are swept from tips to the root under the JC transition matrix. This is the most computationally significant step per MCMC iteration.
The branching times (internal node heights) are scored against a Yule waiting-time density with net diversification rate r = β − δ. This term is the primary data constraint on β and δ, complementing the Felsenstein likelihood which constrains μ and branch lengths.
The posterior is explored via Metropolis–Hastings MCMC using a block-update operator schedule (following BEAST). Each iteration randomly selects one of three operators:
Block A — μ only (log-scale proposal, step 0.5).
Block B — β and δ only (log-scale proposal).
Block C — tree: NNI topology move (60%) or branch-length
scaling of 3 random edges (40%).
Block updates prevent the non-identifiable joint drift of β and μ through shared branch lengths, the same mechanism used in BEAST’s operator schedule.
A Gaussian root-age calibration prior (analogous to a fossil calibration in BEAST) anchors the absolute time scale — without it β and μ are non-identifiable. The posterior is summarised post burn-in (last 50%) using:
ESS — effective sample size via integrated autocorrelation
time; values <200 indicate the chain needs more iterations.
95% HPD — the shortest interval containing 95% of the
post-burnin samples.
Limitations: This demo uses a Yule approximation (ignoring full birth-death conditioning), JC69 with no among-site rate variation, and a calibration that uses the known true root age rather than a genuine fossil prior. Real analyses (e.g. in BEAST2 or MrBayes) use the full Stadler FBD likelihood, GTR+Γ substitution models, and empirical fossil priors.