Bayesian Phylogenetic Inference

MCMC sampling under an Epidemic Birth-Death model

Fixed Parameters & Calibration
Root age (calibration):
Root age σ: ±20%
ρ (extant samp.): 1.0
ψ (fossil rate): 0.0
r (removal prob): 1.0

Simulated True Tree

Current Sampled Tree RF dist: —

Tree Uncertainty (DensiTree) 0 trees

MCMC Status

Generation: 0
Acceptance rate: —
Log posterior: —
Current β: —
Current δ: —

Parameter Recovery

Param | True | Post. Mean
β
δ
T, ρ, ψ, r held fixed

Trace — β (Birth) & δ (Death)

Posterior — β (Birth)

Posterior — δ (Death)

Posterior — μ (Mutation Rate)

Alignment (first 50 bp)


                
Methods & References

1 — Birth-Death Tree Simulation

A phylogenetic tree is simulated under a stochastic birth-death process. At each event a lineage either speciates (rate β) or goes extinct (rate δ). The process runs until 20 extant lineages are produced. Branch lengths are in units of expected substitutions per site once scaled by the mutation rate μ.

dN/dt = (β − δ) N
r = β − δ (net diversification rate; not the removal probability r listed under Fixed Parameters)
Kendall (1948) Ann. Math. Stat. 19:1–15.
Nee, May & Harvey (1994) Phil. Trans. R. Soc. B 344:305–311.
Stadler (2010) J. Theor. Biol. 267:396–404.
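The simulation described above can be sketched as a Gillespie-style loop. This is a minimal illustration, not the demo's actual code: the function name `simulate_bd_tree`, the dictionary-based tree representation, and the choice to restart when the process dies out before reaching the target number of lineages are all assumptions.

```python
import random

def simulate_bd_tree(beta, delta, n_extant=20, seed=0, max_tries=10_000):
    """Simulate a birth-death tree until n_extant lineages are alive.

    Returns (parent, start, end, alive): parent[node] is the parent id
    (None for the root), start/end are the times at which each lineage
    begins and ends, and alive lists the extant tips.
    Assumption: a run that goes extinct is discarded and restarted.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        parent, start, end = {0: None}, {0: 0.0}, {}
        alive, t, next_id = [0], 0.0, 1
        while 0 < len(alive) < n_extant:
            n = len(alive)
            # Waiting time to the next event among n lineages.
            t += rng.expovariate(n * (beta + delta))
            node = alive.pop(rng.randrange(n))   # lineage hit by the event
            end[node] = t
            if rng.random() < beta / (beta + delta):
                # Speciation: the lineage is replaced by two daughters.
                for _k in range(2):
                    parent[next_id], start[next_id] = node, t
                    alive.append(next_id)
                    next_id += 1
            # Otherwise extinction: the lineage simply ends at time t.
        if len(alive) == n_extant:
            for node in alive:
                end[node] = t                    # extant tips end at "now"
            return parent, start, end, alive
    raise RuntimeError("process went extinct in every attempt")

parent, start, end, alive = simulate_bd_tree(beta=2.0, delta=1.0)
branch_length = {n: end[n] - start[n] for n in parent}
```

Branch lengths here are in time units; multiplying them by μ gives the scaling mentioned in the text.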

2 — Sequence Evolution (JC69 Model)

Sequences evolve along branches under the Jukes–Cantor model: all nucleotide substitutions occur at equal rate μ, giving a closed-form transition probability used both in simulation and in the Felsenstein likelihood.

P(same | t) = ¼ + ¾ e^(−μt)
P(diff | t) = ¼ − ¼ e^(−μt)
Jukes & Cantor (1969) in Mammalian Protein Metabolism pp. 21–132.
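As a small illustration, the two transition probabilities can be coded directly from the formulas as written in this section. The function names are hypothetical. Note that other texts write the exponent as −4μt/3 when μ denotes the total substitution rate per site; the sketch keeps the exponent −μt used here.

```python
import math

def jc69_probs(mu, t):
    """Return (P(same | t), P(diff | t)) for one specific target base."""
    e = math.exp(-mu * t)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

def jc69_matrix(mu, t):
    """4x4 transition matrix over A, C, G, T built from jc69_probs."""
    p_same, p_diff = jc69_probs(mu, t)
    return [[p_same if i == j else p_diff for j in range(4)] for i in range(4)]

# Each row is a probability distribution: one "same" entry, three "diff" entries.
P = jc69_matrix(mu=1.0, t=0.3)
row_sums = [sum(row) for row in P]
```

At t = 0 the matrix is the identity, and as t grows every entry tends to ¼, which is a quick sanity check on any implementation.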

3 — Felsenstein Pruning Algorithm

The log-likelihood of the observed alignment given tree topology, branch lengths, and μ is computed exactly via dynamic programming on the tree. For each site, partial likelihoods are swept from tips to the root under the JC transition matrix. This is the most computationally expensive step of each MCMC iteration.

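L_i(s) = ∏_c Σ_t P(t | s, bl_c) · L_c(t)
where c ranges over the children of node i, t over the four nucleotide states, and bl_c is the length of the branch leading to child c.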
Felsenstein (1981) J. Mol. Evol. 17:368–376.

4 — Yule / Birth-Death Tree Likelihood

The branching times (internal node heights) are scored against a Yule waiting-time density with net diversification rate r = β − δ. This term is the main constraint on β and δ, although only their difference r enters the density; it complements the Felsenstein likelihood, which constrains μ and branch lengths.

log P(T | β, δ) = Σ_i [ log(i·r) − i·r·Δt_i ]
Nee et al. (1994) as above.
Rannala & Yang (1996) J. Mol. Evol. 43:304–311.
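A direct transcription of this sum might look like the sketch below. It assumes that Δt_i is the waiting time during which i lineages are present and that the first interval starts with two lineages; neither convention is stated in the text, so both are exposed as explicit arguments. The function name is hypothetical.

```python
import math

def yule_log_density(intervals, beta, delta, first_count=2):
    """Sum of log(i*r) - i*r*dt_i over successive waiting times.

    intervals[0] is the waiting time with first_count lineages,
    intervals[1] the waiting time with first_count + 1, and so on.
    Returns -inf when r = beta - delta is not positive.
    """
    r = beta - delta
    if r <= 0:
        return -math.inf
    return sum(
        math.log(i * r) - i * r * dt
        for i, dt in enumerate(intervals, start=first_count)
    )

# Two intervals with r = 1: (log 2 - 2*0.5) + (log 3 - 3*0.25) = log 6 - 1.75
value = yule_log_density([0.5, 0.25], beta=2.0, delta=1.0)
```

Because the function depends on β and δ only through r, any pair with the same difference gives the same value.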

5 — MCMC with Block Updates

The posterior is explored via Metropolis–Hastings MCMC using a block-update operator schedule (following BEAST). Each iteration randomly selects one of three operators:

Block A — μ only (log-scale proposal, step 0.5).
Block B — β and δ only (log-scale proposal).
Block C — tree: NNI topology move (60%) or branch-length scaling of 3 random edges (40%).

Updating the parameters in separate blocks prevents β and μ from drifting jointly along the non-identifiable direction they share through the branch lengths; BEAST’s operator schedule relies on the same mechanism.

Metropolis et al. (1953) J. Chem. Phys. 21:1087.
Hastings (1970) Biometrika 57:97–109.
Drummond et al. (2002) Genetics 161:1307–1320.
Drummond & Rambaut (2007) BMC Evol. Biol. 7:214.
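The parameter blocks (A and B) can be sketched as a Metropolis–Hastings loop with log-scale proposals. This is a simplified illustration under stated assumptions: the tree block C is omitted, `log_post` stands in for the full log posterior, the function names are hypothetical, and the same step size is used for both blocks. A log-uniform multiplier m carries a Hastings ratio of m; the text does not say how the demo handles this term, so its inclusion here is an assumption.

```python
import math
import random

def log_scale_proposal(x, step, rng):
    """Multiply x by m = exp(step * (u - 0.5)); return (x * m, log m)."""
    m = math.exp(step * (rng.random() - 0.5))
    return x * m, math.log(m)          # log m is the log Hastings ratio

def run_chain(log_post, init, n_iter, step=0.5, seed=0):
    """Block-update MH over a dict of positive parameters (blocks A and B)."""
    rng = random.Random(seed)
    blocks = [("mu",), ("beta", "delta")]   # Block A, Block B
    state, lp = dict(init), log_post(init)
    samples, accepted = [], 0
    for _ in range(n_iter):
        block = rng.choice(blocks)          # pick one operator at random
        proposal, log_hastings = dict(state), 0.0
        for name in block:
            proposal[name], lh = log_scale_proposal(state[name], step, rng)
            log_hastings += lh
        lp_new = log_post(proposal)
        log_ratio = lp_new - lp + log_hastings
        if rng.random() < math.exp(min(0.0, log_ratio)):
            state, lp, accepted = proposal, lp_new, accepted + 1
        samples.append(dict(state))
    return samples, accepted / n_iter

# Toy target (an assumption for illustration): independent log-normal densities.
def toy_log_post(s):
    return sum(-0.5 * math.log(v) ** 2 - math.log(v) for v in s.values())

samples, rate = run_chain(toy_log_post, {"mu": 1.0, "beta": 2.0, "delta": 1.0}, 2000)
```

Only the parameters in the chosen block change in a given iteration, so a proposal never moves μ and β at the same time.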

6 — Root-Age Calibration & Convergence Diagnostics

A Gaussian root-age calibration prior (analogous to a fossil calibration in BEAST) anchors the absolute time scale; without it, β and μ are non-identifiable. The posterior is summarised from the last 50% of samples, with the first 50% discarded as burn-in, using:

ESS — effective sample size via integrated autocorrelation time; values <200 indicate the chain needs more iterations.
95% HPD — the shortest interval containing 95% of the post-burnin samples.

Drummond et al. (2006) PLOS Biology 4:e88.
Bouckaert et al. (2019) PLOS Comput. Biol. 15:e1006650.
Robinson & Foulds (1981) Math. Biosci. 53:131–147 (RF distance).
Bouckaert (2010) Bioinformatics 26:1372–1373 (DensiTree).
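Both diagnostics can be sketched in a few lines of plain Python. The truncation rule for the autocorrelation sum (stop at the first non-positive lag) is an assumption chosen for simplicity; tools such as Tracer use their own estimators, so values will not match exactly. The function names are hypothetical.

```python
import math

def ess(x):
    """Effective sample size n / tau, tau = 1 + 2 * sum of autocorrelations.

    Assumption: the sum is truncated at the first non-positive lag.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return float(n)
    tau = 1.0
    for lag in range(1, n):
        rho = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n * var)
        if rho <= 0:
            break
        tau += 2.0 * rho
    return n / tau

def hpd(x, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    s = sorted(x)
    n = len(s)
    k = max(1, math.ceil(mass * n))       # samples inside the interval
    width, i = min((s[j + k - 1] - s[j], j) for j in range(n - k + 1))
    return s[i], s[i + k - 1]

def summarise(trace, burn_in=0.5):
    """Discard the first burn_in fraction, then report mean, ESS and HPD."""
    post = trace[int(len(trace) * burn_in):]
    return {"mean": sum(post) / len(post), "ess": ess(post), "hpd95": hpd(post)}
```

A strongly autocorrelated trace yields an ESS far below the raw number of samples, which is the situation the threshold of 200 is meant to flag.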

Limitations: This demo uses a Yule approximation (ignoring full birth-death conditioning), JC69 with no among-site rate variation, and a calibration that uses the known true root age rather than a genuine fossil prior. Real analyses (e.g. in BEAST2 or MrBayes) typically use a full birth-death likelihood such as the Stadler FBD likelihood, richer substitution models such as GTR+Γ, and empirical fossil priors.