Evolutionary Model
Jukes-Cantor (1969): Assumes all nucleotide substitutions occur at the same rate. For a branch of length t, measured in expected substitutions per site, the transition probability is P(i→j|t) = 0.25 + 0.75e^(-4t/3) when i = j, and 0.25 - 0.25e^(-4t/3) when i ≠ j
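A minimal sketch of the transition probability above, assuming t is given in expected substitutions per site. The function name jc69_prob is illustrative and not taken from the original.

```python
import math

def jc69_prob(i: str, j: str, t: float) -> float:
    """Jukes-Cantor probability of ending in base j after branch length t, starting from base i."""
    decay = math.exp(-4.0 * t / 3.0)
    if i == j:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay

# Each row of the transition matrix should sum to 1.
row = [jc69_prob("A", b, 0.1) for b in "ACGT"]
row_sum = sum(row)
```

At t = 0 the probability of staying in the same base is 1, and as t grows every entry approaches 0.25, the stationary base frequency.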
Tree Construction
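UPGMA (Unweighted Pair Group Method with Arithmetic Mean): Builds the tree by repeatedly merging the two closest clusters, starting from single sequences, and averaging distances over cluster members after each merge. Distances are pairwise Hamming distances. Assumes a molecular clock (a constant substitution rate across lineages), so the resulting tree is rooted and ultrametric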
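The merging procedure can be sketched as follows. This is one possible implementation under stated assumptions, not necessarily the one used here: the input format (a dict keyed by frozenset pairs), the nested-tuple tree, and the example distances are all hypothetical.

```python
from itertools import combinations

def upgma(dist):
    """Cluster leaves with UPGMA.

    dist maps frozenset({x, y}) -> distance for every pair of leaf labels.
    Returns (tree, height): tree is a nested tuple of labels, and height
    maps every node (leaf label or tuple) to its height above the leaves.
    """
    leaves = sorted({x for pair in dist for x in pair})
    size = {x: 1 for x in leaves}       # number of leaves in each cluster
    height = {x: 0.0 for x in leaves}   # leaves sit at height 0
    d = dict(dist)
    active = list(leaves)
    while len(active) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(active, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Under the clock assumption the new node sits halfway between a and b.
        height[merged] = d[frozenset((a, b))] / 2.0
        size[merged] = size[a] + size[b]
        rest = [c for c in active if c not in (a, b)]
        for c in rest:
            # Size-weighted average equals the mean over all leaf pairs.
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / size[merged]
        active = rest + [merged]
    return active[0], height

# Hypothetical distances for four sequences.
pairs = {("A", "B"): 0.2, ("C", "D"): 0.4,
         ("A", "C"): 0.8, ("A", "D"): 0.8, ("B", "C"): 0.8, ("B", "D"): 0.8}
example_dist = {frozenset(k): v for k, v in pairs.items()}
tree, height = upgma(example_dist)
```

The branch length from a node to its parent is the difference of their heights, which is why every leaf ends up equidistant from the root.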
Branch Lengths
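Estimated from pairwise Hamming distances normalized by alignment length (the proportion of differing sites), without correction for multiple substitutions at the same site. Branch lengths are in substitutions per site, which under the molecular clock is proportional to evolutionary time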
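A short sketch of the distance described above, assuming equal-length, already aligned sequences. The function names are illustrative. The second function is the standard Jukes-Cantor distance correction, d = -(3/4) ln(1 - 4p/3); it is included only for comparison and is not something the text above says is applied.

```python
import math

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(x != y for x, y in zip(seq_a, seq_b))
    return mismatches / len(seq_a)

def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance; defined only for p < 0.75."""
    if p >= 0.75:
        raise ValueError("correction undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

p = p_distance("ACGTACGT", "ACGTACGA")  # one mismatch in eight sites: 0.125
d = jc69_distance(p)                    # slightly larger than p
```

The corrected value always exceeds the raw proportion for p > 0, because some substitutions are hidden by later changes at the same site.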
Likelihood Calculation
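Felsenstein's pruning algorithm computes conditional likelihoods at each node in a post-order traversal (leaves first, root last), marginalizing over all possible ancestral states at internal nodes. The likelihood of a site is the sum of the root's conditional likelihoods weighted by the base frequencies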
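The recursion for a single site might look like the sketch below. The tree encoding is an assumption made for illustration: a leaf is its observed base, and an internal node is a tuple (left, t_left, right, t_right) with branch lengths in substitutions per site.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """Jukes-Cantor transition probability for branch length t."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if i == j else 0.25 - 0.25 * decay

def conditional_likelihoods(node):
    """Return {base: likelihood of the data below node, given that base at node}."""
    if isinstance(node, str):
        # Leaf: the observed base has likelihood 1, all others 0.
        return {b: 1.0 if b == node else 0.0 for b in BASES}
    left, t_left, right, t_right = node
    l_left = conditional_likelihoods(left)     # children are visited first
    l_right = conditional_likelihoods(right)   # (post-order traversal)
    result = {}
    for x in BASES:
        # Sum over every possible state at each child.
        from_left = sum(jc69_prob(x, y, t_left) * l_left[y] for y in BASES)
        from_right = sum(jc69_prob(x, y, t_right) * l_right[y] for y in BASES)
        result[x] = from_left * from_right
    return result

def site_likelihood(tree):
    """Weight the root's conditional likelihoods by the equal base frequencies."""
    root = conditional_likelihoods(tree)
    return sum(0.25 * root[x] for x in BASES)

# Three leaves: ((A, C), A) with hypothetical branch lengths.
example_tree = (("A", 0.1, "C", 0.1), 0.2, "A", 0.3)
lik = site_likelihood(example_tree)
```

Because sites are assumed to evolve independently, the likelihood of a whole alignment is the product of the per-site values, usually accumulated as a sum of logarithms to avoid underflow.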
Assumptions
• Sites evolve independently
• Equal base frequencies (0.25 each)
• No rate variation across sites
• Time-reversible process
Ancestral Reconstruction
At each internal node, the state with the highest conditional likelihood from the pruning algorithm is chosen. These conditional likelihoods reflect only the sequences in the subtree below the node
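As a small illustration of the selection step, the sketch below computes the conditional likelihoods for a node with two leaf children and picks the highest-scoring state. The helper names and branch lengths are hypothetical, and ties are reported rather than broken arbitrarily, which is a design choice of this sketch.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """Jukes-Cantor transition probability for branch length t."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if i == j else 0.25 - 0.25 * decay

def cherry_conditionals(left_base, right_base, t_left, t_right):
    """Conditional likelihoods at a node whose two children are observed leaves."""
    return {x: jc69_prob(x, left_base, t_left) * jc69_prob(x, right_base, t_right)
            for x in BASES}

def best_states(cond, rel_tol=1e-9):
    """All bases whose conditional likelihood matches the maximum."""
    top = max(cond.values())
    return [b for b in BASES if math.isclose(cond[b], top, rel_tol=rel_tol)]

agree = best_states(cherry_conditionals("A", "A", 0.1, 0.1))
closer = best_states(cherry_conditionals("A", "C", 0.1, 0.3))
tied = best_states(cherry_conditionals("A", "C", 0.2, 0.2))
```

When the children disagree, the state of the child on the shorter branch wins; with equal branch lengths the two observed bases tie, so some tie-breaking rule is needed in practice.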