Gaps in the multiple sequence alignment are treated as missing data. Maximum likelihood estimation is used to set the parameters of the 10 sequence models and the state transitions. Training consists of iterating through labeled examples of alternatively spliced exons and constitutive exons. Substitution rates are set by randomly selecting values within the range 0.5 to 2.5 and adjusting them to maximize the F-score, F = (2 × Sn × Sp)/(Sn + Sp), on the training set. Here sensitivity (Sn) is the percentage of protein-coding nucleotides that are correctly labeled, and specificity (Sp) is the percentage of predicted protein-coding nucleotides that are correctly labeled.
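
As an illustration of the nucleotide-level F-score and the rate selection described above, the following Python sketch may be helpful. It is not the implementation used in this work. The function names, the `score_fn` callback (a hypothetical stand-in for running the model on the training set with a given substitution rate and returning the resulting F-score), the number of trials, and the use of plain random search as the adjustment step are all assumptions made for illustration. Sn and Sp are computed here as fractions rather than percentages, so F also lies between 0 and 1.

```python
import random


def nucleotide_f_score(truth, predicted):
    """Nucleotide-level F-score from two equal-length boolean sequences.

    truth[i] is True if nucleotide i is annotated as protein coding;
    predicted[i] is True if the model labels nucleotide i as protein coding.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    correct = sum(1 for t, p in zip(truth, predicted) if t and p)
    if correct == 0:
        # No correctly labeled coding nucleotide: Sn or Sp is 0 or undefined.
        return 0.0
    sn = correct / sum(1 for t in truth if t)      # sensitivity
    sp = correct / sum(1 for p in predicted if p)  # specificity as defined in the text
    return 2 * sn * sp / (sn + sp)


def random_search_rate(score_fn, low=0.5, high=2.5, trials=200, seed=0):
    """Draw candidate rates uniformly from [low, high] and keep the best one.

    score_fn is a hypothetical callback mapping a rate to a training-set F-score.
    """
    rng = random.Random(seed)
    best_rate, best_score = None, float("-inf")
    for _ in range(trials):
        rate = rng.uniform(low, high)
        score = score_fn(rate)
        if score > best_score:
            best_rate, best_score = rate, score
    return best_rate, best_score


# Four nucleotides, two annotated as coding, two predicted as coding, one shared:
# Sn = 0.5 and Sp = 0.5, so F = 0.5.
example_f = nucleotide_f_score([True, True, False, False],
                               [True, False, True, False])
```

In practice `score_fn` would wrap model decoding on the labeled training exons followed by a call to `nucleotide_f_score`; a local refinement step around the best sampled rate could replace or follow the random search.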