The quality of parameter estimation depends both on the number of parameters to estimate (kᵐ⁺¹ for an order m Markov model) and on the length n of the homogeneous sequence used to estimate them. When the same sequence (or set of sequences) is used both for the observed frequencies and for parameter estimation, m should not exceed h – 2 for a pattern of length h (otherwise the observed frequency of the pattern is itself included in the model). As the literature often suggests using the highest possible order, it is common to consider m = 6 or more (for a DNA pattern of length h ≥ 8). Moreover, because of the homogeneity assumption of the model, the genomes under consideration often have to be segmented first. Such segmentation often dramatically reduces the sequence lengths available for parameter estimation (e.g. n = 10⁵ to n = 10⁶ at the very best for DNA sequences). It is hence quite common to encounter high-order Markov models estimated on rather short sequences, which could result in high sensitivity to parameter estimation.
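
To give a rough sense of the orders of magnitude involved, the following sketch compares the number of transition probabilities of an order m model with the number of words available to estimate them. It is only an illustration under stated assumptions: the alphabet size is taken as k = 4 (DNA), the parameter count is read as k^(m+1), the function names are hypothetical, and the "average count" assumes that all words are equally frequent, which real sequences do not satisfy (rare contexts receive far fewer observations than the average).

```python
def n_parameters(m: int, k: int = 4) -> int:
    """Transition probabilities of an order-m Markov model on k letters: k^(m+1)."""
    return k ** (m + 1)


def max_order(h: int) -> int:
    """Largest order allowed for a pattern of length h when the same
    sequence provides both the observed counts and the estimates (m <= h - 2)."""
    return h - 2


def mean_count_per_parameter(n: int, m: int, k: int = 4) -> float:
    """Average number of observed (m+1)-letter words per transition probability.

    A sequence of length n contains n - m overlapping words of length m + 1.
    Assumes a uniform word composition, which is an idealisation.
    """
    return (n - m) / n_parameters(m, k)


if __name__ == "__main__":
    h = 8                      # pattern length taken from the text
    m = max_order(h)           # highest admissible order, here 6
    for n in (10**5, 10**6):   # sequence lengths quoted in the text
        print(
            f"n = {n:>7}, m = {m}: "
            f"{n_parameters(m)} parameters, "
            f"about {mean_count_per_parameter(n, m):.1f} words per parameter"
        )
```

With h = 8 and m = 6, the model has 4⁷ = 16384 transition probabilities, so a segment of length 10⁵ provides only about six observed words per parameter on average, which illustrates why the estimates can be unstable.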