Discussion

In this work, we applied computational analyses to explore genome content and genetic diversity in the recently sequenced C. pneumoniae koala LPCoLN genome and the previously published C. pneumoniae human genomes (AR39, CWL029, J138 and TW183). The koala LPCoLN genome is larger than all four C. pneumoniae human genomes by 10-12 kbp. We combined BLAST searches and motif analysis to elucidate the relationship between gene function and evolution. Even though these techniques have several limitations [40], our comparative approach has (i) identified genome plasticity, (ii) provided circumstantial evidence for the presumed direction of C. pneumoniae evolution, and (iii) suggested targets for detection and differentiation of C. pneumoniae isolates from both human and animal origins.

The presence of unique insertions/deletions is evidence of evolution in action. For the majority of the affected genes, the C. pneumoniae koala genome has the full-length version. These length polymorphisms suggest that the presumed functional changes are brought about by adaptation to a specialised niche, where the ancestral gene function may no longer be required. It is therefore important to understand how these genetic differences may influence pathogenicity and fitness in the host. Our data support the findings of Rattei et al. [41] and the whole genome findings of Myers et al. [24] in that the essentially clonal human isolates have evolved from an animal strain(s) that has adapted to humans through fragmentation, decay and loss-of-function processes whereby the activity of the gene product may be reduced or specialised. Hence, the koala LPCoLN genome seems to represent an 'older' strain in this sense. In addition, we provide new information on strain diversity and have identified targets for detection and further investigation.

Our analysis revealed a total of 140 genes that were specific to C. pneumoniae.
One hundred and twenty-three of these represented hypothetical genes with no significant similarity to genes in other organisms present in the database. Further analysis of these hypothetical genes (subcellular localisation, gene expression analysis, functional profiling from microarray analysis) may reveal undiscovered biovars or subspecies in C. pneumoniae.

The Pmp family is characterised by an unusual degree of sequence polymorphism, including mutations and large indels, across all species [42-47], and it also showed variation within C. pneumoniae. This suggests that the pmp gene family is subjected to high selective pressure (niche, host-specific or immune-mediated), correlating with a relatively faster evolutionary rate for these antigens. Taken together, the polymorphism of pmp sequences in C. pneumoniae from humans and animals is consistent both with divergent evolution of the pmp genes under host-specific selection and with retention of the capacity to adapt to specific niches or immune responses in the two different hosts.

In light of the variation seen between other families of genes and their orthologs in the human isolates, notably the Pmp protein family, the strict conservation of T3S effector genes was initially surprising given the effectors' normal tendency for divergence. While the differences observed between other regions of the genomes are consistent with evolutionary changes [24], the relative conservation of effector genes over an equivalent time suggests that changes in genes encoding effector proteins were likely selected against. This is consistent with a key role of T3S effectors in mediating steps of the biology of C. pneumoniae that are conserved in human and animal strains, such as inclusion and intracellular development. Overall, these results support a critical role for T3S less in the virulence than in the developmental biology of these organisms.
Such a role was proposed earlier in the contact-dependent, T3S-mediated hypothesis of chlamydial development [48,49].

Orthologs of the membrane attack complex/perforin (MACPF) protein were identified in several chlamydial species. The first biological characterisation of the C. trachomatis MACPF by Taylor et al. [50] revealed that the MACPF (CT153) might be activated by proteolytic processing and may play a role in the acquisition or modification of host-derived lipids. By contrast, studies of the MACPF in other organisms, including Toxoplasma spp., have shown that ablation of the MACPF (termed TgPLP1) resulted in a reduction in virulence (in mice): TgPLP1-deficient parasites were unable to exit normally and were entrapped within host cells because they could not permeabilise the parasitophorous vacuole membrane [51]. If the chlamydial MACPF were to play a similar role in egression or virulence, then why have several species failed to retain this gene? Non-lytic family members have also been identified in other organisms, including Astrotactin, involved in neural migration in mammals [52], a Drosophila torso-like protein involved in embryonic development [53] and Plu-MACPF of Photorhabdus luminescens, which binds to the surface of insect cells [54]. Further investigation of this gene should provide more insight into its role in Chlamydiaceae.

C. pneumoniae is the only chlamydial species thus far to have a udk gene encoding uridine kinase. The udk gene encodes a pyrimidine ribonucleoside kinase that phosphorylates uridine and cytidine to uridine and cytidine monophosphate (UMP and CMP) [55], and it is highly conserved in the species. It has been reported that the Prevotella bryantii genome encodes a putative uracil DNA glycosylase and uridine kinase, likely to be involved in the removal of misincorporated uracil from DNA and its subsequent re-use [56]. All chlamydial genomes also encode a uracil DNA glycosylase; however, C. pneumoniae is the only species carrying the udk ortholog. This implies that an alternative gene product is involved in UMP production in the other chlamydial species. The C. muridarum genome includes a CDS (upp) encoding a uracil phosphoribosyltransferase [25] that may represent the main pathway for UMP production in this species. C. pneumoniae is unusual in having a very broad host range, and the fact that it is the only chlamydial species to have retained the udk gene could reflect this broad host capacity.

Most bacteria can salvage or synthesise their own purines and pyrimidines. By contrast, chlamydiae and rickettsiae (another group of obligate intracellular bacteria) are incapable of de novo synthesis and, to a degree, of salvage [31,57]. Given the absence of genes for enzymes upstream in the pyrimidine biosynthesis pathway, it is unclear why pyrE should be retained. The final step in the pathway is catalysed by the product of pyrF, which is absent from the chlamydial genomes, suggesting that chlamydiae are unable to convert orotate to UMP. The presence in all chlamydial genomes of orthologs encoding the three downstream enzymes involved in UMP to CTP conversion and the earlier demonstration of CTP synthetase activity in these organisms [58,59] confirm that chlamydiae are not auxotrophic for CTP.

Furthermore, a three-gene cluster comprising guaB, guaA and add has been selectively maintained in several chlamydial species including C. pneumoniae AR39, CWL029, TW183 and J138, C. felis Fe/C-56, C. caviae GPIC and C. muridarum Nigg. C. abortus has a guaB pseudogene, whereas the C. pneumoniae LPCoLN, C. trachomatis serovar A/HAR, B/Jali20/OT, D/UQ-3/CX, L2b/UCH-1/proctitis and L2/434/Bu, and Candidatus Protochlamydia amoebophila UWE25 genomes lack all three genes [25,26,30,35,36,60]. The selective loss of guaBA-add from C. pneumoniae koala LPCoLN and other chlamydial species suggests that the functions of these three enzymes, which are required for inter-conversion of GMP, IMP and AMP, must be acquired by other means or are not essential for species survival.

Copy number variations of the tyrP (tryptophan tyrosine permease) gene have been suggested to reflect vascular tropism and pathogenicity among C. pneumoniae human isolates, with multiple copies associated with respiratory infection and a single copy more frequently associated with vascular tropism [33]. A comparison of the five sequenced C. pneumoniae genomes also revealed variations in tyrP copy number; these are, however, inconsistent with the hypothesis of Gieffers et al. [33]. Among these genomes, two respiratory isolates (koala LPCoLN and human J138) [24,30], as well as the single conjunctival isolate (TW183) of the group (Geng MM, Schuhmacher A, Muehldorfer I, Bensch KW, Schaefer KP, Schneider S, Pohl T, Essig A, Marre R, Melchers K: The genome sequence of Chlamydia pneumoniae TW183 and comparison with other Chlamydia strains based on whole genome sequence analysis, submitted), have a single tyrP copy, while two other respiratory isolates (CWL029 and AR39) [25,26] carry duplicate copies of tyrP.

The loss and fragmentation of pre-existing genes during evolution is one of the primary distinguishing features between C. pneumoniae koala and C. pneumoniae human. Extrachromosomal plasmids have been identified in six of the nine chlamydial species. As plasmids are not common in C. pneumoniae, it is not known why koala LPCoLN carries one. While proteins with predicted or known biologic function are favoured candidate gene targets, many C. pneumoniae-specific hypothetical proteins with no predicted function were identified in the comparisons and may be worth investigating further for their potential role in host tropism, pathogenicity and niche adaptation (see Additional file 2 for the list of genes).
A suggested list of target genes for C. pneumoniae detection and a brief description of their characteristics are summarised in Additional file 11. Selected genes include (i) C. pneumoniae-specific genes for detection of C. pneumoniae, (ii) genes that could potentially differentiate isolates from human and animal origins, for example, length polymorphic genes including the membrane attack complex perforin and the hypothetical protein CPK_ORF00679, and (iii) genes for the identification of a C. pneumoniae plasmid.
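
As an illustration of the tyrP copy-number argument made earlier in this Discussion, the comparison of the five genomes can be laid out as a small table and checked programmatically. The following Python sketch is not part of the original analysis: the data values are taken only from the text above, "duplicate copies" is assumed to mean a copy number of 2, and the function names are hypothetical. If copy number tracked tissue tropism, isolates from the same site would be expected to share one copy number; the sketch simply reports the sites where they do not.

```python
from collections import defaultdict

# (strain, host, isolation site, tyrP copy number) as stated in the Discussion.
# Assumption: "duplicate copies" is encoded as a copy number of 2.
ISOLATES = [
    ("LPCoLN", "koala", "respiratory", 1),
    ("J138", "human", "respiratory", 1),
    ("TW183", "human", "conjunctival", 1),
    ("CWL029", "human", "respiratory", 2),
    ("AR39", "human", "respiratory", 2),
]


def copies_by_site(isolates):
    """Group the observed tyrP copy numbers by isolation site."""
    grouped = defaultdict(set)
    for _strain, _host, site, copies in isolates:
        grouped[site].add(copies)
    return {site: sorted(values) for site, values in grouped.items()}


def sites_with_mixed_copy_number(isolates):
    """Return the sites whose isolates do not share a single copy number."""
    grouped = copies_by_site(isolates)
    return sorted(site for site, values in grouped.items() if len(values) > 1)


if __name__ == "__main__":
    for site, values in copies_by_site(ISOLATES).items():
        print(site, values)
    print("mixed:", sites_with_mixed_copy_number(ISOLATES))
```

With only five genomes this is a consistency check rather than a statistical test; it merely makes explicit that respiratory isolates carry both single and duplicate tyrP copies.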