Discussion

Recent research has clearly demonstrated that the rate of emergence of human pathogens continues to increase steadily, and that the source of the majority of these agents is wildlife [1]. One of the most intriguing recent findings is that bats are the natural reservoirs of many of the viruses most pathogenic to humans. While it is known that microbe-host coevolution drives pathogenicity in the natural host, the effect of such coevolution on alternative hosts has not been described. The development of many genome sequencing projects extending beyond domesticated animals provides an opportunity to begin such inquiries.

The low coverage levels of these projects, and the fact that so many genes with immunological functions appear in large families of very similar genes, require the development of more precise inferential tools for their study. Toward that end, we have developed a method for assembling genome sequence fragments to infer gene family members when the genome coverage is too low for reliable complete assembly. We validated the method on raw, unassembled traces from the human genome sequencing project, finding that we could assemble the known interferon alpha sequences accurately and, further, identify at least one case in which two alleles are present. It should be noted that our method requires more computation than assembly methods currently in use, owing to the numerical minimization over the mutation frequency required in Eq. (10), and is therefore not suitable for large-scale assembly.

Using this method, we inferred a total of 61 type-I interferon genes with intact ORFs from the whole-genome shotgun trace archives for two chiropteran species, Pteropus vampyrus and Myotis lucifugus. We find that the largest of the IFN-I gene families in both bats comprises genes orthologous to the IFNW genes in other mammals. In humans, mice and pigs there is just one IFNW gene, whereas each bat has up to two dozen members.
A recent analysis of bovine type-I interferons from the assembled Bos taurus genome [32] finds 26 intact IFNW genes, providing precedent for our otherwise striking results. In contrast, the IFNA family is the largest IFN-I family in several mammals, including humans, mice and pigs, but appears to be smaller in Pteropus and absent but for pseudogenes in Myotis. The gene family assembly from trace archives indicates that there are 7 intact IFNA genes in Pteropus; analysis of direct sequencing from PBMCs gives maximum posterior probability to the presence of 9 IFNA genes (Figure 4), with a 90% credible interval containing 6 to 16 genes. Cattle have 13 IFNA genes in spite of having a greatly enlarged IFNW family [32].

Both bats have multiple members in the IFND family, with five intact members and seven pseudogenes in Pteropus, and twelve intact members in Myotis. IFND has been found as a functional gene only in pigs, where the gene product is expressed in the placenta and plays a role in embryonic development [33], but is not suspected of involvement in the response to viruses. It is striking to us that this family seems to have been so dramatically enlarged in bats. The size of this family suggests that it may still be involved in host defense in bats even if it has lost that function in pigs. Walker and Roberts [32] report finding three IFND pseudogenes, but no intact IFND gene, in the cow. It is worth noting that the placental type-I interferon in cattle is IFNT rather than IFND [34]. Searching the bat trace archives with bovine IFNT did not produce hits that had not already been returned by the other searches.

We find no evidence of IFNT or IFNZ in either bat. For IFNB, IFNK, and IFNE, we find one member of each in P. vampyrus, and one or two in M. lucifugus. In the cases where we do find two genes, we are confident that the genes are distinct, though they may represent alleles rather than paralogs.

We used the inferred sequences for P.
vampyrus IFNA, IFNB, IFND, and IFNK to design oligonucleotide primers for cloning and sequencing, and recovered a total of 110 sequences. The directly cloned sequences validate many of the gene family inferences; most of the repeated sequences are within a few bases of the nearest inferred gene. A minority of the directly sequenced genes are surprisingly far from the nearest inferred gene. This circumstance may indicate that there are additional type-I interferon genes not covered by the Pteropus sequencing traces, and may also reflect significant polymorphism in the wild bat population.

We used these same primers to show, using quantitative RT-PCR, that the P. vampyrus candidate IFNB and OAS2 (a gene induced by IFNB in other mammals) are expressed upon stimulation by type-I interferon-inducing agents. Furthermore, the temporal trajectories of this expression are consistent with the known mechanisms of such signaling. The expression of IFNB under viral infection was delayed compared to that under stimulation by the TLR ligands poly(I:C) and LPS.
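
As an illustration of the comparison between directly cloned sequences and inferred genes described above, the following minimal Python sketch finds the inferred gene nearest to a cloned sequence by counting mismatched bases. It is not the procedure used in this study: it assumes the sequences have already been aligned to a common length, and the function names (hamming, nearest_inferred_gene), the gene labels, and the toy sequences are all hypothetical.

```python
from typing import Dict, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def nearest_inferred_gene(clone: str, inferred: Dict[str, str]) -> Tuple[str, int]:
    """Return (name, mismatches) for the inferred gene closest to the clone."""
    return min(
        ((name, hamming(clone, seq)) for name, seq in inferred.items()),
        key=lambda pair: pair[1],
    )


# Toy sequences for illustration only; these are not data from the study.
inferred_genes = {
    "geneA": "ATGGCCTTGA",
    "geneB": "ATGGCATTCA",
}
clone = "ATGGCCTTGT"

name, mismatches = nearest_inferred_gene(clone, inferred_genes)
print(name, mismatches)
```

A clone whose smallest mismatch count is only a few bases would be read as supporting the corresponding inferred gene, whereas a large minimum distance would flag a candidate for an uncovered gene or a polymorphic allele. A real analysis would need a proper alignment that handles insertions and deletions rather than a position-by-position comparison.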