Selection of uncharacterized domain families

The characterization state of a protein domain is dynamic: it depends both on the available experimental literature and on the perspective of the observing scientist. Using the Pfam database [7], we extracted approximately 3000 protein domain families for which we judged minimal biochemical annotation to be available (hence the name NovelFam3000). We limited our search to protein families present in genes from three metazoan genomes (worm, fly, and human) and having multiple human protein members. Applying these criteria, we extracted 2785 Pfam-B domain families and 127 Domains of Unknown Function (DUF) families. The Pfam-B and DUF classes are distinguished by the level of human curation: Pfam-B domains are derived purely from computational analysis, whereas DUFs have been subjected to curator review. Of the selected families, 892 (32%) of the Pfam-B domains and 59 (46%) of the DUFs included at least one yeast protein member.
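The species-based part of these criteria can be illustrated with a minimal sketch. It is not the pipeline used for NovelFam3000: the `DomainFamily` record, the function names, and the toy family labels are hypothetical, and two readings of the text are assumed, namely that a family must have at least one member in each of worm, fly, and human, and that "multiple" human members means two or more. The manual judgment of minimal biochemical annotation is not modeled.

```python
from dataclasses import dataclass, field

# Hypothetical record for one domain family; not a Pfam data structure.
@dataclass
class DomainFamily:
    label: str                      # toy identifier, not a real accession
    kind: str                       # "Pfam-B" or "DUF"
    members: dict = field(default_factory=dict)  # species -> protein count

# Assumption: the family must occur in all three metazoan genomes.
REQUIRED_SPECIES = ("worm", "fly", "human")
# Assumption: "multiple human protein members" is read as two or more.
MIN_HUMAN_MEMBERS = 2


def is_selected(family):
    """Return True if the family meets the assumed species criteria."""
    in_all_three = all(family.members.get(sp, 0) >= 1 for sp in REQUIRED_SPECIES)
    enough_human = family.members.get("human", 0) >= MIN_HUMAN_MEMBERS
    return in_all_three and enough_human


def select_families(families):
    """Keep only the families that pass is_selected, preserving order."""
    return [f for f in families if is_selected(f)]


def yeast_fraction(families):
    """Fraction of families with at least one yeast member (0.0 if empty)."""
    if not families:
        return 0.0
    with_yeast = sum(1 for f in families if f.members.get("yeast", 0) >= 1)
    return with_yeast / len(families)


if __name__ == "__main__":
    # Toy input, invented purely for illustration.
    toy = [
        DomainFamily("fam_a", "Pfam-B", {"worm": 1, "fly": 1, "human": 2, "yeast": 1}),
        DomainFamily("fam_b", "Pfam-B", {"worm": 1, "fly": 1, "human": 1}),
        DomainFamily("fam_c", "DUF", {"worm": 2, "fly": 1, "human": 3}),
    ]
    selected = select_families(toy)
    print([f.label for f in selected], yeast_fraction(selected))

    # Consistency check of the percentages reported in the text.
    print(round(100 * 892 / 2785), round(100 * 59 / 127))
```

The final line recomputes the reported yeast proportions from the counts given in the text (892 of 2785 Pfam-B families and 59 of 127 DUFs), which round to 32% and 46% respectively.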