Connect-the-Dots

Connect-the-Dots connects identifiers for genes and other entities based on information extracted from multiple data sources. It provides methods for parsing data sources to extract identifiers and the connections among them, and for loading this information into an internal database. Users can query the database to connect identifiers from any number of sources by following paths composed of the parsed connections. For example, to find literature citations about genes of interest on an Affymetrix chip, a query can connect Affymetrix probeset identifiers to LocusLink identifiers using information from Affymetrix's annotation files, and then connect the LocusLink identifiers to PubMed identifiers using information in NCBI's LocusLink files. Longer and more complex paths are also possible. Queries are expressed in a special-purpose query language and are translated into SQL by the software. The system can be used interactively over the Web, or as a batch resource to create specialized translation tables for specific purposes. Many of the translation tables used internally by T1DBase are constructed in this manner.

The current Connect-the-Dots database has information from LocusLink, UniGene (human, mouse and rat), OMIM, IPI, UniProt, HomoloGene, DoTS, several Affymetrix chips, and human and mouse PancChips (pancreas/islet-specific microarrays). The database contains 20 million unique identifiers and 42 million connections extracted from 2 million data source entries.
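
To make the path-following idea concrete, the following is a minimal sketch in Python of how a path such as Affymetrix probeset to LocusLink to PubMed might be translated into SQL. The text does not describe the actual query language or database schema of Connect-the-Dots, so everything here is an assumption made for illustration: the single `connection` table, its column names, the helper functions `build_demo_db`, `path_to_sql` and `connect`, and all identifier values (which are made up). Connections are also treated as directed, which the real system may not do.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical connection table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE connection ("
        " src_type TEXT, src_id TEXT, dst_type TEXT, dst_id TEXT)"
    )
    rows = [
        # Made-up identifiers, for illustration only.
        ("affy_probeset", "probe_A", "locuslink", "LL_1"),
        ("affy_probeset", "probe_B", "locuslink", "LL_2"),
        ("locuslink", "LL_1", "pubmed", "PM_100"),
        ("locuslink", "LL_1", "pubmed", "PM_101"),
        ("locuslink", "LL_2", "pubmed", "PM_101"),
    ]
    db.executemany("INSERT INTO connection VALUES (?, ?, ?, ?)", rows)
    return db


def path_to_sql(path):
    """Translate a list of identifier types into a chain of self-joins.

    Each hop in the path becomes one alias (c0, c1, ...) of the
    connection table; consecutive hops are joined on the shared identifier.
    """
    if len(path) < 2:
        raise ValueError("a path needs at least two identifier types")
    hops = len(path) - 1
    sql = "SELECT DISTINCT c0.src_id, c{}.dst_id FROM connection c0".format(hops - 1)
    for i in range(1, hops):
        sql += (
            " JOIN connection c{i}"
            " ON c{i}.src_type = c{p}.dst_type AND c{i}.src_id = c{p}.dst_id"
        ).format(i=i, p=i - 1)
    conditions = []
    params = []
    for i in range(hops):
        conditions.append("c{i}.src_type = ? AND c{i}.dst_type = ?".format(i=i))
        params.extend([path[i], path[i + 1]])
    sql += " WHERE " + " AND ".join(conditions) + " ORDER BY 1, 2"
    return sql, params


def connect(db, path):
    """Return (start_id, end_id) pairs linked by the given path of types."""
    sql, params = path_to_sql(path)
    return db.execute(sql, params).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for start, end in connect(demo, ["affy_probeset", "locuslink", "pubmed"]):
        print(start, "->", end)
```

In this sketch a longer path simply adds one more self-join per hop, and the resulting pairs could be saved as a translation table of the kind mentioned above. A real deployment with tens of millions of connections would also need indexes on the joined columns.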