Research Paper

Using whole genome presence/absence data to untangle function in 12 Drosophila genomes

Volume 2, Issue 6   November/December 2008
Pages 291 - 299
Authors: Jeffrey A. Rosenfeld, Ernest K. Lee, Patrick M. O'Grady and Rob DeSalle

Abstract:
The Drosophila 12 genome data set was used to construct whole genome, gene family presence/absence matrices using a broad range of E-value cutoffs as criteria for gene family inclusion. The various matrices generated behave differently in phylogenetic analyses as a function of the E-value employed. Based on an optimality criterion that maximizes internal corroboration of information, we show that values of e-105 to e-125 extract the most internally consistent phylogenetic signal. The functional class of most genes and gene families can be accurately determined based on the D. melanogaster genome annotation. We used the Gene Ontology (GO) system to create partitions based on gene function. Several measures of phylogenetic congruence (diagnosis, consistency, partitioned support, hidden support) for different higher- and lower-level GO categories were used to mine the data set for genes and gene families that show strong agreement or disagreement with the overall combined phylogenetic hypothesis. We propose that measures of phylogenetic congruence can be used as criteria to identify loci with related GO terms that have a significant impact on cladogenesis.
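
As an illustration of the matrix construction described in the abstract, the following minimal sketch shows how one E-value cutoff could turn best-hit scores into a gene family presence/absence matrix. It is not the authors' pipeline: the function name, the input layout (a best E-value per family and species), the family labels and the example E-values are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical input: best E-value observed for each (gene family, species) pair.
BestHits = Dict[Tuple[str, str], float]


def presence_absence_matrix(
    best_evalues: BestHits,
    families: List[str],
    species: List[str],
    cutoff: float,
) -> Dict[str, List[int]]:
    """Return one 0/1 row per species, with one column per gene family.

    A family is scored present (1) when its best E-value in that species
    is at or below the cutoff, and absent (0) otherwise, including when
    no hit was recorded at all.
    """
    return {
        sp: [
            1 if best_evalues.get((fam, sp), float("inf")) <= cutoff else 0
            for fam in families
        ]
        for sp in species
    }


# Invented example values, for illustration only.
hits: BestHits = {
    ("famA", "D. melanogaster"): 1e-130,
    ("famA", "D. simulans"): 1e-110,
    ("famB", "D. melanogaster"): 1e-50,
    ("famB", "D. simulans"): 1e-120,
}
families = ["famA", "famB"]
species = ["D. melanogaster", "D. simulans"]

# The same hits give different matrices at different cutoffs.
loose = presence_absence_matrix(hits, families, species, cutoff=1e-105)
strict = presence_absence_matrix(hits, families, species, cutoff=1e-125)
```

Comparing `loose` and `strict` shows why matrices built at different cutoffs can behave differently in downstream phylogenetic analyses: stricter cutoffs turn some presences into absences.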

Received: July 15, 2008; Accepted: November 24, 2008
