Nucleic Acids Research Advance Access originally published online on July 9, 2009
Nucleic Acids Research 2009 37(16):5255-5266; doi:10.1093/nar/gkp576
Nucleic Acids Research, 2009, Vol. 37, No. 16 5255-5266
© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
Detection of genomic islands via segmental genome heterogeneity
1Department of Computer Science, University of California San Diego, 9500 Gilman Drive, Mail Code 0404, La Jolla, CA 92093, 2Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, 3Keck Graduate Institute of Applied Life Sciences, 535 Watson Drive, Claremont and 4School of Mathematical Sciences, Claremont Graduate University, 711 North College Avenue, Claremont, CA 91711, USA
*To whom correspondence should be addressed. Tel: +1 412 624 4204; Fax: +1 412 624 4759; Email: jlawrenc{at}pitt.edu
Received April 30, 2009. Revised June 19, 2009. Accepted June 22, 2009.
While the recognition of genomic islands can be a powerful means of identifying genes that distinguish related bacteria, few methods have been developed to identify islands specifically. Rather, identification of islands often begins with cataloging individual genes likely to have been recently introduced into the genome; regions with many putative alien genes are then examined for other features suggestive of the recent acquisition of a large genomic region. When few phylogenetic relatives are available, the identification of alien genes relies on their atypical features relative to the bulk of the genes in the genome. The weakness of these bottom–up approaches lies in the difficulty of robustly identifying those genes that are atypical, or phylogenetically restricted, because of recent foreign ancestry. Herein, we apply an alternative top–down approach in which bacterial genomes are recursively divided into progressively smaller regions, each with uniform composition. In this way, large chromosomal regions with atypical features are identified with high confidence because multiple genes are analyzed simultaneously. This approach uses a generalized divergence measure to quantify the compositional difference between segments within a hypothesis-testing framework. We tested the proposed genomic island prediction algorithm on both artificial chimeric genomes and genuine bacterial genomes.
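The top–down idea can be illustrated with a minimal sketch. It is not the algorithm evaluated in this article: it swaps in the standard Jensen-Shannon divergence on single-nucleotide frequencies for the generalized divergence measure, and a fixed cutoff for the hypothesis test. The names `best_split`, `segment`, `min_len` and `threshold` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (bits) of a count table whose values sum to n."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def best_split(seq, min_len):
    """Return (position, divergence) for the split point that maximizes the
    length-weighted Jensen-Shannon divergence between the two sides.
    Position is None if no admissible split has positive divergence."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue  # keep both sides long enough to estimate composition
        right = total - left
        d = (h_total
             - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, start=0, min_len=50, threshold=0.05):
    """Recursively split seq into compositionally uniform segments.
    Returns half-open (start, end) coordinates. The fixed threshold is an
    assumption standing in for a proper significance test."""
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return [(start, start + len(seq))]
    return (segment(seq[:pos], start, min_len, threshold)
            + segment(seq[pos:], start + pos, min_len, threshold))


if __name__ == "__main__":
    # Toy chimeric sequence: an AT-only region joined to a GC-only region.
    chimera = "AT" * 150 + "GC" * 150
    print(segment(chimera))
```

On the toy chimera the sketch places a single boundary at the junction between the two regions. Real genomes would call for richer composition features (for example, oligonucleotide or codon usage) and a statistically justified stopping rule rather than a fixed cutoff.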