TY - JOUR
T1 - A computational tool for the genomic identification of regions of unusual compositional properties and its utilization in the detection of horizontally transferred sequences
AU - Putonti, Catherine
AU - Luo, Yi
AU - Katili, Charles
AU - Chumakov, Sergey
AU - Fox, George E.
AU - Graur, Dan
AU - Fofanov, Yuriy
PY - 2006/10
Y1 - 2006/10
N2 - Similarity Plot (S-plot) is a Windows-based application for large-scale comparisons and 2-dimensional visualization of compositional similarities between genomic sequences. This application combines 2 approaches widely used in genomics: window analysis of statistical characteristics along genomes and dot-plot visual representation. S-plot is effective in identifying highly similar regions between genomes as well as regions with unusual compositional properties (RUCPs) within a single genome, which may be indicative of horizontal gene transfer or of locus-specific selective forces. We use S-plot to identify regions that may have originated through horizontal gene transfer through a 2-step approach, by first comparing a genomic sequence to itself and, subsequently, comparing it to the genomic sequence of a closely related taxon. Moreover, by comparing these suspect sequences to one another, we can estimate a minimum number of sources for these putative xenologous sequences. We illustrate the uses of S-plot in a comparison involving Escherichia coli K12 and E. coli O157:H7. In O157:H7, we found 145 regions that have most probably originated through horizontal gene transfer. By using S-plot to compare each of these regions with 277 completely sequenced prokaryotic genomes, 1 sequence was found to have similar compositional properties to the Yersinia pseudotuberculosis genome, indicating a transfer from a Yersinia or Yersinia relative. Based upon our analysis of RUCPs in O157:H7, we infer that there were at least 53 sources of horizontally transferred sequences.
AB - Similarity Plot (S-plot) is a Windows-based application for large-scale comparisons and 2-dimensional visualization of compositional similarities between genomic sequences. This application combines 2 approaches widely used in genomics: window analysis of statistical characteristics along genomes and dot-plot visual representation. S-plot is effective in identifying highly similar regions between genomes as well as regions with unusual compositional properties (RUCPs) within a single genome, which may be indicative of horizontal gene transfer or of locus-specific selective forces. We use S-plot to identify regions that may have originated through horizontal gene transfer through a 2-step approach, by first comparing a genomic sequence to itself and, subsequently, comparing it to the genomic sequence of a closely related taxon. Moreover, by comparing these suspect sequences to one another, we can estimate a minimum number of sources for these putative xenologous sequences. We illustrate the uses of S-plot in a comparison involving Escherichia coli K12 and E. coli O157:H7. In O157:H7, we found 145 regions that have most probably originated through horizontal gene transfer. By using S-plot to compare each of these regions with 277 completely sequenced prokaryotic genomes, 1 sequence was found to have similar compositional properties to the Yersinia pseudotuberculosis genome, indicating a transfer from a Yersinia or Yersinia relative. Based upon our analysis of RUCPs in O157:H7, we infer that there were at least 53 sources of horizontally transferred sequences.
KW - Escherichia coli K12
KW - Escherichia coli O157:H7
KW - Horizontal (lateral) gene transfer
KW - Sequence composition
UR - http://www.scopus.com/inward/record.url?scp=33748778231&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33748778231&partnerID=8YFLogxK
U2 - 10.1093/molbev/msl053
DO - 10.1093/molbev/msl053
M3 - Article
C2 - 16829541
AN - SCOPUS:33748778231
SN - 0737-4038
VL - 23
SP - 1863
EP - 1868
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 10
ER -