Finding approximate gene clusters with Gecko 3
Winter, Sascha ;  Jahn, Katharina ;  Wehner, Stefanie ;  Kuchenbecker, Leon ;  Marz, Manja ;  Stoye, Jens ;  Boecker, Sebastian

Page count: 11 pp.
Keywords: genes; genome; gene cluster; gecko; datasets
DDC: 610 Medicine and health
Also published in: Nucleic Acids Res. - 44 (2016), 20, Article No. 9600-9610
Abstract: Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to a few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters by reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min.
Department/Institution: Medizinische Fakultät Charité - Universitätsmedizin Berlin
Document type/collections: Scientific article
Rights: Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
