ParSeq: A Software Tool for Searching Motifs with Structural and Biochemical Properties in Biological Sequences
Project and Members
The following departments of the University of Tübingen are involved in the research project, which is supported by the "Landesschwerpunktprogramm" of Baden-Württemberg, Germany.
Abstract
Searches for variable motifs, such as protein binding sites or promoter regions, are more complex than searches for ordinary motifs. On amino acid sequences, comparing motifs alone is usually insufficient to detect regions that represent proteins with a particular function, because the function depends on biochemical properties of individual amino acids (such as polarity, hydrophobicity, and electric charge). Pure string matching programs cannot find these motifs. Hence, we propose a software tool that combines the search for motifs with certain structural properties, the verification of biochemical properties, and an approximate search mechanism. Because it is very difficult to describe such motifs exactly, the tool supports step-by-step creation of the description by allowing searches on previously obtained results. The description itself is written in a query language based on regular expressions, extended by the possibility of formulating conditions on biochemical properties. To be useful in practice, the response time must be within seconds or, in the worst case, minutes. By intelligently distributing the computation over a number of machines, the response time can be reduced sufficiently.
In this project, parallel sequence analysis algorithms are developed. The algorithms are planned to run as a service on the Kepler Cluster (shown in the photo above), a highly parallel cluster (98 dual Pentium III PC nodes with a Myrinet interconnect) located at the University of Tübingen.
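The idea from the abstract of combining a regular-expression motif with a condition on a biochemical property can be illustrated with a small sketch. This is not ParSeq's query language or implementation, which this page does not show. It uses Python's `re` syntax, and the function names, the choice of the Kyte-Doolittle hydropathy scale, and the threshold are assumptions made purely for illustration. The approximate search mechanism is not covered here.

```python
import re

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Using this particular scale is an assumption of the sketch.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy of a segment made of standard residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def find_motifs(sequence, pattern, min_hydropathy):
    """Hypothetical helper: return (start, text) for every regex match
    whose mean hydropathy is at least `min_hydropathy`.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are also reported.
    """
    hits = []
    for match in re.finditer(f"(?=({pattern}))", sequence):
        segment = match.group(1)
        if mean_hydropathy(segment) >= min_hydropathy:
            hits.append((match.start(), segment))
    return hits


if __name__ == "__main__":
    # Structural part: G, any three residues, then K.
    # Biochemical part: the matched region must be hydrophobic on average.
    print(find_motifs("AGILVKDDGDEEKA", r"G.{3}K", 0.0))
```

In the example sequence the pattern matches twice, but only the first occurrence has a positive mean hydropathy, so a pure string or regex match alone would report a region that the biochemical condition rejects.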
The current version of ParSeq can be used to run searches on your local computer using raw sequence files. Within the next months, we will make it possible to integrate remote computing capacity, such as a parallel computer like the Kepler Cluster or an ordinary workstation pool. The user will be able to start either a local or a remote search session from the same user interface. If you want to try out motif searching with our program, you will find a Java Web Start link at the end of this page.
A screenshot of the GUI for sequence analysis
Bioinformatics Applications Note (Advance Access)
The ParSeq software was published as an Applications Note in Bioinformatics; see the abstract of the note.
Download
On this site you can download or start the first version of ParSeq, which was published in the journal Bioinformatics. If you are interested in using the most recent version, please follow the link to the latest version. Otherwise, if you want to use the version described in the Bioinformatics Applications Note, please use the software provided below.
The software is deployed using Java Web Start technology. Please refer to http://java.sun.com/products/javawebstart/ for more information about Java Web Start.
Sequential version of ParSeq (Java Web Start Application) (requires Java 1.4 or higher)