Protein World Database
Many analyses in modern biological research are based on comparisons between biological sequences, from which functional, evolutionary and structural inferences are drawn. When large numbers of sequences are compared, heuristics are often used, at some cost in accuracy. To improve and validate the results of such comparisons, we performed exhaustive all-against-all comparisons of almost four million protein sequences from the RefSeq database, using an implementation of the Smith-Waterman algorithm.
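For readers unfamiliar with Smith-Waterman, the following is a minimal sketch of the local-alignment recurrence and traceback. It is illustrative only and is not the project's implementation: the function name `smith_waterman` and the `match`, `mismatch` and `gap` values are assumptions chosen for readability, and a linear gap penalty with a simple match/mismatch score stands in for the substitution matrix and gap model that a protein comparison would normally use.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not the project's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 is what makes the alignment local rather than global.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    score, top, bottom = smith_waterman("GGGACGT", "ACGTCCC")
    print(score)
    print(top)
    print(bottom)
```

The table fill takes time proportional to the product of the two sequence lengths, which is why exact all-against-all comparison over millions of sequences is costly and why heuristic tools are commonly used instead.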
The resulting database, ProteinWorldDB, is the first product of the Genome Comparison Project, a research project of Fiocruz and PUC-RJ sponsored by IBM and the World Community Grid. Its features include retrieval of identifiers, annotation, ontology terms and protein domains. Similarity searches using BLAST are also supported, as is retrieval of whole pairwise genome comparisons. Unique genes and protein clusters, which are pre-processed and stored in the database, can also be selected.
ProteinWorldDB offers unlimited access to the scientific community. Our goal is to provide a database in which users can mine accurately calculated comparative data and use this information (similarity scores, statistical estimates, functional annotation, and sequence properties) as a starting point for subsequent analysis.