Robetta is a protein structure prediction server developed by the Baker lab at the University of Washington. At its core is the Rosetta macromolecular modeling suite developed by the Rosetta Commons, a multi-institutional collaborative research and software development group. Robetta's primary service is to predict the three-dimensional structure of a protein given its amino acid sequence.
Five options are provided for structure prediction: (1) a deep learning-based method, RoseTTAFold; (2) a deep learning-based method, TrRosetta; (3) Rosetta Comparative Modeling (RosettaCM); (4) Rosetta Ab Initio (RosettaAB); and (5) a fully automated pipeline that first predicts domains as independent folding units, models each unit with option (3) or (4), and then assembles them into full-chain models.
For the RosettaCM protocol, four independent methods are used to detect templates and generate sequence alignments:
Morten Källberg, Haipeng Wang, Sheng Wang, Jian Peng, Zhiyong Wang, Hui Lu & Jinbo Xu. Template-based protein structure modeling using the RaptorX web server. Nature Protocols 7, 1511-1522, 2012.
Söding J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951-960. doi:10.1093/bioinformatics/bti125.
Yuedong Yang, Eshel Faraggi, Huiying Zhao, Yaoqi Zhou. Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of the query and corresponding native properties of templates. Bioinformatics 27:2076-82(2011)
S Ovchinnikov et al. Protein Structure Determination using Metagenome sequence data. (2017) Science. 355(6322):294–8.
Users have the option to forgo these methods by providing their own template(s) and alignment(s), and to optionally modify the alignment(s), add custom constraints, and more through an interactive interface on the submit page.
Computing resources for this service are provided by the Baker lab through a local cluster, by HHMI's Janelia Research Campus through a satellite GPU cluster, and by volunteers from the distributed computing project Rosetta@home.
Robetta is continually evaluated through CAMEO (Continuous Automated Model EvaluatiOn), which evaluates up to 20 pre-released Protein Data Bank (PDB) targets each week. Over a one-year period covering more than 900 CAMEO pre-released PDB targets, Robetta averaged an lDDT (Local Distance Difference Test) score of around 69. lDDT is a superposition-independent score that evaluates interatomic distances, with values ranging from 0 (bad) to 100 (good).
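To make the idea of a superposition-independent distance score concrete, here is a minimal sketch of a simplified, Cα-only lDDT. It is an illustration under stated assumptions, not the code CAMEO or Robetta runs: the full lDDT considers all atoms, while this version uses only Cα coordinates, the commonly cited 15 Å inclusion radius, and the 0.5, 1, 2, and 4 Å tolerance thresholds. The function name `ca_lddt` is hypothetical.

```python
import math


def ca_lddt(model, reference, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified C-alpha-only lDDT on a 0-100 scale.

    model, reference: lists of (x, y, z) tuples in the same residue order.
    Assumption: only C-alpha atoms are scored; the full lDDT uses all atoms.
    """
    if len(model) != len(reference):
        raise ValueError("model and reference must have the same length")

    preserved = 0  # (pair, threshold) checks that pass
    total = 0      # all (pair, threshold) checks performed
    n = len(reference)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= radius:
                continue  # only distances inside the inclusion radius count
            d_model = math.dist(model[i], model[j])
            diff = abs(d_ref - d_model)
            for t in thresholds:
                total += 1
                if diff < t:
                    preserved += 1
    # Only internal distances are compared, so no superposition is needed.
    return 100.0 * preserved / total if total else 0.0
```

Because the score compares distances within each structure rather than atom positions across structures, translating or rotating the model leaves the result unchanged.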
Robetta's accuracy mainly depends on whether similar sequences (homologs) exist in the available sequence databases (UniProt and Uniclust) and the PDB. For comparative modeling domains, a predicted confidence value is provided that takes this into account and was found to correlate with the actual GDT to native; it is described in the RosettaCM publication's Supplementary Information. For ab initio domains, the predicted confidence value corresponds to the average pairwise TM-score of the top 10 Rosetta scoring models and is described in the Supplementary Materials for this publication.
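The ab initio confidence is an average over model pairs, which can be sketched as follows. This is a minimal illustration, not Robetta's implementation: `mean_pairwise_score` is a hypothetical helper, and `score_fn(a, b)` is an assumed callable that returns the TM-score between two models. Computing the TM-score itself requires a structural alignment tool and is not shown.

```python
from itertools import combinations


def mean_pairwise_score(models, score_fn):
    """Average score_fn(a, b) over every unordered pair of models.

    models: any sequence of model objects (for example, the top 10 by score).
    score_fn: assumed to be symmetric, which holds for the TM-score when all
    models share the same sequence length.
    """
    pairs = list(combinations(models, 2))
    if not pairs:
        raise ValueError("need at least two models")
    return sum(score_fn(a, b) for a, b in pairs) / len(pairs)
```

With 10 models this averages 45 pairwise comparisons; a value near 1 indicates that the top models converge on the same fold.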
Per-residue local error estimates are provided in a plot and in the B-factor column of the model coordinates, as predicted distances (in Å) between the Cα positions of the model and those of the native structure. You may color models by the error estimate and download coordinates filtered by estimated error, with cutoffs ranging from less than 1 Å to less than 5 Å. The error estimates are continually evaluated through CAMEO and typically average a model confidence score of 0.85 and a model confidence lDDT score of 0.82. The latter is superposition-independent. The error estimates are based on structural variation within model clusters and are therefore not calculated if only 1 model is selected for sampling.
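Since the error estimates are stored in the B-factor column, they can be read back from a downloaded model with a few lines of Python. The sketch below assumes standard fixed-column PDB ATOM records (B-factor in columns 61-66); the function names are hypothetical and not part of any Robetta tool.

```python
def read_ca_error_estimates(pdb_text):
    """Return {(chain_id, residue_number): estimated_error_in_angstroms}.

    Reads the B-factor field of C-alpha ATOM records, assuming the standard
    fixed-column PDB layout. HETATM records are ignored.
    """
    errors = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]
            residue_number = int(line[22:26])
            errors[(chain_id, residue_number)] = float(line[60:66])
    return errors


def residues_within(errors, max_error):
    """Return the sorted residue keys whose estimated error is <= max_error."""
    return sorted(key for key, err in errors.items() if err <= max_error)
```

For example, `residues_within(read_ca_error_estimates(text), 2.0)` lists the residues predicted to lie within 2 Å of the native Cα positions.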
A job typically takes one to a few days to complete, but many factors may affect the run time, such as the length of your sequence, the number of predicted domains, and the number of jobs already queued and active. If you use TrRosetta or provide your own template(s) and alignment(s) for comparative modeling, it may take less than one day. Very large, multi-domain predictions may take over a week and require manual intervention as described below.
To help prevent exceptionally long queues from occurring, users are only allowed to model one domain at a time. If you choose to "Predict domains", you must manually select the domain you want to model after the domain boundaries have been predicted.
An example is available to view here.
Results become available throughout the modeling pipeline and can be accessed from the My queue page by clicking on the Job ID of interest. For each completed domain, a gzipped tar file of the raw results and inputs, which may include PDB templates, alignments, fragment files, constraints, commands, models, and more, is available from the "Download Results Archive" link at the top of the domain results page.
My queue -> Job id -> Domain id -> Download Results Archive
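Once downloaded, the gzipped tar archive can be inspected and unpacked with Python's standard `tarfile` module. The sketch below is a generic example: the helper names and any file paths are hypothetical, and the layout inside the archive is not assumed.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return the member names of a gzipped tar archive without extracting it."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


def extract_archive(archive_path, dest_dir):
    """Extract a gzipped tar archive into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest, **kwargs)
    return dest
```

Listing the members first is a quick way to see which of the optional inputs (templates, alignments, fragment files, constraints) a particular job produced.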
Additionally, models are emailed to your registered email address when they become available.
For rare cases when there is an unrecoverable error, usually due to corrupted user input data, jobs will be marked with an Error status. If you would like details about the error or think that it may be due to a bug in our pipeline, please contact us.
Yes, Robetta provides an interactive interface on the submit page where you can upload your own PDB template coordinates and an optional alignment. The alignment should be global, in FASTA format, and placed before the PDB template coordinates as shown in this example. You can load multiple templates, modify the alignments, add constraints, and more before submitting.
Yes, Robetta models homo-oligomeric complexes if it detects symmetry from a template's biological unit when using the comparative modeling (CM) method. More information is provided in this publication.
Robetta can also model hetero-oligomeric complexes, but only with the RoseTTAFold and Comparative Modeling (CM) options; docking and ab initio methods are not currently used. To submit a complex, place a forward slash "/" between chain sequences when providing your sequence on the submit page.
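As a small illustration of the slash-separated input, the sketch below joins chain sequences into a single string. The helper `join_chains` is hypothetical, and restricting input to the 20 standard one-letter amino acid codes is an assumption made for this example; the server's own validation rules may differ.

```python
# Assumption: only the 20 standard amino acid one-letter codes are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def join_chains(chains):
    """Join two or more chain sequences with "/" for a complex submission."""
    cleaned = []
    for index, seq in enumerate(chains, start=1):
        seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
        if not seq:
            raise ValueError(f"chain {index} is empty")
        bad = sorted(set(seq) - VALID_RESIDUES)
        if bad:
            raise ValueError(f"chain {index} has unexpected characters: {bad}")
        cleaned.append(seq)
    if len(cleaned) < 2:
        raise ValueError("a complex needs at least two chains")
    return "/".join(cleaned)
```

For instance, `join_chains(["MKTAY", "GSHM"])` yields `MKTAY/GSHM`, which is the form expected in the sequence field.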
For RoseTTAFold, a multiple sequence alignment of chain sequences paired for optimal co-evolution signal must be provided in A3M format. Alignments should NOT include a forward slash between chain sequences. For more information, please visit https://github.com/RosettaCommons/RoseTTAFold/tree/main/example/complex_modeling.
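A paired alignment can be sanity-checked before upload. The sketch below is a hedged example and not an official validator. It assumes the usual A3M conventions (lowercase letters are insertions and do not occupy alignment columns) and assumes each row is the chains concatenated end to end, so that its aligned length equals the sum of the chain lengths. The function name `check_paired_a3m` is hypothetical.

```python
def check_paired_a3m(a3m_text, chain_lengths):
    """Return a list of problems found in a paired A3M alignment.

    chain_lengths: lengths of the individual chains, in order.
    Assumption: every row should have sum(chain_lengths) aligned columns
    (uppercase letters or "-") and contain no "/" separator.
    """
    expected = sum(chain_lengths)
    problems = []
    rows = []  # each entry is [header, sequence]
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith(">"):
            rows.append([line[1:], ""])
        elif rows:
            rows[-1][1] += line
        else:
            problems.append("sequence data found before the first header")
    for name, seq in rows:
        if "/" in seq:
            problems.append(f"{name}: contains '/' between chains")
        # Lowercase insertions and "." do not count as alignment columns.
        aligned = sum(1 for c in seq if not c.islower() and c != ".")
        if aligned != expected:
            problems.append(f"{name}: {aligned} aligned columns, expected {expected}")
    return problems
```

An empty list means no problems were detected by these two checks; it does not confirm that the sequences are paired for optimal co-evolution signal.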
For CM, template(s) and alignment(s) may be provided by the user, and custom inter-chain constraints may be applied. As with the submitted sequence, CM alignments should include a forward slash "/" between chain sequences. Please note that comparative modeling of hetero-oligomeric complexes has not been thoroughly tested or benchmarked.
Currently, Robetta cannot model protein-ligand complexes. However, protein-ligand complexes can be loaded as templates on the submit page, and custom constraints can be applied to the positions that interact with the ligand to constrain the binding site. Ligands are omitted upon submission.
RoseTTAFold
TrRosetta and DeepAccNet
Rosetta Comparative Modeling and Ab Initio Modeling
Alignment Methods (Generously provided and supported by the following groups)
Morten Källberg, Haipeng Wang, Sheng Wang, Jian Peng, Zhiyong Wang, Hui Lu & Jinbo Xu. Template-based protein structure modeling using the RaptorX web server. Nature Protocols 7, 1511-1522, 2012.
Söding J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951-960. doi:10.1093/bioinformatics/bti125.
Yuedong Yang, Eshel Faraggi, Huiying Zhao, Yaoqi Zhou. Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of the query and corresponding native properties of templates. Bioinformatics 27:2076-82(2011)
S Ovchinnikov et al. Protein Structure Determination using Metagenome sequence data. (2017) Science. 355(6322):294–8.
©2024 University of Washington