General FAQ
- Where do I begin if I want to get an Evolutionary Trace Analysis of a protein structure?
- How do I obtain an Evolutionary Trace analysis of a protein sequence that may not have a crystal (PDB) structure?
- I want to start an Evolutionary Trace analysis using my own multiple sequence alignment. Should the query sequence (the sequence onto which the ET ranks are to be mapped) be included in the alignment?
UET FAQ
(We thank the anonymous reviewers for providing or motivating these questions.)
- How do I highlight a range of residues at once in the sequence window? Highlighting positions in the sequence window turns on the space representation of highlighted residues, but it only seems to work for single residues, not for ranges.
- There are no easy ways to communicate with the JSmol window, for instance clicking in the structure window doesn't highlight a position in the sequence window, so one has to search by residue numbers.
- If I submit a PDB coordinates file containing multiple chains, which chain will get traced?
- In my search UET is not working well on the short peptide chains (<30aa).
- Issues detected when I change the database to Swissprot from the default options in the Advanced Options section. Sometimes my searches got terminated, is there a bug?
- Providing the input by PDB id and its chain (input by 1st option) yields different ranking in comparison to uploading the coordinates of the same PDB file and chain by the second option.
- It will be nice if the authors provide few details on the database such as which protein databank version has been used for precomputed analysis and how frequently it gets updated
General FAQ
- Where do I begin if I want to get an Evolutionary Trace Analysis of a protein structure?
Visit http://mammoth.bcm.tmc.edu/uet. There you can access pre-computed ET results for structures from the Protein Data Bank, and also run traces on custom structures.
The PyETV plugin for the PyMOL molecular visualization platform can also access the pre-computed ET results and directly display them in PyMOL.
ET results are also available at http://mammoth.bcm.tmc.edu/ETserver.html.
Videos demonstrating how to use these tools are available on our YouTube channel.
- How do I obtain an Evolutionary Trace analysis of a protein sequence that may not have a crystal (PDB) structure?
If you know the UniProt ID or SwissProt accession number of your sequence, you may enter that ID into the ET Report Maker to get a human-readable document in PDF format, supplemented by the original data needed to reproduce the results quoted in the report.
Alternatively, given the UniProt accession number or the amino acid sequence in FASTA format, you can submit an Evolutionary Trace analysis request through the web service: http://mammoth.bcm.tmc.edu/uet. You can either provide your own multiple sequence alignment and modify the trace parameters through the service, or simply accept the default settings.
Visit our YouTube channel for a demonstration and to learn more.
- I want to start an Evolutionary Trace analysis using my own multiple sequence alignment. Should the query sequence (the sequence onto which the ET ranks are to be mapped) be included in the alignment?
The query sequence must be included in the multiple sequence alignment. For example, if you provide a PDB structure, the sequence in the structure must match one of the sequences in the alignment. If you are using UET, the name of the query sequence can be specified in the Advanced Options section, near the upload box for the multiple sequence alignment file.
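Before uploading, it can help to check locally that the query sequence really is present in the alignment. The following is a minimal sketch, assuming the alignment is in FASTA format with "-" or "." as gap characters. The function names are hypothetical and this is not the check that UET itself performs.

```python
def read_fasta_alignment(text):
    """Parse FASTA-formatted alignment text into {name: aligned_sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def ungap(seq):
    """Remove gap characters (assumed here to be '-' and '.')."""
    return seq.replace("-", "").replace(".", "").upper()


def find_query(alignment, query_sequence):
    """Return names of aligned sequences whose ungapped form equals the query."""
    target = query_sequence.upper()
    return [name for name, seq in alignment.items() if ungap(seq) == target]


# Toy alignment, invented for illustration only.
example = """>query
MK-TAYIA
>homolog1
MKQTA-IA
"""

alignment = read_fasta_alignment(example)
matches = find_query(alignment, "MKTAYIA")
print(matches)  # an empty list would mean the query is missing from the alignment
```

If the list comes back empty, the sequence taken from the structure does not exactly match any sequence in the alignment, which is the situation the answer above warns against.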
UET FAQ
- How do I highlight a range of residues at once in the sequence window? Highlighting positions in the sequence window turns on the space representation of highlighted residues, but it only seems to work for single residues, not for ranges.
One can highlight multiple residues from the sequence window, just not by mouse-dragging and highlighting, which acts as a literal text highlighting on the HTML page.
To allow the user to explicitly specify and select a range of residue numbers, the sequence window includes boxes where starting and ending residue numbers can be entered. Clicking the adjacent button will toggle the selection of the range of residue numbers from start to end. The "Clear all" button clears the selections on both the sequence view and structure view (i.e. removes the spacefill display).
- There are no easy ways to communicate with the JSmol window, for instance clicking in the structure window doesn't highlight a position in the sequence window, so one has to search by residue numbers.
Residues on the structure window can be selected using mouse clicks. This capability can be turned on by opening the JSmol menu (by a right-mouse-click on the structure window, or 2-finger click on a Mac track-pad) and selecting "Set picking" → "Select group". However, there is no way to instantaneously communicate this action from within JSmol to the outside, e.g. the webpage where the structure window lives. This is currently a technical limitation.
To highlight the target residue's position in the sequence window, the user can hover the mouse pointer over the residue of interest in the structure window, read off the residue number, and then enter this residue number into the selection box in the sequence window.
A mouse-click over a residue in the structure window will now select the corresponding one-letter code in the sequence window, as well as display the residue in spacefill mode.
- If I submit a PDB coordinates file containing multiple chains, which chain will get traced?
The custom coordinates file in PDB format should contain only one chain. If the file contains multiple chains, only the first chain is extracted and traced.
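To see in advance which chain would be picked, or to prepare a single-chain file, the ATOM records can be filtered locally. The following is a minimal sketch, assuming "first chain" means the chain ID of the first ATOM record (column 22 of the PDB format). The helper names are hypothetical, and the server's own extraction may differ, for example in how it treats HETATM records or multiple models.

```python
def first_chain_lines(pdb_text):
    """Return (chain_id, atom_lines) for the first chain found in PDB text.

    Only ATOM records are considered; the chain ID is read from
    column 22 (index 21) as defined by the PDB file format.
    """
    first_chain = None
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 21:
            chain_id = line[21]
            if first_chain is None:
                first_chain = chain_id
            if chain_id == first_chain:
                kept.append(line)
    return first_chain, kept


def atom_line(serial, atom, res, chain, resseq):
    """Build a truncated ATOM record (no coordinates) for the demo below."""
    return f"ATOM  {serial:>5} {atom:<4} {res:>3} {chain}{resseq:>4}"


# Toy two-chain input, invented for illustration only.
example = "\n".join([
    atom_line(1, "N", "MET", "A", 1),
    atom_line(2, "CA", "MET", "A", 1),
    atom_line(3, "N", "GLY", "B", 1),
])

chain, lines = first_chain_lines(example)
print(chain, len(lines))
```

Writing the kept lines to a new file gives a single-chain input, so there is no ambiguity about which chain gets traced.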
- In my search UET is not working well on the short peptide chains (<30aa).
The pipeline that produces the pre-computed trace analyses excludes chains shorter than 15 aa, which is one reason why few pre-computed trace results can be found for short peptide chains. Tracing short peptide chains requires special care and non-default options, for example a different e-value threshold in the Advanced Options.
- Issues detected when I change the database to Swissprot from the default options in the Advanced Options section. Sometimes my searches got terminated, is there a bug?
The Swissprot database option ("Custom SwissProt reviewed") often returns very few sequence matches, or no hits, due to the small size of this database, which is more than 23 times smaller than the default "Custom UniRef90" database. We have revised UET to notify the user with a more informative message when the Swissprot option is unable to build a useful set of homologs. We will still make the Swissprot option available, since some users might be interested in building an alignment out of reviewed protein sequences.
- Providing the input by PDB id and its chain (input by 1st option) yields different ranking in comparison to uploading the coordinates of the same PDB file and chain by the second option.
The two traces were computed using different trace parameters. The sequence identity thresholds used to select homologs from the BLAST search results might have been different, i.e. the values of "Restrict BLAST hits to minimum sequence identity:" and "Restrict BLAST hits to maximum sequence identity:". For example, the pre-computed traces could have used 20% and 95%, while the de novo traces could have used 28% and 98%. The databases could have been different as well: one trace could have used "Custom NCBI", while the other used "Custom UniRef90". The parameter settings can be found in the "log" file included in the zip file of trace results.
We are currently updating and retracing all chains in the database to use the same BLAST sequence database (e.g. UniRef90).
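To illustrate why different identity thresholds change the result, the sketch below applies the two example threshold pairs mentioned above (20%/95% and 28%/98%) to the same set of hits. The hit names and identity values are invented, the `Hit` class and `select_homologs` function are hypothetical, and treating the bounds as inclusive is an assumption rather than documented UET behavior.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    identity: float  # percent sequence identity to the query


def select_homologs(hits, min_identity, max_identity):
    """Keep hits whose identity lies within [min_identity, max_identity]."""
    return [h.name for h in hits if min_identity <= h.identity <= max_identity]


# Invented BLAST-like hits, for illustration only.
hits = [
    Hit("seqA", 18.0),
    Hit("seqB", 25.0),
    Hit("seqC", 60.0),
    Hit("seqD", 96.5),
    Hit("seqE", 99.0),
]

precomputed = select_homologs(hits, 20, 95)
de_novo = select_homologs(hits, 28, 98)
print(precomputed)
print(de_novo)
```

The two selections share only part of their members, so the resulting alignments, and therefore the ET rankings, can differ even for the same structure and chain.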
- It will be nice if the authors provide few details on the database such as which protein databank version has been used for precomputed analysis and how frequently it gets updated
Once a week, the PDB files are updated through rsync, and new PDB structures (those of type "prot" or "prot-nuc") are subjected to evolutionary trace analysis. Once a month, the sequence databases are updated (Custom NCBI/NR, UniProt/UniRef). The total number of available trace results in the database and the time stamp are now displayed on the input page for PDB IDs.
Currently, trace results for 139958 PDB chains are available. This information was last updated on Nov 19 18:33.