Difference between revisions of "FAQ"

From Phaserwiki
(Add suggestion to cut final refinement resolution to reduce memory usage if necessary.)
(Mention st9bad_alloc so that people with out of memory problems will find this web page.)

Revision as of 10:54, 4 December 2013

If I manage to solve my structure using Phaser, what paper should I cite?
J. Appl. Cryst. (2007). 40, 658-674. Phaser crystallographic software. A. J. McCoy, R. W. Grosse-Kunstleve, P. D. Adams, M. D. Winn, L. C. Storoni and R. J. Read.
What can I do if Phaser fails to solve my structure with default parameters?
Try some of the suggestions for difficult cases.
Where do I report bugs and make suggestions?
If you come across a bug, check the known bug list and if the problem is new please report it to us.
We also welcome all suggestions and details of problems and issues with difficult cases.
What does LLG stand for?
LLG stands for Log Likelihood Gain. The likelihood is the probability that the data would have been measured, given the model, so it allows us to compare how well different models agree with the data. (In the case of molecular replacement, the model consists of the atomic coordinates plus rotation and/or translation operators applied to those coordinates.) The LLG is the difference between the likelihood of the model and the likelihood calculated from a Wilson distribution, so it measures how much better the data can be predicted with your model than with a random distribution of the same atoms. The LLG allows us to compare different models against the same data set, but the LLG values for different data sets should not be compared with each other. See our publications for more details.
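As a compact restatement of the definition above, the LLG can be written as a difference of log-likelihoods. The symbols used here (L for the likelihood, "model" and "Wilson" for the two hypotheses) are illustrative notation chosen for this sketch, not notation taken from the Phaser publications:

```latex
\mathrm{LLG} = \ln L(\text{data} \mid \text{model}) - \ln L(\text{data} \mid \text{Wilson})
```

A positive value means the model predicts the data better than a random distribution of the same atoms. A negative value means it predicts them worse.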
What does it mean if the best LLG is negative, or if it drops when I add a new component?
This means that your model is worse than a collection of random atoms! The LLG should always be positive, and it should increase as the solution progresses. If the LLG is negative (or if it decreases, say, when carrying out a translation search after a rotation search), that tells you that you are being too optimistic about how well your model can predict the data. Check whether you might have underestimated the content of the asymmetric unit, so that your model is less complete than you thought it was. If you're certain about the content, then your molecular replacement model is not as accurate as Phaser was assuming. Look at the model (or better yet a series of related models) to see whether there might be any domain movements. If there are, either search with separate domains or with a series of models with different hinge angles. If hinge movements don't explain the problem, try increasing the RMS error that Phaser has estimated from the sequence identity. If none of this works, think about other possibilities: do you have the wrong space group? Have you crystallised the wrong protein?
In recent versions of Phaser, the estimated RMSD (VRMS) for each component is refined, so the final LLG should rarely be negative. However, you may still see negative LLG values (or significant drops from the partial solution lacking the component being placed) immediately after translation searches before the refinement step.
What is the VRMS?
The refined VRMS (variance-RMS) values for the models in an ensemble are the RMS values that give the optimal LLG. The RMS affects the LLG value through its contribution to the variance, hence the term VRMS. The VRMS is not necessarily the same as the RMS deviation between the coordinates of the model and the target, as calculated from the structures once the target structure has been solved. The VRMS value is initially estimated from the sequence identity but can change significantly during refinement, by as much as 1 Å.
How can I use electron density as a model?
The procedure to prepare structure factors representing a masked region of electron density is discussed here. You will need to remember both the extent of the mask and the location of its centre to supply as input to Phaser.
My input space group, as specified in the MTZ file, is X. The space group reported by Phaser is Y.
The important thing for Phaser is the set of symmetry operators in the MTZ header, which defines the space group uniquely, not the name of the space group recorded in the header, which can be ambiguous if people use non-standard settings. Phaser uses Ralf Grosse-Kunstleve's cctbx library to look up the space group from the symmetry operations, and then reports the space group name from cctbx.
I have installed a recent version of Phaser with Phenix. How do I make CCP4i use the Phenix version of Phaser?
Phenix wraps phaser as phenix.phaser.
By default, CCP4i uses the program name as a command, and leaves it to the $PATH variable (at least in Unix) to find the executable.
To change the version of Phaser called by CCP4i from the default CCP4 executable (phaser) to the Phenix executable (phenix.phaser):
  1. Start ccp4i.
  2. Choose "System Administration"->"Configure Interface".
  3. At the bottom of the section labelled "External Programs", under "Give full path name for CCP4 programs to overcome name conflicts" choose "Add a program".
  4. Enter "phaser" in the left box (for the program name) and the name you want ccp4i to use (phenix.phaser) in the right box.
  5. Choose "Save"->"Save to installation file" if you have permission to change that file, or "Save"->"Save to user's home directory" if you do not.
Try to remember to undo this when you install a new version of CCP4!
I am using Phaser to solve the structure of a nucleic acid, but it fails to detect serious clashes in the packing. What is going wrong?
The most likely problem is that Phaser is not recognising the residues as belonging to a nucleic acid. The residue name should be right-justified in the residue name field of ATOM records in a PDB file. So whereas 3-letter residue names for amino acids are in columns 18-20, 1-letter residue names for nucleic acids should be in column 20. If they are in column 18 or 19, they will not be recognised, and the trace atoms to detect clashes will not be picked up. A second possibility is that the atom names differ from those expected for the trace atoms.
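The column rule above can be checked and repaired with a few lines of Python. This is a minimal sketch, assuming fixed-width PDB ATOM/HETATM records; the function name right_justify_resname is made up for illustration and is not part of Phaser:

```python
def right_justify_resname(line):
    """Right-justify the residue name in columns 18-20 of an ATOM/HETATM record.

    Other record types, and lines too short to hold a residue name,
    are returned unchanged.
    """
    if not line.startswith(("ATOM", "HETATM")) or len(line) < 20:
        return line
    # PDB columns 18-20 (1-based) are string indices 17, 18 and 19.
    resname = line[17:20].strip()
    return line[:17] + resname.rjust(3) + line[20:]
```

Applying this to every line of a PDB file moves a 1-letter nucleic acid residue name from column 18 or 19 into column 20 and leaves 3-letter amino acid names as they are.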
The recent remediation of the PDB files to version 3 has created a problem for older versions of Phaser. Versions of Phaser up to 2.1.4 expect the pre-remediation residue names (e.g. A for the adenine residue in RNA, +A for modified versions such as deoxy in DNA). In the version 3 remediation of the PDB, a "D" was added to the standard names for the deoxy residues, so a new PDB file would now have DA for the adenine residue of DNA. To work around this, you can rename the residues to use the old convention (e.g. A or +A instead of DA) and Phaser will find the trace atoms for the clash detection. Newer versions of Phaser will recognise the new nomenclature.
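The renaming workaround for older versions can be scripted. The sketch below assumes that each version 3 deoxy name maps to the plain 1-letter name (DA to A, and likewise for DC, DG and DT); if your version of Phaser expects the +A style instead, pass a different mapping. The names OLD_NAMES and use_old_nucleotide_names are hypothetical:

```python
# Assumed mapping from PDB version 3 deoxy residue names to
# pre-remediation names; adjust it if the "+A" style is needed.
OLD_NAMES = {"DA": "A", "DC": "C", "DG": "G", "DT": "T"}


def use_old_nucleotide_names(line, mapping=OLD_NAMES):
    """Rewrite the residue name (columns 18-20) of an ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")) or len(line) < 20:
        return line
    resname = line[17:20].strip()  # columns 18-20 are indices 17..19
    if resname not in mapping:
        return line  # leave all other residues untouched
    # Keep the new name right-justified so it ends in column 20.
    return line[:17] + mapping[resname].rjust(3) + line[20:]
```

Only residues listed in the mapping are changed, so protein chains in the same file are left alone.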
Phaser runs out of memory (e.g. St9bad_alloc error). What can I do?
  1. Large problems consume a lot of memory, with some of our test cases working best if you have several gigabytes available.
  2. Phaser interpolates structure factors from a finely-sampled molecular transform. The sampling can be reduced by lowering the BOXSCALE parameter from its default of 4, but it should be kept greater than 2.4.
  3. If it fails during the fast rotation function, the requirements there can be reduced by changing CLMN SPHERE from its default of twice the geometric mean radius of the model to something less.
  4. If it fails during the final refinement at the full resolution of the data set, you can limit the resolution of data used in this step with the RESOLUTION AUTO HIGH command.
  5. You might also want to check that the model contains only the atoms you want, and, if there are several models in an ensemble, that the models are properly superimposed.
  6. Phaser uses memory to keep track of potential partial solutions in the search tree, so you should not allow the search to become too large. In particular, we do not recommend increasing the allowed number of clashes much beyond the default. Clashes arise in correct solutions only because of unconserved surface loops, so it is much better to trim such loops out of the model. Similarly, it is not a good idea to relax the criteria for saving potential solutions too far from the defaults. For instance, for a difficult problem you may wish to keep rotations above 65% of the maximum (instead of the default of 75%), but reducing the threshold to 50% will increase the size of the search dramatically without giving much improvement in the chance of finding a solution.
  7. In the past we recommended cutting the resolution during the search, but recent versions of Phaser automatically make a sensible choice of resolution cutoff to limit memory and CPU usage to what is expected to be needed to solve a problem. In fact, if the resolution is limited to less than needed to give a clear signal, the size of the search tree may increase so much that the ultimate memory and CPU requirements are even increased!
  8. If you wish to ensure that Phaser's memory usage does not grow to the extent that other processes are hindered, you may be able to limit its memory usage, at least under some flavours of Unix. We have not tested this ourselves, but are told that you can use the "limit" command in the csh or tcsh shells, or the "ulimit" command in the bash shell.
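For the last point, the same kind of cap that "limit" or "ulimit" sets in a shell can also be applied from Python using the standard resource module. This is an untested-with-Phaser sketch for Unix only; the helper name run_with_memory_limit is made up, and the Phaser command and input file in the usage comment are assumptions about your setup:

```python
import resource
import subprocess


def run_with_memory_limit(cmd, max_bytes, **run_kwargs):
    """Run cmd with its address space capped at max_bytes (Unix only)."""

    def set_limit():
        # Runs in the child process just before the program starts,
        # so the parent's own limits are not changed.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = max_bytes
        else:
            soft = min(max_bytes, hard)  # never exceed the existing hard limit
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=set_limit, **run_kwargs)


# Hypothetical usage, capping a job at 8 GiB:
#   with open("phaser.inp") as script:
#       run_with_memory_limit(["phaser"], 8 * 1024**3, stdin=script)
```

A job that exceeds the cap will fail with an allocation error rather than slowing down other processes, so choose the limit with the size of your problem in mind.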