From Phaserwiki
Revision as of 10:16, 9 July 2009 by WikiSysop
What does Phaser do?
Phaser is a program for phasing macromolecular crystal structures with maximum likelihood methods. It currently has methods for brute force and fast likelihood-based rotation and translation functions for molecular replacement. Methods for experimental phasing are under development.
Where do I report bugs and make suggestions?
Email the Phaser development team at cimr-phaser@lists.cam.ac.uk.
If I manage to solve my structure by molecular replacement using Phaser, what paper should I cite?
J. Appl. Cryst. (2007). 40, 658-674. Phaser crystallographic software. A. J. McCoy, R. W. Grosse-Kunstleve, P. D. Adams, M. D. Winn, L.C. Storoni and R.J. Read.
I registered some time ago but my password hasn't arrived yet.
In order for your password to arrive, your email address has to be valid. If your password hasn't arrived within one hour, it is likely that you entered your email incorrectly. In that case, please register again.
If you are absolutely certain that you have entered your email address correctly and your password still hasn't arrived, please send a message to cimr-phaser@lists.cam.ac.uk and we will send out your password manually.
What can I do if Phaser fails to solve my structure with default parameters?
Try some of the suggestions for difficult cases.
What does LLG stand for?
LLG stands for Log Likelihood Gain. The likelihood is the probability that the data would have been measured, given the model, so it allows us to compare how well different models agree with the data. (In the case of molecular replacement, the model consists of the atomic coordinates plus rotation and/or translation operators applied to those coordinates.) The LLG is the difference between the log-likelihood of the model and the log-likelihood calculated from a Wilson distribution, so it measures how much better the data can be predicted with your model than with a random distribution of the same atoms. The LLG allows us to compare different models against the same data set, but the LLG values for different data sets should not be compared with each other. See our publications for more details.
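Written out as a formula, this definition might look as follows. This is only a sketch: the symbol L for the likelihood and the labels "model" and "Wilson" are our own shorthand for illustration, not notation taken from the Phaser publications, and the likelihoods are taken as logarithms, as the name Log Likelihood Gain implies.

```latex
\[
\mathrm{LLG} \;=\; \ln L(\text{data} \mid \text{model}) \;-\; \ln L(\text{data} \mid \text{Wilson})
\]
```

With this convention, a positive LLG means that the model predicts the data better than a random distribution of the same atoms would.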
What does it mean if the best LLG is negative?
This means that your model is worse than a collection of random atoms! The LLG should always be positive, and it should increase as the solution progresses. If the LLG is negative (or if it decreases, say, when carrying out a translation search after a rotation search), that tells you that you are being too optimistic about how well your model can predict the data. Check whether you might have underestimated the content of the asymmetric unit, so that your model is less complete than you thought it was. If you're certain about the content, then your molecular replacement model is not as accurate as Phaser was assuming. Look at the model (or better yet a series of related models) to see whether there might be any domain movements. If there are, either search with separate domains or with a series of models with different hinge angles. If hinge movements don't explain the problem, try increasing the RMS error that Phaser has estimated from the sequence identity. If none of this works, think about other possibilities. Do you have the wrong space group? Have you crystallised the wrong protein?
I get the error "CHECK INPUT Molecular weight of x.pdb (x) deviates more than 10% from the mean (x). This pdb file may contain domains or water not present in the other files."
The models in an ensemble must be very similar for the ensembling procedure to make sense. If you have extra loops or domains in one or more of the PDB files in an ensemble, Phaser detects this by using the variation in the molecular weight and exits. You will need to look at your PDB files and edit them appropriately.
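To see in advance which file is likely to trigger this check, you could compare the molecular weights yourself. The Python sketch below applies the 10% rule quoted in the error message to a set of weights you supply. The function name, the file names and the weights are hypothetical, and we assume you have already worked out a molecular weight for each file. How Phaser itself computes the weights is not described here.

```python
def flag_mw_outliers(weights, tolerance=0.10):
    """Return the names whose molecular weight deviates from the mean
    of all supplied weights by more than `tolerance` (a fraction)."""
    if not weights:
        return []
    mean = sum(weights.values()) / len(weights)
    return [
        name
        for name, mw in weights.items()
        if abs(mw - mean) > tolerance * mean
    ]


# Hypothetical molecular weights in Daltons for three models in one ensemble.
example = {
    "model_a.pdb": 30000.0,
    "model_b.pdb": 31000.0,
    "model_c.pdb": 36000.0,
}

for name in flag_mw_outliers(example):
    print(f"{name} deviates more than 10% from the mean")
```

Note that a single large outlier shifts the mean, so after trimming one file it is worth repeating the check on the remaining ones.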
My input space group, as specified in the MTZ file, is X. The space group reported in Phaser is Y.
The important thing for Phaser is the set of symmetry operators in the MTZ header, which defines the space group uniquely, not the name of the space group recorded in the header, which can be ambiguous if people use non-standard settings. Phaser uses Ralf Grosse-Kunstleve's cctbx library to look up the space group from the symmetry operations, and then reports the space group name from cctbx.
Phaser runs out of memory. What can I do?
  1. Large problems consume a lot of memory, and we sometimes need more than 256 MB.
  2. The easiest way to reduce memory requirements is to reduce the resolution. By default Phaser uses data to 2.5Å resolution, but we've only had a very small number of cases that could be solved at 2.5Å but not at 3Å. So the resolution can probably be reduced in typical cases to 3Å, which will shrink many arrays by a factor of (2.5/3)^3 ≈ 0.58, i.e. a reduction of nearly half, if you did have data to 2.5Å. If your molecule is very large, you can probably reduce the resolution even further, as the strength of the signal will depend to a great extent on the number of structure factors accepted.
  3. Phaser interpolates structure factors from a finely-sampled molecular transform--the sampling can be reduced by changing the BOXSCALE parameter from its default of 4 to something greater than 2.4.
  4. If it fails during the fast rotation function, the requirements there can be reduced by changing CLMN SPHERE from its default of twice the geometric mean radius of the model to something less.
  5. You might also want to check that the model contains only the atoms you want, and, if there are several models in an ensemble, that the models are properly superimposed.
  6. Phaser uses memory to keep track of potential partial solutions in the search tree, so you should not allow the search to become too large. In particular, we do not recommend increasing the allowed number of clashes to more than a very small number. Clashes arise in correct solutions only because of unconserved surface loops, so it is much better to trim such loops out from the model. As well, it is not a good idea to relax the criteria to save potential solutions too much from the defaults. For instance, for a difficult problem you may wish to keep rotations above 65% of the maximum (instead of the default of 75%), but reducing the threshold to 50% will increase the size of the search dramatically without giving much improvement in the chance of finding a solution.
  7. If you wish to ensure that Phaser's memory usage does not grow to the extent that other processes are hindered, you may be able to limit its memory usage, at least under some flavours of Unix. We have not tested this ourselves, but are told that you can use the "limit" command in the csh or tcsh shells, or the "ulimit" command in the bash shell.
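As a rough check on the resolution arithmetic in point 2, the Python sketch below tabulates how the size of resolution-dependent arrays might scale when the high-resolution limit is relaxed. It assumes only that the number of reflections inside a resolution sphere grows as 1/d^3. The function name is ours for illustration and is not part of Phaser, and real memory use will also depend on the other factors in the list.

```python
def reflection_fraction(d_old, d_new):
    """Approximate fraction of reflections kept when the high-resolution
    limit changes from d_old to d_new (both in Angstroms).

    Assumes the number of reflections scales as 1/d**3.
    """
    if d_old <= 0 or d_new <= 0:
        raise ValueError("resolution limits must be positive")
    return (d_old / d_new) ** 3


# Compare a few candidate limits against data originally cut at 2.5 A.
for d_new in (2.5, 3.0, 3.5, 4.0):
    fraction = reflection_fraction(2.5, d_new)
    print(f"limit {d_new:.1f} A: about {fraction:.2f} of the reflections")
```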
How can I use electron density as a model?
The procedure to prepare structure factors representing a masked region of electron density is discussed here. You will need to remember both the extent of the mask and the location of its centre to supply as input to Phaser.
How can I obtain a "mixed model", as described by Schwarzenbacher et al. (Acta Cryst. D60 1229-1236, 2004)?
Use the FFAS server maintained by the Godzik lab. However, the procedure to get a "mixed model" with the original conformation of conserved side chains is not immediately obvious.
  1. Enter your sequence, choose the PDB database, click Search, and when the search is finished you'll be at the Results page.
  2. From that page, click on the link to the PDB database in the "Results vs" column.
  3. For each model of interest, make a note of the percent sequence identity (quite far to the right). You'll need this to give Phaser an idea of the expected RMS error of the model. Then click on the "scwrl" link under "Psi-Blast/align/model". This will open a page on the SCWRL server.
  4. Click the check box for "Retain original conformations of conserved residues", then click "Submit query".
  5. Finally, when the result from this appears, right-click "Get mixed model" and choose "Save Link As..." (or whatever the equivalent is on your browser, e.g. click-hold on a Mac) to save the PDB file with the mixed model.
I have installed Phaser both on its own and as part of Phenix. How do I make sure that the ccp4i interface finds the correct version of Phaser?
This shouldn't be a problem with recent versions of Phenix, which have a wrapper, phenix.phaser, that is used to run the Phenix version of Phaser. However, if you have other versions of Phaser on your system, you can use the information below to control which one is used by the ccp4i GUI.
By default, ccp4i uses the program name as a command, and leaves it to the $PATH variable (at least in Unix) to find the executable. If you have more than one executable with the same name, the one that comes first in your $PATH will be used. The best way to sort this out is by using a mechanism implemented in ccp4i for resolving name conflicts, by specifying unambiguously the location of the program that ccp4i should use.
  1. Start ccp4i.
  2. Choose "System Administration"->"Configure Interface".
  3. At the bottom of the section labelled "External Programs", under "Give full path name for CCP4 programs to overcome name conflicts" choose "Add a program".
  4. Enter "phaser" in the left box (for the program name) and the full path name you want ccp4i to use (e.g. /usr/local/bin/phaser) in the right box.
  5. Choose "Save"->"Save to installation file" if you have permission to change that file, or "Save"->"Save to user's home directory" if you do not.
I have installed Phaser and the ccp4i interface, but when I try to run Phaser from the interface I get the error message "couldn't execute "phaser" no such file or directory". How can I fix this?
The problem is that the phaser executable is not in your path. You either need to put it in your path (which you do either by putting the phaser executable in a directory that is already in your path, or by changing the definition of the $PATH environment variable in your setup files, e.g. .cshrc or .bashrc), or you need to tell the ccp4i interface where to find it, as in the answer to the question just above.
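Before changing anything, it can help to see what your path currently contains. The Python sketch below lists every executable called phaser on the search path, in the order the shell would find them. The function name is hypothetical, and the sketch simplifies real shell behaviour (for example, it skips empty path entries). An empty result corresponds to the "couldn't execute" error, and more than one result corresponds to the name conflict described in the previous answer.

```python
import os


def find_executables(name, path=None):
    """Return every executable file called `name` found in the directories
    of `path` (defaults to the PATH environment variable), in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: empty entries are ignored here
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


found = find_executables("phaser")
if not found:
    print("phaser is not on your path")
else:
    print("the first entry is the one that will be run:")
    for location in found:
        print("  " + location)
```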
I am using Phaser to solve the structure of a nucleic acid, but it fails to detect serious clashes in the packing. What is going wrong?
The most likely problem is that Phaser is not recognising the residues as belonging to a nucleic acid. The residue name should be right-justified in the residue name field of ATOM records in a PDB file. So whereas 3-letter residue names for amino acids are in columns 18-20, 1-letter residue names for nucleic acids should be in column 20. If they are in column 18 or 19, they will not be recognised, and the trace atoms to detect clashes will not be picked up. A second possibility is that the atom names differ from those expected for the trace atoms.
The recent remediation of the PDB files to version 3 has created a problem for older versions of Phaser. Versions of Phaser up to 2.1.4 expect the pre-remediation residue names (e.g. A for the adenine residue in RNA, +A for modified versions such as deoxy in DNA). In the version 3 remediation of the PDB, a "D" was added to the standard names for the deoxy residues, so a new PDB file would now have DA for the adenine residue of DNA. To work around this, you can rename the residues to use the old convention (e.g. A or +A instead of DA) and Phaser will find the trace atoms for the clash detection. Newer versions of Phaser will recognise the new nomenclature.
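To check the column alignment described above, you could scan the file for residue names that are not right-justified. The Python sketch below looks at columns 18-20 of each ATOM or HETATM record and reports any residue name that does not end in column 20. The function names and the example records are hypothetical. The sketch checks alignment only and does not look at atom names or at the old versus new residue nomenclature.

```python
def misaligned_residue_names(pdb_lines):
    """Return (line_number, field) pairs for ATOM/HETATM records whose
    residue-name field (columns 18-20) is not right-justified."""
    problems = []
    for number, line in enumerate(pdb_lines, start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue
        field = line[17:20]  # columns 18-20, counting from 1
        name = field.strip()
        if name and field != name.rjust(3):
            problems.append((number, field))
    return problems


def atom_line(resname_field):
    """Hypothetical helper: build the first 26 columns of an ATOM record."""
    return (
        "ATOM  "   # columns 1-6: record name
        "    1"    # columns 7-11: atom serial number
        " "        # column 12: blank
        " P  "     # columns 13-16: atom name
        " "        # column 17: alternate location indicator
        + resname_field  # columns 18-20: residue name
        + " A   1"       # column 21 blank, chain ID, residue number
    )


records = [atom_line("  A"), atom_line("A  ")]
for number, field in misaligned_residue_names(records):
    print(f"line {number}: residue name field {field!r} is not right-justified")
```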


I get the error "No scattering atoms" (version 1.3) or "program assert (total Mw >0) failure" (version 1.2).
This error detects when no atoms are accepted by the program. Phaser throws away water molecules and atoms with zero occupancy. Please check the PDB file that you're using to make sure that it has non-water atoms with occupancy greater than zero.
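A quick way to carry out this check is to count the atoms that would survive such a filter. The Python sketch below counts ATOM and HETATM records that are not water and have an occupancy greater than zero, reading the residue name from columns 18-20 and the occupancy from columns 55-60. The function names are hypothetical, and the set of water residue names (HOH and WAT) is our assumption, since the exact names Phaser treats as water are not listed here. Records with an unreadable occupancy are simply skipped in this sketch.

```python
WATER_NAMES = {"HOH", "WAT"}  # assumption: residue names treated as water


def count_scattering_atoms(pdb_lines):
    """Count ATOM/HETATM records that are not water and have occupancy > 0."""
    count = 0
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[17:20].strip() in WATER_NAMES:  # columns 18-20
            continue
        try:
            occupancy = float(line[54:60])  # columns 55-60
        except ValueError:
            continue  # missing or unreadable occupancy: skipped here
        if occupancy > 0.0:
            count += 1
    return count


def pdb_atom_line(resname, occupancy, record="ATOM"):
    """Hypothetical helper: build a fixed-column record for the demo."""
    serial, atom, chain, resseq = 1, "CA", "A", 1
    x = y = z = 0.0
    bfactor = 20.0
    return (
        f"{record:<6}{serial:5d} {atom:<4} {resname:>3} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{bfactor:6.2f}"
    )


demo = [
    pdb_atom_line("ALA", 1.0),
    pdb_atom_line("HOH", 1.0, record="HETATM"),
    pdb_atom_line("GLY", 0.0),
]
print("atoms that would be kept:", count_scattering_atoms(demo))
```

If the count for your own file is zero, the file contains nothing but water or zero-occupancy atoms as far as this filter is concerned.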
I get the error "Negative F".
Only structure factor amplitudes greater than or equal to zero make physical sense, and Phaser assumes that their values will not be negative. If your MTZ file contains negative amplitudes, Phaser interprets this as a serious problem that you should fix before continuing. Check that the column really contains amplitudes, and not intensities (negative net intensities are physically sensible measurements). Avoid using programs that use negative values to flag amplitudes that arise from negative intensities; instead, convert intensities to amplitudes using a program like Truncate.