Keyword Example Scripts
Latest revision as of 13:10, 19 February 2018
Phaser can be run from the command line
- $prompt> phaser
(or "phenix.phaser" to get the version installed with Phenix), either in a Unix terminal (Linux, MacOSX) or Windows command line. Note that for Windows, you can set up the environment and open a command prompt by clicking "Phenix command prompt" in the Phenix group from the Start menu.
After printing the Phaser Banner, the Preprocessor will ask:
- ENTER KEYWORD INPUT FROM FILE OR FROM STANDARD INPUT
Enter the keyword input
At the end of the input, start Phaser with one of the commands
- END
- QUIT
- STOP
- KILL
- EXIT
- GO
- RUN
- START
Alternatively, Phaser can be run from command scripts, as shown below. On Unix systems, if you have a full script in a file that starts with the command "phaser" (CCP4 version) or "phenix.phaser" (Phenix version), you can run the script with the "source" command, e.g.
- $prompt> source myscript.com
if your script is in a file named "myscript.com". Alternatively, for both Unix and Windows environments, you can make a script containing just the Phaser keyword commands and use redirection to run Phaser, i.e.
- $prompt> phaser < myscript.txt > myscript.log
or
- $prompt> phenix.phaser < myscript.txt > myscript.log
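The redirection approach can be sketched as a small shell script. This is only an illustration: the keyword lines are copied from the cell content analysis example further down this page, and the check for the "phaser" executable and the data file is an added safeguard, not something Phaser requires.

```shell
#!/bin/sh
# Write a file that holds only Phaser keywords (no "phaser << eof" wrapper).
# The keywords are copied from the cell content analysis example on this page.
cat > myscript.txt << 'eof'
TITLe BETA-BLIP cell content analysis
MODE CCA
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
COMPosition PROTein SEQuence beta.seq NUM 1
COMPosition PROTein SEQuence blip.seq NUM 1
ROOT beta_blip_cca
eof

# Run Phaser by redirection only if it is on the PATH and the data file exists
if command -v phaser > /dev/null 2>&1 && [ -f beta_blip.mtz ]; then
    phaser < myscript.txt > myscript.log
else
    echo "phaser or beta_blip.mtz not found; wrote myscript.txt only"
fi
```

For the Phenix version, "phenix.phaser" would replace "phaser" in both places.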
How to Define Models
Building an Ensemble from Coordinates
You have one structure as a model with 44% sequence identity to the protein in the crystal.
ENSEmble mol1 PDB structure1.pdb IDENtity .44
You have three structures as models with 44%, 39% and 35% identity to the protein in the crystal.
ENSEmble mol2 PDB structure1.pdb IDENtity .44 PDB structure2.pdb IDENtity .39 PDB structure3.pdb IDENtity .35
You have an NMR Ensemble as a model. There is no need to split the coordinates in the pdb file provided that the models are separated by MODEL and ENDMDL cards. In this case the sequence identity is not a good indication of the rms deviation of the structural coordinates to the target structure. You should use the RMS option; several test cases have succeeded where the ID was close to 100% with an RMS value of about 1.5Å. Note that, for NMR ensembles in which not all of the structure is well-defined, it is important to trim off parts of the models that diverge significantly. This can be done easily in Ensembler, with the trim=True option.
ENSEmble mol3 PDB nmr.pdb RMS 1.5
Building an Ensemble from Electron Density
You have low resolution electron density of your model. This density has been cut out and converted to structure factors in a large cell.
ENSEmble mol1 HKLIn mol1.mtz F = Fmol1 P = Pmol1 EXTEnt 23 25 29 RMS 2.0 CENTre 4 3 30 PROTein MW 10241 NUCLeic MW 0
How to Define Composition
Composition by Molecular Weight
You have one protein (with MW 21022) in the asymmetric unit
COMPosition PROTein MW 21022
You have three copies of a protein (with MW 21022) in the asymmetric unit
COMPosition PROTein MW 21022
COMPosition PROTein MW 21022
COMPosition PROTein MW 21022
Another way of entering the same thing is
COMPosition PROTein MW 21022 NUMber 3
Yet another way of entering the same thing is
COMPosition PROTein MW 63066
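The last form works because the total is simply the per-copy weight times the number of copies. A minimal shell sketch of that arithmetic, writing both equivalent keyword lines to a scratch file (the file name composition_check.txt is arbitrary and chosen here only for illustration):

```shell
#!/bin/sh
mw=21022      # molecular weight of one copy, from the example above
copies=3      # number of copies in the asymmetric unit
total=$((mw * copies))

# Two equivalent ways of describing the same composition
{
    echo "COMPosition PROTein MW $mw NUMber $copies"
    echo "COMPosition PROTein MW $total"
} > composition_check.txt
cat composition_check.txt
```

The second line printed is "COMPosition PROTein MW 63066", matching the keyword shown above.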
You have two copies of a protein (with MW 21022), two copies of a protein (with MW 9843) and RNA (with MW 32004) in the asymmetric unit
COMPosition PROTein MW 21022 NUMber 2
COMPosition PROTein MW 9843 NUMber 2
COMPosition NUCLeic MW 32004
Composition by Sequence
You have one protein (with sequence in fasta format in the file prot1.seq) in the asymmetric unit
COMPosition PROTein SEQuence prot1.seq
You have three copies of a protein (with sequence in fasta format in the file prot1.seq) in the asymmetric unit
COMPosition PROTein SEQuence prot1.seq
COMPosition PROTein SEQuence prot1.seq
COMPosition PROTein SEQuence prot1.seq
Another way of entering the same thing is
COMPosition PROTein SEQuence prot1.seq NUMber 3
Yet another way of entering the same thing is to make a sequence file with all the amino acids concatenated together (prot1.seq3)
COMPosition PROTein SEQuence prot1.seq3
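One possible way to build such a concatenated file is sketched below. The input file toy_prot1.seq, its header and its residues are placeholders invented for this illustration; with real data you would start from your own fasta file instead.

```shell
#!/bin/sh
# Toy input: the header and residues are placeholders, not a real protein
cat > toy_prot1.seq << 'eof'
>prot1
MKTAYIAKQR
QISFVKSHFS
eof

copies=3
{
    # Keep a single fasta header line
    grep '^>' toy_prot1.seq | head -n 1
    # Repeat the residue lines once per copy
    i=0
    while [ "$i" -lt "$copies" ]; do
        grep -v '^>' toy_prot1.seq
        i=$((i + 1))
    done
} > toy_prot1.seq3
```

The resulting toy_prot1.seq3 has one header line followed by the residue lines repeated three times.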
You have two copies of a protein (with sequence in fasta format in the file prot1.seq), two copies of a protein (with sequence in fasta format in the file prot2.seq) and RNA (with sequence in fasta format in the file nucl1.seq) in the asymmetric unit
COMPosition PROTein SEQuence prot1.seq NUMber 2
COMPosition PROTein SEQuence prot2.seq NUMber 2
COMPosition NUCLeic SEQuence nucl1.seq
Composition by Percentage Scattering
Each copy of Ensemble mol1 gives 22% of the scattering
COMPosition ENSEmble mol1 FRACtional 0.22
Each copy of Ensemble mol2 gives 78% of the scattering
COMPosition ENSEmble mol2 FRACtional 0.78
How to Define Solutions
To include the files you should use the preprocessor command @
@ filename.sol
@ filename.rlist
"sol" Files
One copy of mol1 with known orientation and position (fractional coordinates)
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
If the rotation function and translation function for mol1 were very clear, then there will only be one type of 6DIM solution for mol1. If the rotation and translation functions for mol2 were then not clear, there will be a series of possible 6DIM solutions for mol2.
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
SOLUtion 6DIM ENSEmble mol2 EULEr 5 183 230 FRACtional 0.71 0.54 0.81
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
SOLUtion 6DIM ENSEmble mol2 EULEr 51 93 75 FRACtional 0.08 0.57 0.25
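When a .sol file holds several alternative solutions, it can help to count how many sets it contains and how many molecules are placed in each. The sketch below does this with awk for the two-set example above. It assumes the file has the simplified form shown here, with one keyword per line; files written by Phaser itself may carry extra fields or lines that this sketch ignores. The file names example.sol and example_summary.txt are arbitrary.

```shell
#!/bin/sh
# The two-set example from above, one keyword per line
cat > example.sol << 'eof'
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
SOLUtion 6DIM ENSEmble mol2 EULEr 5 183 230 FRACtional 0.71 0.54 0.81
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
SOLUtion 6DIM ENSEmble mol2 EULEr 51 93 75 FRACtional 0.08 0.57 0.25
eof

# Count the sets, and the 6DIM (placed molecule) entries within each set
awk '
    toupper($1) ~ /^SOLU/ && toupper($2) == "SET"  { nsets++; next }
    toupper($1) ~ /^SOLU/ && toupper($2) == "6DIM" { placed[nsets]++ }
    END {
        for (i = 1; i <= nsets; i++)
            printf "set %d: %d placed molecule(s)\n", i, placed[i]
    }
' example.sol > example_summary.txt
cat example_summary.txt
```

For this input the summary reports two sets with two placed molecules each.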
"rlist" Files
There are three trial orientations to search
SOLUtion TRIAl ENSEmble mol1 EULEr 17 20 32 SCORE 4.5
SOLUtion TRIAl ENSEmble mol1 EULEr 67 65 51 SCORE 4.4
SOLUtion TRIAl ENSEmble mol1 EULEr 67 112 81 SCORE 4.3
There are two possibilities for the position of the first molecule, and two orientations to search for the first and three for the second.
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
SOLUtion TRIAl ENSEmble mol1 EULEr 44 20 32 SCORE 5.8
SOLUtion TRIAl ENSEmble mol1 EULEr 67 65 51 SCORE 5.2
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.13 0.55 0.76
SOLUtion TRIAl ENSEmble mol1 EULEr 83 9 180 SCORE 6.3
SOLUtion TRIAl ENSEmble mol1 EULEr 8 36 92 SCORE 4.2
SOLUtion TRIAl ENSEmble mol1 EULEr 48 87 10 SCORE 4.0
Fixed Partial Structure
If you have the coordinates of a partial solution with the pdb coordinates of the known structure in the correct orientation and position, then you can force Phaser to use these coordinates. Use this pdb file to define an ensemble (named "mol1" in this example). Then manually create a .sol file of the following form and include it in the Phaser command script with the @filename preprocessor command (or include it directly in the script)
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 0 0 0 FRACtional 0 0 0
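As a rough sketch, the .sol file and the keyword lines that refer to it could be written from the shell as follows. The file names partial.pdb, partial.sol and partial_keywords.txt are hypothetical, and IDENtity 100 is an assumption that the partial model is the known structure itself. The fragment is not a complete Phaser job: it would still need the MODE, HKLIn, LABIn, COMPosition and ROOT keywords used in the other examples on this page.

```shell
#!/bin/sh
# Identity placement: the coordinates are already in the correct
# orientation and position, so no rotation or translation is applied
cat > partial.sol << 'eof'
SOLUtion SET
SOLUtion 6DIM ENSEmble mol1 EULEr 0 0 0 FRACtional 0 0 0
eof

# Keyword fragment: define the ensemble from the placed coordinates
# (hypothetical file name) and pull in the .sol file with "@"
cat > partial_keywords.txt << 'eof'
ENSEmble mol1 PDB partial.pdb IDENtity 100
@partial.sol
eof
```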
Automated Molecular Replacement
Example command script for finding BETA and BLIP. This is the minimum input, using all defaults (except the ROOT filename).
#beta_blip_auto.com
phaser << eof
TITLe beta blip automatic
MODE MR_AUTO
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
SEARch ENSEmble beta NUM 1
SEARch ENSEmble blip NUM 1
ROOT beta_blip_auto # not the default
eof
Non-Automated Molecular Replacement
In special circumstances, you may need to run the steps of a structure solution separately, to gain more control over the progress of the run or to use specialized features. This can be illustrated by breaking up the solution of the beta-lactamase:BLIP complex.
Here is a job to automatically find the beta-lactamase component, which we would expect to be easier to find than BLIP (AUTO_beta.com in the tutorial directory).
phaser << eof
MODE MR_AUTO
HKLIN beta_blip_P3221.mtz
LABIN F = Fobs SIGF = Sigma
ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
SEARCH ENSEMBLE beta NUM 1
ROOT AUTO_beta
eof
Compared to the fully automated job searching for both components, the only important difference is the removal of the second SEARCH command. We could have defined the ENSEMBLE for blip, but we aren't using it in this job so it isn't necessary. Note that both COMPOSITION commands are still needed so that Phaser knows the fraction of the structure specified by beta!
Now we can use the information from the beta-lactamase solution in carrying out a rotation search for the BLIP component.
phaser << eof
MODE MR_FRF
HKLIN beta_blip_P3221.mtz
LABIN F = Fobs SIGF = Sigma
ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
SOLUTION 6DIM ENSEMBLE beta EULER 199.95 41.50 184.08 FRAC -0.4974 -0.1588 -0.2808
SEARCH ENSEMBLE blip
ROOT ROT_blip_fixbeta
eof
Note that the MODE is now MR_FRF (Fast Rotation Function). The SOLUTION 6DIM command gives information about the solution for beta that is contained in the output file AUTO_beta.sol from running AUTO_beta.com. Take a look at AUTO_beta.sol, if you ran that job. Notice that it specifies the space group (important if we had tested both possibilities, P3121 and P3221). The SOLU SET command can be used to separate different potential solutions, each of which can be used as the start of searches for further molecules, but in this case there is only one.
Instead of copying the information from AUTO_beta.sol, it is easier to just include it using the @ command. @ is a Phaser preprocessor command that allows you to read in external files and use the contents as if they were explicitly included in the script file. The script is ROT_blip_fixbeta.com in the tutorial directory.
phaser << eof
MODE MR_FRF
HKLIN beta_blip_P3221.mtz
LABIN F = Fobs SIGF = Sigma
ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
@AUTO_beta.sol
SEARCH ENSEMBLE blip
ROOT ROT_blip_fixbeta
eof
Look at the file ROT_blip_fixbeta.rlist produced by running this job ("source ROT_blip_fixbeta.com" in the tutorial directory). This file contains the rotation peaks (SOLU TRIAL commands) as well as the fixed beta-lactamase solution (SOLU 6DIM command). We can include this file in a job to run a translation search, still fixing the known beta-lactamase solution.
phaser << eof
MODE MR_FTF
HKLIN beta_blip_P3221.mtz
LABIN F = Fobs SIGF = Sigma
ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
@ROT_blip_fixbeta.rlist
ROOT TRA_blip_fixbeta
eof
What has changed?
- The MODE is now MR_FTF (Molecular Replacement - Fast Translation Function) instead of MR_FRF
- The orientations from the rotation search have been included using the @ command
- The SEARCH keyword has disappeared
Ok, that's all there is to it, so run this script (TRA_blip_fixbeta.com) and see what output you get.
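The three jobs in this section depend on each other: the rotation search reads AUTO_beta.sol and the translation search reads ROT_blip_fixbeta.rlist. A small driver script, sketched below, can run them in order and stop if an expected file is missing. This is an illustration only. It assumes the three .com scripts are in the current directory, that they can be run with "sh" as well as with "source", and that a failed job shows up as a missing output file or a nonzero exit status. The file pipeline_status.txt is an invented convenience for recording the outcome.

```shell
#!/bin/sh
# Each entry is "script:file_the_next_step_needs" (names from the text above)
steps="AUTO_beta.com:AUTO_beta.sol ROT_blip_fixbeta.com:ROT_blip_fixbeta.rlist"

status=ok
for step in $steps; do
    script=${step%%:*}
    expected=${step##*:}
    # Stop early if Phaser or the script is not available
    if ! command -v phaser > /dev/null 2>&1 || [ ! -f "$script" ]; then
        echo "phaser or $script not available; stopping before $script"
        status=skipped
        break
    fi
    sh "$script" > "${script%.com}.log" 2>&1
    # The next job includes this file with "@", so it must exist
    if [ ! -f "$expected" ]; then
        echo "$script did not produce $expected; stopping"
        status=failed
        break
    fi
done

# Final step: the translation search that reads the rotation list
if [ "$status" = ok ]; then
    sh TRA_blip_fixbeta.com > TRA_blip_fixbeta.log 2>&1 || status=failed
fi
echo "$status" > pipeline_status.txt
```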
Fast Rotation Function
Example command script for fast rotation function to find the orientation of BETA.
#beta_frf.com
phaser << eof
TITLe beta FRF
MODE MR_FRF
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
SEARCH ENSEmble beta
ROOT beta_frf
eof
Example command script for fast rotation function to find the orientation of BLIP knowing the position and orientation of BETA, with the position and orientation of BETA entered directly in the script.
#blip_frf_with_beta.com
phaser << eof
TITLe blip FRF with beta rotation and translation
MODE MR_FRF
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq #beta
COMPosition PROTein SEQuence blip.seq #blip
SEARch ENSEmble blip
SOLUtion 6DIM ENSEmble beta EULEr 201 41 184 FRACtional -0.49408 -0.15571 -0.28148
ROOT blip_frf_with_beta
eof
Example command script for fast rotation function to find the orientation of BLIP knowing only the orientation of BETA, with the orientation of BETA input using the output solution file from the beta_frf.com job above.
#blip_frf_with_beta_rot.com
phaser << eof
TITLe blip FRF with beta R
MODE MR_FRF
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
SEARch ENSEmble blip
@beta_frf.sol # solution file output by phaser
ROOT blip_frf_with_beta_rot
eof
Brute Rotation Function
Example command script for brute rotation function to find the orientation of BETA
#beta_brf.com
phaser << eof
TITLe beta BRF
MODE MR_FRF
TARGET ROTATION BRUTE
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
SEARch ENSEmble beta
ROOT beta_brf
eof
Example command script for brute rotation function to find the optimal orientation of BETA in a restricted search range and on a fine grid around the position from the fast rotation search.
#beta_brf_around.com
phaser << eof
TITLe beta BRF fine sampling
MODE MR_FRF
TARGET ROTATION BRUTE
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
SEARch ENSEmble beta
ROTAte AROUnd EULEr 201 41 184 RANGE 10
SAMPling ROTation 0.5
XYZOut ON # not the default
ROOT beta_brf_around
eof
Fast Translation Function
Example command script for finding the position of BETA after the rotation function has been run and the results output to the file beta_frf.rlist
#beta_ftf.com
phaser << eof
TITLe beta FTF
MODE MR_FTF
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
@beta_frf.rlist
ROOT beta_ftf
eof
Example command script for finding the position of BLIP after the rotation function has been run and the results output to the file blip_frf_with_beta.rlist, which has the SOLUtion 6DIM keyword input for BETA and the SOLUtion TRIAL keyword input for the orientations to try for BLIP with the translation function.
#blip_ftf_with_beta.com
phaser << eof
TITLe beta FTF
MODE MR_FTF
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
@blip_frf_with_beta.rlist
ROOT blip_ftf_with_beta
eof
Brute Translation Function
Example command script for brute translation function to find the position of BETA after the rotation function has been run
#beta_btf.com
phaser << eof
TITLe beta BTF
MODE MR_FTF
TARGET TRANSLATION BRUTE
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
@beta_frf.rlist
TRANslate AROUnd FRACtional POINt -0.49408 -0.15571 -0.28148 RANGe 5
ROOT beta_btf
eof
Refinement and Phasing
Example command script to refine a set of solutions
#beta_blip_rnp.com
phaser << eof
TITLe beta blip rigid body refinement
MODE MR_RNP
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
ROOT beta_blip_rnp # not the default
HKLOut OFF # not the default
XYZOut OFF # not the default
@beta_blip_auto.sol
eof
Log-Likelihood Gain
Example command script to rescore the solutions using a different resolution range of data and a different spacegroup
#beta_blip_llg.com
phaser << eof
TITLe beta blip solution 6A P3121
MODE MR_LLG
HKLIn beta_blip.mtz
LABIn F=F SIGF = SIGF
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
ROOT beta_blip_llg # not the default
RESOlution 6.0
SPACegroup P 31 2 1
@beta_blip_auto.sol
eof
Packing
Example command script for determining whether a set of molecular replacement solutions pack in the unit cell.
#beta_blip_pak.com
phaser << eof
TITLe beta blip packing check
MODE MR_PAK
HKLIn beta_blip.mtz
LABIn F=F SIGF=SIGF
ENSEmble beta PDB beta.pdb IDENtity 100
ENSEmble blip PDB blip.pdb IDENtity 100
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
ROOT beta_blip_pak # not the default
@beta_blip_auto.sol
eof
Automated Experimental Phasing
Do SAD phasing of insulin. This is the minimum input, using all defaults (except the ROOT filename and specifying the wavelength explicitly).
#insulin_auto.com
phaser << eof
MODE EP_AUTO
TITLe sad phasing of insulin with intrinsic sulphurs
HKLIn S-insulin.mtz
COMPosition PROTein SEQ S-insulin.seq
CRYStal insulin DATAset sad LABIn F+=F(+) SIG+=SIGF(+) F-=F(-) SIG-=SIGF(-)
CRYStal insulin DATAset sad SCATtering CUKA # default: change if necessary
LLGComplete CRYStal insulin COMPLETE ON SCATtering ELEMent S
ATOM CRYStal insulin PDB S-insulin_hyss.pdb
ROOT insulin_auto
eof
Anisotropy Correction
Example command script to correct BETA-BLIP data for anisotropy
#beta_blip_ano.com
phaser << eof
MODE ANO
TITLe beta blip data correction
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
ROOT beta_blip_ano # not the default
eof
Cell Content Analysis
Example script for cell content analysis for BETA-BLIP
#beta_blip_cca.com
phaser << eof
TITLe BETA-BLIP cell content analysis
MODE CCA
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
ROOT beta_blip_cca # not the default
eof
Translational NCS Analysis
Example script for translational NCS analysis for BETA-BLIP
#beta_blip_ncs.com
phaser << eof
TITLe BETA-BLIP translational NCS analysis
MODE NCS
HKLIn beta_blip.mtz
LABIn F=Fobs SIGF=Sigma
COMPosition PROTein SEQuence beta.seq NUM 1 #beta
COMPosition PROTein SEQuence blip.seq NUM 1 #blip
ROOT beta_blip_ncs # not the default
eof
Normal Mode Analysis
Do normal mode analysis, write out eigenfile and coordinates perturbed by default movements along mode 7 only. (If you only wanted to prepare the eigenfile but not coordinates, you could include the command "XYZOut OFF").
#beta_nma.com
phaser << eof
TITLe beta normal mode analysis
MODE NMA
ENSEmble beta PDB beta.pdb IDENtity 100
ROOT beta_nma # not the default
eof
This example shows the use of several infrequently used options. Read in previous eigenfile and write out pdb files perturbed in 0.5 Ångstrom rms intervals in "forward" (positive dq values) direction only along modes 7 and 10 (and combinations of 7 and 10), up to a maximum rms shift of 1.2Å. Normally you would want to perturb the structure in both directions, and modes 8 and 9 are more likely to be of interest than mode 10, but something like this might be useful as part of a larger, more exhaustive, exploration.
#beta_nma_pdb.com
phaser << eof
TITLe beta normal mode analysis pdb file generation
MODE NMA
ENSEmble beta PDB beta.pdb IDENtity 100
ROOT beta_nma_pdb # not the default
EIGEn READ beta_nma.mat
NMAPdb MODE 7 MODE 10
NMAPdb RMS STEP 0.5
NMAPdb RMS MAXRMS 1.2
NMAPDB RMS DIRECTION FORWARD
eof