Difference between revisions of "Keyword Example Scripts"

From Phaserwiki
(Building an Ensemble from Coordinates: Remove reference to non-existent table, and mention using Ensembler to trim NMR ensembles.)
(Add instructions about sourcing scripts.)

Revision as of 19:45, 16 February 2018

Phaser can be run from the command line

$prompt> phaser

After printing the Phaser Banner, the Preprocessor will ask:

ENTER KEYWORD INPUT FROM FILE OR FROM STANDARD INPUT

Enter the keyword input

At the end of the input, start Phaser with one of the commands

  • END
  • QUIT
  • STOP
  • KILL
  • EXIT
  • GO
  • RUN
  • START

Alternatively, phaser can be run from command scripts as below. Note that when you have a full script in a file, starting with the command "phaser" (CCP4 version) or "phenix.phaser" (Phenix version), you can run the script with the "source" command, e.g.

$prompt> source myscript.com

if your script is in a file named "myscript.com".
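As a minimal sketch of the condition described above, the following shell functions check that a script file really does start with "phaser" or "phenix.phaser" before it is sourced. The helper names first_command and is_phaser_script are hypothetical and are not part of Phaser; the check simply skips blank lines and comment lines beginning with "#".

```shell
# Hypothetical helper (not part of Phaser): print the first command word in a
# script file, skipping blank lines and comment lines that start with '#'.
first_command() {
  awk '!/^[ \t]*#/ && NF > 0 { print $1; exit }' "$1"
}

# Succeeds only if the first command is phaser (CCP4) or phenix.phaser (Phenix).
is_phaser_script() {
  case "$(first_command "$1")" in
    phaser|phenix.phaser) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here): source the script only if the check passes.
# is_phaser_script myscript.com && source myscript.com
```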


How to Define Models

Building an Ensemble from Coordinates

You have one structure as a model with 44% sequence identity to the protein in the crystal.

  ENSEmble mol1 PDB structure1.pdb IDENtity .44

You have three structures as models with 44%, 39% and 35% identity to the protein in the crystal.

  ENSEmble mol2   PDB structure1.pdb IDENtity .44 PDB structure2.pdb IDENtity .39 PDB structure3.pdb IDENtity .35

You have an NMR ensemble as a model. There is no need to split the coordinates in the PDB file, provided that the models are separated by MODEL and ENDMDL cards. In this case the sequence identity is not a good indication of the rms deviation of the structural coordinates from the target structure, so you should use the RMS option; several test cases have succeeded where the sequence identity was close to 100% with an RMS value of about 1.5Å. Note that, for NMR ensembles in which not all of the structure is well-defined, it is important to trim off parts of the models that diverge significantly. This can be done easily in Ensembler, with the trim=True option.

  ENSEmble mol3 PDB nmr.pdb RMS 1.5

Building an Ensemble from Electron Density

You have low resolution electron density of your model. This density has been cut out and converted to structure factors in a large cell.

  ENSEmble mol1  HKLIn mol1.mtz F = Fmol1 P = Pmol1 EXTEnt 23 25 29 RMS 2.0 CENTre 4 3 30 PROTein MW 10241 NUCLeic MW 0

How to Define Composition

Composition by Molecular Weight

You have one protein (with MW 21022) in the asymmetric unit

  COMPosition PROTein MW 21022

You have three copies of a protein (with MW 21022) in the asymmetric unit

  COMPosition PROTein MW 21022
  COMPosition PROTein MW 21022
  COMPosition PROTein MW 21022

Another way of entering the same thing is

  COMPosition PROTein MW 21022 NUMber 3

Yet another way of entering the same thing is

  COMPosition PROTein MW 63066

You have two copies of a protein (with MW 21022), two copies of a protein (with MW 9843) and RNA (with MW 32004) in the asymmetric unit

  COMPosition PROTein MW 21022 NUMber 2
  COMPosition PROTein MW 9843 NUMber 2
  COMPosition NUCLeic MW 32004
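The equivalence of "NUMber 3" and a single summed molecular weight in the examples above is simple arithmetic. The sketch below is only a bookkeeping aid for checking such sums before typing them into a script; the function name total_mw is hypothetical and not part of Phaser, and the second call merely adds up the total mass of the mixed protein and RNA example.

```shell
# Hypothetical helper (not part of Phaser): sum MW x NUMber over any number
# of "MW COUNT" argument pairs and print the total.
total_mw() {
  total=0
  while [ "$#" -ge 2 ]; do
    total=$(( total + $1 * $2 ))
    shift 2
  done
  echo "$total"
}

total_mw 21022 3                  # three copies of one protein: prints 63066
total_mw 21022 2 9843 2 32004 1   # two + two proteins plus RNA: prints 93734
```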

Composition by Sequence

You have one protein (with sequence in fasta format in the file prot1.seq) in the asymmetric unit

  COMPosition PROTein SEQuence prot1.seq

You have three copies of a protein (with sequence in fasta format in the file prot1.seq) in the asymmetric unit

  COMPosition PROTein SEQuence prot1.seq
  COMPosition PROTein SEQuence prot1.seq
  COMPosition PROTein SEQuence prot1.seq

Another way of entering the same thing is

  COMPosition PROTein SEQuence prot1.seq NUMber 3

Yet another way of entering the same thing is to make a sequence file with all the amino acids concatenated together (prot1.seq3)

  COMPosition PROTein SEQuence prot1.seq3
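One possible way to prepare such a concatenated file is sketched below. It assumes that prot1.seq holds a single fasta record; the function name repeat_fasta is hypothetical, and writing each copy of the sequence on one long line is a choice made for this sketch rather than a documented Phaser requirement.

```shell
# Hypothetical helper (not part of Phaser): keep the first fasta header line,
# then print the residues of the record N times, one copy per line.
repeat_fasta() {
  # $1 = input fasta file with a single record, $2 = number of copies
  awk -v n="$2" '
    /^>/ { if (!seen) print; seen = 1; next }   # header: print once only
    { gsub(/[ \t\r]/, ""); seq = seq $0 }       # gather residues, drop whitespace
    END { for (i = 0; i < n; i++) print seq }   # write n copies
  ' "$1"
}

# Usage (not run here): repeat_fasta prot1.seq 3 > prot1.seq3
```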

You have two copies of a protein (with sequence in fasta format in the file prot1.seq), two copies of a protein (with sequence in fasta format in the file prot2.seq) and RNA (with sequence in fasta format in the file nucl1.seq) in the asymmetric unit

  COMPosition PROTein SEQuence prot1.seq NUMber 2
  COMPosition PROTein SEQuence prot2.seq NUMber 2
  COMPosition NUCLeic SEQuence nucl1.seq

Composition by Percentage Scattering

Each copy of Ensemble mol1 gives 22% of the scattering

  COMPosition ENSEmble mol1 FRACtional 0.22

Each copy of Ensemble mol2 gives 78% of the scattering

  COMPosition ENSEmble mol2 FRACtional 0.78
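The percentages in the text correspond to the fractions given after FRACtional (22% becomes 0.22). As a small illustration, the sketch below prints the keyword line for a given percentage; the function name composition_fraction is hypothetical, and the keyword layout is copied from the two examples above.

```shell
# Hypothetical helper (not part of Phaser): print a COMPosition line for an
# ensemble, converting a percentage of the scattering to a fraction.
composition_fraction() {
  # $1 = ensemble name, $2 = percentage of the scattering per copy
  awk -v name="$1" -v pct="$2" \
    'BEGIN { printf "COMPosition ENSEmble %s FRACtional %.2f\n", name, pct / 100 }'
}

composition_fraction mol1 22
composition_fraction mol2 78
```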

How to Define Solutions

To include the files you should use the preprocessor command @

  @ filename.sol
  @ filename.rlist

"sol" Files

One copy of mol1 with known orientation and position (fractional coordinates)

  SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74

If the rotation function and translation function for mol1 were very clear, then there will only be one type of 6DIM solution for mol1. If the rotation and translation functions for mol2 were then not clear, there will be a series of possible 6DIM solutions for mol2.

  SOLUtion SET 
  SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
  SOLUtion 6DIM ENSEmble mol2 EULEr 5 183 230 FRACtional 0.71 0.54 0.81 
  SOLUtion SET
  SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
  SOLUtion 6DIM ENSEmble mol2 EULEr 51 93 75 FRACtional 0.08 0.57 0.25 

"rlist" Files

There are three trial orientations to search

  SOLUtion TRIAl ENSEmble mol1 EULEr 17 20 32 SCORE 4.5
  SOLUtion TRIAl ENSEmble mol1 EULEr 67 65 51 SCORE 4.4
  SOLUtion TRIAl ENSEmble mol1 EULEr 67 112 81 SCORE 4.3

There are two possibilities for the position of the first molecule, with two trial orientations to search for the first possibility and three for the second.

  SOLUtion SET
  SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.12 0.05 0.74
  SOLUtion TRIAl ENSEmble mol1 EULEr 44 20 32 SCORE 5.8
  SOLUtion TRIAl ENSEmble mol1 EULEr 67 65 51 SCORE 5.2
  SOLUtion SET
  SOLUtion 6DIM ENSEmble mol1 EULEr 17 20 32 FRACtional 0.13 0.55 0.76
  SOLUtion TRIAl ENSEmble mol1 EULEr 83 9 180 SCORE 6.3
  SOLUtion TRIAl ENSEmble mol1 EULEr 8 36 92 SCORE 4.2
  SOLUtion TRIAl ENSEmble mol1 EULEr 48 87 10 SCORE 4.0
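To check how many solution sets and trial orientations a ".sol" or ".rlist" file contains, the lines can simply be counted. This is a sketch only: the function name count_keyword is hypothetical, and the pattern assumes one keyword per line, written with at least the capitalised four-letter abbreviation used in the examples above (matching is case-insensitive).

```shell
# Hypothetical helper (not part of Phaser): count lines of the form
# "SOLUtion <keyword> ..." in a file, ignoring case.
count_keyword() {
  # $1 = start of the second keyword (e.g. SET, 6DIM or TRIA), $2 = file
  grep -ci "^[[:space:]]*SOLU[[:alpha:]]*[[:space:]][[:space:]]*$1" "$2"
}

# Usage (not run here), with the second example above saved as filename.rlist:
# count_keyword SET  filename.rlist    # number of solution sets
# count_keyword TRIA filename.rlist    # number of trial orientations
```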

Fixed Partial Structure

If you have the coordinates of a partial solution with the pdb coordinates of the known structure in the correct orientation and position, then you can force Phaser to use these coordinates. Use this pdb file to define an ensemble (named "mol1" in this example). Then manually create a .sol file of the following form and include it in the Phaser command script with the @filename preprocessor command (or include it directly in the script)

  SOLUtion SET 
  SOLUtion 6DIM ENSEmble mol1 EULEr 0 0 0 FRACtional 0 0 0
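If you prefer not to type the file by hand, a short shell function can write it. This is a sketch under the assumptions stated in the comments: the function name write_fixed_sol and the file name fixed_mol1.sol are hypothetical, and the two keyword lines are exactly those shown above.

```shell
# Hypothetical helper (not part of Phaser): write a .sol file that places an
# ensemble at the identity orientation and zero translation.
write_fixed_sol() {
  # $1 = ensemble name, $2 = output .sol file
  cat > "$2" << eof
SOLUtion SET
SOLUtion 6DIM ENSEmble $1 EULEr 0 0 0 FRACtional 0 0 0
eof
}

# Usage (not run here):
#   write_fixed_sol mol1 fixed_mol1.sol
# then include the file in the Phaser script with the line
#   @fixed_mol1.sol
```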

Automated Molecular Replacement

Example command script for finding BETA and BLIP. This is the minimum input, using all defaults (except the ROOT filename).

  #beta_blip_auto.com
  phaser << eof
  TITLe beta blip automatic
  MODE MR_AUTO
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  SEARch ENSEmble beta NUM 1
  SEARch ENSEmble blip NUM 1
  ROOT beta_blip_auto # not the default
  eof

Non-Automated Molecular Replacement

In special circumstances, you may need to run the steps of a structure solution separately, to gain more control over the progress of the run or to use specialized features. This can be illustrated by breaking up the solution of the beta-lactamase:BLIP complex.

Here is a job to automatically find the beta-lactamase component, which we would expect to be easier to find than BLIP (AUTO_beta.com in the tutorial directory).

 phaser << eof
 MODE MR_AUTO
 HKLIN beta_blip_P3221.mtz
 LABIN F = Fobs SIGF = Sigma
 ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
 COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
 COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
 SEARCH ENSEMBLE beta NUM 1 
 ROOT AUTO_beta
 eof

Compared to the fully automated job searching for both components, the only important difference is the removal of the second SEARCH command. We could have defined the ENSEMBLE for blip, but we aren't using it in this job so it isn't necessary. Note that both COMPOSITION commands are still needed so that Phaser knows the fraction of the structure specified by beta!

Now we can use the information from the beta-lactamase solution in carrying out a rotation search for the BLIP component.

 phaser << eof
 MODE MR_FRF
 HKLIN beta_blip_P3221.mtz
 LABIN F = Fobs SIGF = Sigma
 ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
 ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
 COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
 COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
 SOLUTION 6DIM ENSEMBLE beta EULER 199.95 41.50 184.08 FRAC -0.4974 -0.1588 -0.2808
 SEARCH ENSEMBLE blip
 ROOT ROT_blip_fixbeta
 eof

Note that the MODE is now MR_FRF (Fast Rotation Function). The SOLUTION 6DIM command gives information about the solution for beta that is contained in the output file AUTO_beta.sol from running AUTO_beta.com. Take a look at AUTO_beta.sol, if you ran that job. Notice that it specifies the space group (important if we had tested both possibilities, P3121 and P3221). The SOLU SET command can be used to separate different potential solutions, each of which can be used as the start of searches for further molecules, but in this case there is only one.

Instead of copying the information from AUTO_beta.sol, it is easier to just include it using the @ command. @ is a Phaser preprocessor command that allows you to read in external files and use the contents as if they were explicitly included in the script file. The script is ROT_blip_fixbeta.com in the tutorial directory.

 phaser << eof
 MODE MR_FRF
 HKLIN beta_blip_P3221.mtz
 LABIN F = Fobs SIGF = Sigma
 ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
 ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
 COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
 COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
 @AUTO_beta.sol
 SEARCH ENSEMBLE blip
 ROOT ROT_blip_fixbeta
 eof

Look at the file ROT_blip_fixbeta.rlist produced by running this job ("source ROT_blip_fixbeta.com" in the tutorial directory). This file contains the rotation peaks (SOLU TRIAL commands) as well as the fixed beta-lactamase solution (SOLU 6DIM command). We can include this file in a job to run a translation search, still fixing the known beta-lactamase solution.

 phaser << eof
 MODE MR_FTF
 HKLIN beta_blip_P3221.mtz
 LABIN F = Fobs SIGF = Sigma
 ENSEMBLE beta PDBFILE beta.pdb IDENTITY 1.0
 ENSEMBLE blip PDBFILE blip.pdb IDENTITY 1.0
 COMPOSITION PROTEIN SEQUENCE beta.seq NUM 1
 COMPOSITION PROTEIN SEQUENCE blip.seq NUM 1
 @ROT_blip_fixbeta.rlist
 ROOT TRA_blip_fixbeta
 eof

What has changed?

  • The MODE is now MR_FTF (Molecular Replacement - Fast Translation Function) instead of MR_FRF
  • The orientations from the rotation search have been included using the @ command
  • The SEARCH keyword has disappeared

Ok, that's all there is to it, so run this script (TRA_blip_fixbeta.com) and see what output you get.
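The three jobs depend on each other through their output files: AUTO_beta.sol feeds the rotation search and ROT_blip_fixbeta.rlist feeds the translation search. A possible way to chain them is sketched below. The function names require_output and run_beta_then_blip are hypothetical, the sketch assumes the three .com scripts are in the current directory with phaser on the PATH, and it only checks that each expected file exists and is non-empty, not that the solution in it is correct.

```shell
# Hypothetical helper (not part of Phaser): fail with a message if an
# expected output file is missing or empty.
require_output() {
  if [ ! -s "$1" ]; then
    echo "missing or empty output: $1" >&2
    return 1
  fi
}

# Run the three tutorial jobs in order, stopping at the first problem.
run_beta_then_blip() {
  sh AUTO_beta.com        && require_output AUTO_beta.sol &&
  sh ROT_blip_fixbeta.com && require_output ROT_blip_fixbeta.rlist &&
  sh TRA_blip_fixbeta.com
}

# Usage (not run here): run_beta_then_blip
```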

Fast Rotation Function

Example command script for fast rotation function to find the orientation of BETA.

  
  #beta_frf.com
  phaser << eof
  TITLe beta FRF
  MODE MR_FRF
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  SEARCH ENSEmble beta
  ROOT beta_frf
  eof

Example command script for fast rotation function to find the orientation of BLIP knowing the position and orientation of BETA, with the position and orientation of BETA input directly in the script using the SOLUtion 6DIM keyword.

  #blip_frf_with_beta.com
  phaser << eof
  TITLe blip FRF with beta rotation and translation
  MODE MR_FRF
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq #beta
  COMPosition PROTein SEQuence blip.seq #blip
  SEARch ENSEmble blip
  SOLUtion 6DIM ENSEmble beta EULEr 201 41 184 FRACtional -0.49408 -0.15571 -0.28148
  ROOT blip_frf_with_beta
  eof

Example command script for fast rotation function to find the orientation of BLIP knowing only the orientation of BETA, with the orientation of BETA input using the output solution file from the beta_frf.com job above.

  #blip_frf_with_beta_rot.com
  phaser << eof
  TITLe blip FRF with beta R
  MODE MR_FRF
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  SEARch ENSEmble blip
  @beta_frf.sol # solution file output by phaser
  ROOT blip_frf_with_beta_rot
  eof

Brute Rotation Function

Example command script for brute rotation function to find the orientation of BETA

  #beta_brf.com
  phaser << eof
  TITLe beta BRF
  MODE MR_FRF
  TARGET ROTATION BRUTE
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  SEARch ENSEmble beta
  ROOT beta_brf
  eof

Example command script for brute rotation function to find the optimal orientation of BETA in a restricted search range and on a fine grid around the position from the fast rotation search.

  #beta_brf_around.com
  phaser << eof
  TITLe beta BRF fine sampling
  MODE MR_FRF
  TARGET ROTATION BRUTE
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  SEARch ENSEmble beta
  ROTAte AROUnd EULEr 201 41 184 RANGE 10
  SAMPling ROTation 0.5
  XYZOut ON # not the default
  ROOT beta_brf_around
  eof

Fast Translation Function

Example command script for finding the position of BETA after the rotation function has been run and the results output to the file beta_frf.rlist

  #beta_ftf.com
  phaser << eof
  TITLe beta FTF
  MODE MR_FTF
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  @beta_frf.rlist
  ROOT beta_ftf
  eof 

Example command script for finding the position of BLIP after the rotation function has been run and the results output to the file blip_frf_with_beta.rlist, which has the SOLUtion 6DIM keyword input for BETA and the SOLUtion TRIAL keyword input for the orientations to try for BLIP with the translation function.

  #blip_ftf_with_beta.com
  phaser << eof
  TITLe blip FTF with beta
  MODE MR_FTF
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  @blip_frf_with_beta.rlist
  ROOT blip_ftf_with_beta
  eof

Brute Translation Function

Example command script for brute translation function to find the position of BETA after the rotation function has been run

  
  #beta_btf.com
  phaser << eof
  TITLe beta BTF
  MODE MR_FTF
  TARGET TRANSLATION BRUTE
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  @beta_frf.rlist
  TRANslate AROUnd FRACtional POINt -0.49408 -0.15571 -0.28148 RANGe 5
  ROOT beta_btf
  eof 

Refinement and Phasing

Example command script to refine a set of solutions

  #beta_blip_rnp.com
  phaser << eof
  TITLe beta blip rigid body refinement
  MODE MR_RNP
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  ROOT beta_blip_rnp # not the default
  HKLOut OFF # not the default
  XYZOut OFF # not the default
  @beta_blip_auto.sol
  eof 

Log-Likelihood Gain

Example command script to rescore the solutions using a different resolution range of data and a different space group

  #beta_blip_llg.com
  phaser << eof
  TITLe beta blip solution 6A P3121
  MODE MR_LLG
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  ROOT beta_blip_llg # not the default
  RESOlution 6.0
  SPACegroup P 31 2 1
  @beta_blip_auto.sol
  eof 

Packing

Example command script for determining whether a set of molecular replacement solutions pack in the unit cell.

  #beta_blip_pak.com
  phaser << eof
  TITLe beta blip packing check
  MODE MR_PAK
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ENSEmble beta PDB beta.pdb IDENtity 100
  ENSEmble blip PDB blip.pdb IDENtity 100
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  ROOT beta_blip_pak # not the default
  @beta_blip_auto.sol
  eof 

Automated Experimental Phasing

Do SAD phasing of insulin. This is the minimum input, using all defaults (except the ROOT filename and specifying the wavelength explicitly).

  #insulin_auto.com
  phaser << eof
  MODE EP_AUTO
  TITLe sad phasing of insulin with intrinsic sulphurs
  HKLIn S-insulin.mtz
  COMPosition PROTein SEQ S-insulin.seq
  CRYStal insulin DATAset sad LABIn F+=F(+) SIG+=SIGF(+) F-=F(-) SIG-=SIGF(-)
  CRYStal insulin DATAset sad SCATtering CUKA # default: change if necessary
  LLGComplete CRYStal insulin COMPLETE ON SCATtering ELEMent S
  ATOM CRYStal insulin PDB S-insulin_hyss.pdb
  ROOT insulin_auto
  eof

Anisotropy Correction

Example command script to correct BETA-BLIP data for anisotropy

 
  #beta_blip_ano.com
  phaser << eof
  MODE ANO
  TITLe beta blip data correction
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  ROOT beta_blip_ano # not the default
  eof 

Cell Content Analysis

Example script for cell content analysis for BETA-BLIP

  #beta_blip_cca.com
  phaser << eof
  TITLe BETA-BLIP cell content analysis
  MODE CCA
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  ROOT beta_blip_cca # not the default
  eof 

Translational NCS Analysis

Example script for translational NCS analysis for BETA-BLIP

  #beta_blip_ncs.com
  phaser << eof
  TITLe BETA-BLIP translational NCS analysis
  MODE NCS  
  HKLIn beta_blip.mtz
  LABIn F=Fobs SIGF=Sigma
  COMPosition PROTein SEQuence beta.seq NUM 1 #beta
  COMPosition PROTein SEQuence blip.seq NUM 1 #blip
  ROOT beta_blip_ncs # not the default
  eof 

Normal Mode Analysis

Do normal mode analysis, write out eigenfile and coordinates perturbed by default movements along mode 7 only. (If you only wanted to prepare the eigenfile but not coordinates, you could include the command "XYZOut OFF").

  
  #beta_nma.com
  phaser << eof
  TITLe beta normal mode analysis
  MODE NMA
  ENSEmble beta PDB beta.pdb IDENtity 100
  ROOT beta_nma # not the default
  eof 

This example shows the use of several infrequently used options. Read in previous eigenfile and write out pdb files perturbed in 0.5 Ångstrom rms intervals in "forward" (positive dq values) direction only along modes 7 and 10 (and combinations of 7 and 10), up to a maximum rms shift of 1.2Å. Normally you would want to perturb the structure in both directions, and modes 8 and 9 are more likely to be of interest than mode 10, but something like this might be useful as part of a larger, more exhaustive, exploration.

  #beta_nma_pdb.com
  phaser << eof
  TITLe beta normal mode analysis pdb file generation
  MODE NMA
  ENSEmble beta PDB beta.pdb IDENtity 100
  ROOT beta_nma_pdb # not the default
  EIGEn READ beta_nma.mat
  NMAPdb MODE 7 MODE 10 
  NMAPdb RMS STEP 0.5 
  NMAPdb RMS MAXRMS 1.2 
  NMAPDB RMS DIRECTION FORWARD
  eof