Difference between revisions of "Python Interface"

From Phaserwiki
(Functions relevant to the Run-Jobs)
{| class="wikitable"
|-
! Keyword !!Python Set Function
|-
| ROOT filename ||i.setROOT(filename)
|-
| MUTE [ON|OFF] ||i.setMUTE(True|False)
|-
| TITLe title ||i.setTITL(title)
|-
| VERBose [ON|OFF] ||i.setVERB(True|False)
|-
| VERBose [ON|OFF] EXTRA ||i.setVERB_EXTRA(True|False)
|-
| SPACegroup name ||i.setSPAC_NAME(name)
|-
| SPACegroup number ||i.setSPAC_NUM(number)
|-
| SPACegroup Hall ||i.setSPAC_HALL(hall)
|-
| CELL a b c alpha beta gamma ||i.setCELL(a,b,c,alpha,beta,gamma)
|-
| Cell set from array of 6 numbers ||i.setCELL([a,b,c,alpha,beta,gamma])
|}
  
{| class="wikitable"
|-
! Results Objects !!Python Get Function
|-
| Text of Verbose Logfile ||r.verbose()
|-
| SpaceGroup Hall Symbol ||r.getSpaceGroupHall()
|-
| SpaceGroup Name (Hermann-Mauguin, edited for CCP4 compatibility <br>in R3 H3 R32 H32) ||r.getSpaceGroupName()
|-
| SpaceGroup Number ||r.getSpaceGroupNumber()
|-
| Number of symmetry operators ||r.getSpaceGroupNSYMM()
|-
| Number of primitive symmetry operators ||r.getSpaceGroupNSYMP()
|-
| Symmetry operator #s, Rotation matrix element i,j (range 0-2) ||r.getSpaceGroupR(s,i,j)
|-
| Symmetry operator #s, Translation vector element i (range 0-2) ||r.getSpaceGroupT(s,i)
|-
| Unit Cell (array of 6 numbers) ||r.getUnitCell()
|}
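
The space group get-functions return one matrix or vector element per call, so a full set of symmetry operators has to be assembled in a loop. The following is a minimal sketch, not part of Phaser: the helper name symmetry_operators is hypothetical, and it assumes that the operator index s starts at 0 and runs up to r.getSpaceGroupNSYMM() - 1 (the table only states the 0-2 range for i and j).

```python
def symmetry_operators(r):
    """Collect (rotation, translation) pairs from a result-object.

    Assumption: operator index s is 0-based, like the element indices i and j.
    """
    operators = []
    for s in range(r.getSpaceGroupNSYMM()):
        # 3x3 rotation matrix, fetched one element at a time
        rotation = [[r.getSpaceGroupR(s, i, j) for j in range(3)]
                    for i in range(3)]
        # translation vector with 3 elements
        translation = [r.getSpaceGroupT(s, i) for i in range(3)]
        operators.append((rotation, translation))
    return operators
```

Each entry of the returned list is a plain python pair of nested lists, which keeps the sketch independent of any particular array type.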
 
   
 
   
Line 128: Line 100:
  
 
Advanced Information: Setting i.setMUTE(True) prevents real-time viewing of the progress of a Phaser job, which may be inconvenient for users. To keep the logfile information available without sending it to standard output, the logfile text can be redirected to a python string. This is done with an alternative call to the "run-job" function that also takes an "output-object" (which controls the Phaser logging methods) on which the output stream has been set to a python string. This feature of Phaser was developed thanks to Ralf Grosse-Kunstleve.
 
==Automated Molecular Replacement==
 
 
 
{| class="wikitable"
 
|-
 
!      ResultMR !!Python Get Function
 
|-
 
|      Solutions were found (boolean) ||r.foundSolutions()
 
|-
 
|      Number of Solutions that were found (int) ||r.numSolutions()
 
|-
 
|      Only one solution found (boolean) ||r.uniqueSolution()
 
|-
 
|      LLG values for all solutions in decreasing order ||r.getValues()
 
|-
 
|      Script output file ||r.getSolFile()
 
|-
 
|      Xml output file ||r.getXmlFile()
 
|-
 
|      PDB files corresponding to solutions in decreasing LLG order ||r.getPdbFiles()
 
|-
 
|      MTZ files corresponding to solutions in decreasing LLG order ||r.getMtzFiles()
 
|-
 
|      PDB file name of top solution ||r.getTopPdbFile()
 
|-
 
|      MTZ file name of top solution ||r.getTopMtzFile()
 
|-
 
|      PDB file name number ''i'' ||r.getPdbFile(i)
 
|-
 
|      MTZ file name number ''i'' ||r.getMtzFile(i)
 
|-
 
|      All file names output ||r.getFilenames()
 
|-
 
|      Templates matching solution ''i'' (returns integer array) ||r.getTemplatesForSolution(i)
 
|- 
 
|      Solutions matching template ''i'' (returns integer array) ||r.getSolutionsForTemplate(i)
 
|-
 
|      Number of PDB files ||r.getNumPdbFiles()
 
|-
 
|      Number of MTZ files ||r.getNumMtzFiles()
 
|-
 
|      List of details of solutions (rotation, translation)<br>in decreasing LLG order (returns ''mr_solution'' type) ||r.getDotSol()
 
|-
 
|      Top solution set (returns ''mr_set'' type) ||r.getTopSet()
 
|-
 
|      Solution set number ''i'' (returns mr_set object) ||r.getSet(i)
 
|-
 
|}
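
The ResultMR get-functions listed above can be combined to report the best solution of a molecular replacement job. The following is a minimal sketch, not part of Phaser: the helper name top_mr_solution is hypothetical, and it assumes that the array returned by r.getValues() can be indexed like a python list, with the highest LLG first as stated in the table.

```python
def top_mr_solution(r):
    """Summarise the best MR solution, or return None if there is none."""
    if r.Failure() or not r.foundSolutions():
        return None
    llg_values = r.getValues()  # LLG values in decreasing order
    return {
        "n_solutions": r.numSolutions(),
        "unique": r.uniqueSolution(),
        "top_llg": llg_values[0],
        "top_pdb": r.getTopPdbFile(),
        "top_mtz": r.getTopMtzFile(),
    }
```

Returning None for both a failed job and a job without solutions keeps the caller simple; a pipeline that needs to tell the two cases apart can test r.Failure() first.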
 
 
==Reading MTZ Files for Experimental Phasing==
 
 
{| class="wikitable"
 
|-
 
!      ResultEP_DAT !!Python Get Function
 
|-
 
|      Miller Indices (array) ||r.getMiller()
 
|-
 
|      Non-anomalous F values for crystal "xtal" and dataset "wave" (array) ||r.getF(xtal,wave)
 
|-
 
|      Non-anomalous SIGF values for crystal "xtal" and dataset "wave" (array) ||r.getSIGF(xtal,wave)
 
|-
 
|      Boolean flags for F (and SIGF) present for crystal "xtal" and dataset "wave" (array) ||r.getP(xtal,wave)
 
|-
 
|      Anomalous F+ values for crystal "xtal" and dataset "wave" (array) ||r.getFpos(xtal,wave)
 
|-
 
|      Anomalous SIGF+ values for crystal "xtal" and dataset "wave" (array) ||r.getSIGFpos(xtal,wave)
 
|-
 
|      Boolean flags for F+ (and SIGF+) present for crystal "xtal" and dataset "wave" (array) ||r.getPpos(xtal,wave)
 
|-
 
|      Anomalous F- values for crystal "xtal" and dataset "wave" (array) ||r.getFneg(xtal,wave)
 
|-
 
|      Anomalous SIGF- values for crystal "xtal" and dataset "wave" (array) ||r.getSIGFneg(xtal,wave)
 
|-
 
|      Boolean flags for F- (and SIGF-) present for crystal "xtal" and dataset "wave" (array) ||r.getPneg(xtal,wave)
 
|}
 
 
==Automated Experimental Phasing==
 
 
{| class="wikitable"
 
|-
 
!      ResultEP !!Python Get Function
 
|-
 
|      Log-likelihood of refined solution ||r.getLogLikelihood()
 
|-
 
|      Miller Indices (array) ||r.getMiller()
 
|-
 
|      Boolean array flagging reflections included in electron density ||r.getSelected()
 
|-
 
|      Figures of merit for phased dataset (array) ||r.getFOM()
 
|-
 
|      Amplitudes for weighted electron density of phased dataset (array) ||r.getFWT()
 
|-
 
|      Phases for weighted electron density of phased dataset (array) ||r.getPHWT()
 
|-
 
|      Phases for electron density of phased dataset (array) ||r.getPHIB()
 
|-
 
|      Amplitudes for log-likelihood gradient map ||r.getFLLG()
 
|-
 
|      Phases for log-likelihood gradient map ||r.getPHLLG()
 
|-
 
|      Atoms included in final solution for crystal xtal ||r.getAtoms(xtalid)
 
|-
 
|      Atoms rejected from final solution for crystal xtal ||r.getRejectedAtoms(xtalid)
 
|-
 
|      f' for atomtype "type" in crystal "xtalid" dataset "wave" ||r.getFp(xtalid,wave,type)
 
|-
 
|      f" for atomtype "type" in crystal "xtalid" dataset "wave" ||r.getFdp(xtalid,wave,type)
 
|-
 
|      Name of output MTZ file ||r.getMtzFile()
 
|-
 
|      Name of output PDB file ||r.getPdbFile()
 
|-
 
|      Name of output SOL file ||r.getSolFile()
 
|-
 
|      Name of output XML file ||r.getXmlFile()
 
|-
 
|    Overall low resolution limit ||r.stats_lores()
 
|-
 
|      Overall high resolution limit ||r.stats_hires()
 
|-
 
|      Overall figure of merit for all reflections ||r.stats_fom()
 
|-
 
|      Overall figure of merit for acentrics ||r.stats_acentric_fom()
 
|-
 
|      Overall figure of merit for centrics ||r.stats_centric_fom()
 
|-
 
|      Overall figure of merit for singleton ||r.stats_singleton_fom()
 
|-
 
|      Overall number of reflections ||r.stats_num()
 
|-
 
|      Overall number of acentric reflections ||r.stats_acentric_num()
 
|-
 
|      Overall number of centric reflections ||r.stats_centric_num()
 
|-
 
|      Overall number of singleton reflections ||r.stats_singleton_num()
 
|-
 
|      Number of resolution bins for statistics ||r.stats_numbins()
 
|}
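
The overall statistics functions of ResultEP each return a single value, so it can be convenient to gather them into one structure for reporting. The following is a minimal sketch, not part of Phaser: the helper name ep_statistics and the dictionary keys are hypothetical, and only the stats_ functions listed in the table above are called.

```python
def ep_statistics(r):
    """Gather the overall phasing statistics of a ResultEP-like object."""
    return {
        "resolution": (r.stats_lores(), r.stats_hires()),
        "fom": {
            "all": r.stats_fom(),
            "acentric": r.stats_acentric_fom(),
            "centric": r.stats_centric_fom(),
            "singleton": r.stats_singleton_fom(),
        },
        "reflections": {
            "all": r.stats_num(),
            "acentric": r.stats_acentric_num(),
            "centric": r.stats_centric_num(),
            "singleton": r.stats_singleton_num(),
        },
    }
```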
 
 
==Anisotropy Correction==
 
If the data LABIN are set (e.g. to Fobs and Sigma), the column labels in the output MTZ file are Fobs_ISO and Sigma_ISO. If LABIN is not set on the input object, the default output labels are F_ISO and SIGF_ISO.
 
 
{| class="wikitable"
 
|-
 
!ResultANO !!Python Get Function
 
|-
 
|      Miller Indices (array) ||r.getMiller()
 
|-
 
|      F values (array) ||r.getF()
 
|-
 
|      SIGF values (array) ||r.getSIGF()
 
|-
 
|      Corrected F (array) ||r.getCorrectedF()
 
|-
 
|      Corrected SIGF (array) ||r.getCorrectedSIGF()
 
|-
 
|      Correction Factor ||r.getCorrection()
 
|-
 
|      Apply scale and correction factors to array ||new_array = r.getScaledCorrected(array)
 
|-
 
|      Factor to put data on absolute scale ||r.WilsonK()
 
|-
 
|      Wilson B factor ||r.WilsonB()
 
|-
 
|      Measure of anisotropy ||r.getAnisoDeltaB()
 
|-
 
|      Eigenvalues of anisotropy ||r.getEigenBs()
 
|-
 
|      Eigenvectors and Eigenvalues of anisotropy ||r.getEigenSystem()
 
|-
 
|      Name of output MTZ file ||r.getMtzFile()
 
|-
 
|      Output MTZ file corrected F label ||r.getLaboutF()
 
|-
 
|      Output MTZ file corrected SIGF label ||r.getLaboutSIGF()
 
|-
 
|      Name of output XML file ||r.getXmlFile()
 
|}
 
 
==Cell Content Analysis==
 
 
{| class="wikitable"
 
|-   
 
! ResultCCA !!Python Get Function
 
|-
 
|      Molecular weight of the assembly used for VM calculations ||r.getAssemblyVM()
 
|-
 
|      Number of multiples of the assembly within allowed VM range ||r.getNum()
 
|-
 
|      Array of the multiples (Z) of the assembly within allowed VM range ||r.getZ()
 
|-
 
|      Array of the values of VM corresponding to the multiples (Z) of the assembly ||r.getVM()
 
|-
 
|      Array of the probabilities of VM corresponding to the multiples (Z) of the assembly ||r.getProb()
 
|-
 
|      Most probable multiple (Z) of the assembly ||r.getBestZ()
 
|-
 
|      VM of the most probable multiple (Z) of the assembly ||r.getBestVM()
 
|-
 
|      Probability of the most probable multiple (Z) of the assembly ||r.getBestProb()
 
|-
 
|      XML file name ||r.getXmlFile()
 
|-
 
|      Optimal VM for space group, unit cell and resolution ||r.getOptimalVM()
 
|-
 
|      Optimal MW for space group, unit cell and resolution ||r.getOptimalMW()
 
|}
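
The three array get-functions of ResultCCA describe the same list of assembly multiples, so they can be read side by side. The following is a minimal sketch, not part of Phaser: the helper name cca_rows is hypothetical, and it assumes that r.getZ(), r.getVM() and r.getProb() return iterable arrays of equal length in matching order.

```python
def cca_rows(r):
    """Return (Z, VM, probability) rows and the most probable row."""
    rows = list(zip(r.getZ(), r.getVM(), r.getProb()))
    if not rows:
        return rows, None
    # pick the row with the highest probability
    best = max(rows, key=lambda row: row[2])
    return rows, best
```

If the assumption about matching order holds, the row picked here would be expected to agree with r.getBestZ(), r.getBestVM() and r.getBestProb(), which remain the authoritative values.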
 
 
==Normal Mode Analysis==
 
 
 
{| class="wikitable"
 
|-
 
!      ResultNMA !!Python Get Function
 
|-
 
|      Number of total perturbations along combinations of normal modes ||r.getNum()
 
|-
 
|    Array of all pdb files ||r.getPdbFiles()
 
|-
 
|      Name of pdb file for perturbation #i ||r.getPdbFile(i)
 
|-
 
|      Array of normal modes contributing to perturbation #i ||r.getModes(i)
 
|-
 
|      Array of displacements along modes contributing to perturbation #i ||r.getDisplacements(i)
 
|-
 
|      Script output file ||r.getSolFile()
 
|-
 
|      Xml output file ||r.getXmlFile()
 
|}
 

Revision as of 13:32, 15 May 2012

As an alternative to keyword input, Phaser can be called directly from a python script. This is the way Phaser is called in Phenix and we encourage developers of other automation pipelines to use the python scripting too. In order to call Phaser in python you will need to have Phaser installed from source.

Input-Objects, Run-Jobs, and Results-Objects

Using Phaser through the python interface is similar to using Phaser through the keyword interface. Each mode of operation of Phaser described above is controlled by an "input-object" (similar to the command script), has a Phaser "run-job" which runs the Phaser executable for the corresponding mode, and produces a "result-object" (which includes the logfile text). The user input is passed to the "input-object" with calls to set- or add- functions. Phaser is then run with a call to the "run-job" function, which takes the "input-object" for control. Results are returned from the "result-object" with get-functions.

{| class="wikitable"
|-
! Functionality !!Input-Object !!Run-Job !!Results-Object
|-
| Anisotropy Correction ||i = InputANO() ||r = runANO(i) ||ResultANO()
|-
| Cell Content Analysis ||i = InputCCA() ||r = runCCA(i) ||ResultCCA()
|-
| Normal Mode Analysis ||i = InputNMA() ||r = runNMA(i) ||ResultNMA()
|-
| Automated MR ||i = InputMR_AUTO() ||r = runMR_AUTO(i) ||ResultMR()
|-
| Fast Rotation Function ||i = InputMR_FRF() ||r = runMR_FRF(i) ||ResultMR_RF()
|-
| Brute Rotation Function ||i = InputMR_BRF() ||r = runMR_BRF(i) ||ResultMR_RF()
|-
| Fast Translation Function ||i = InputMR_FTF() ||r = runMR_FTF(i) ||ResultMR_TF()
|-
| Brute Translation Function ||i = InputMR_BTF() ||r = runMR_BTF(i) ||ResultMR_TF()
|-
| Refinement and Phasing ||i = InputMR_RNP() ||r = runMR_RNP(i) ||ResultMR()
|-
| Log-Likelihood Gain ||i = InputMR_LLG() ||r = runMR_LLG(i) ||ResultMR()
|-
| Packing ||i = InputMR_PAK() ||r = runMR_PAK(i) ||ResultMR()
|-
| Automated Experimental Phasing ||i = InputEP_AUTO() ||r = runEP_AUTO(i) ||ResultEP()
|-
| SAD Experimental Phasing ||i = InputEP_SAD() ||r = runEP_SAD(i) ||ResultEP()
|}
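
The three steps described above (fill an "input-object", call the "run-job", query the "result-object") can be sketched as a small helper. This is a minimal sketch, not part of Phaser: the helper name run_phaser_job and its configure callback are hypothetical, and it assumes only that each input-object class is constructed without arguments, as the table shows.

```python
def run_phaser_job(input_class, run_job, configure):
    """Run one Phaser mode and return its result-object.

    input_class: an input-object class from the table, e.g. InputCCA
    run_job:     the matching run-job function, e.g. runCCA
    configure:   caller-supplied function that calls set-/add-functions
    """
    i = input_class()   # input-object, plays the role of the keyword script
    configure(i)        # user input goes in through set- and add-functions
    r = run_job(i)      # the run-job takes the input-object for control
    return r            # result-object, to be queried with get-functions
```

With Phaser installed from source, a call might look like r = run_phaser_job(InputCCA, runCCA, lambda i: i.setROOT("cca_test")), assuming that InputCCA and runCCA have been imported from the phaser python module.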

The major difference between running Phaser through the keyword interface and running Phaser through the python scripting is that the data reading and Phaser functionality are separated. For the Phaser "run-job" functions, the reflection data (for Miller indices, Fobs and SigmaFobs) are simply arrays, the space group is given as a Hall string, and the unit cell is given as an array of 6 numbers. This is an important feature of the Phaser python scripting as it means that the Phaser "run-job" functions are not tied to mtz file input, but the data can be read in python from any file format, and then the data passed to Phaser.

For the convenience of developers and users, the python scripting comes with data-reading jiffies to read data from mtz files. (These are the same mtz reading jiffies that are used internally by Phaser when calling Phaser from keyword input.)

{| class="wikitable"
|-
! Functionality !!Input-Object !!Run-Job !!Result-Object
|-
| Read Data for MR ||i = InputMR_DAT() ||r = runMR_DAT(i) ||ResultMR_DAT()
|-
| Read Data for EP ||i = InputEP_DAT() ||r = runEP_DAT(i) ||ResultEP_DAT()
|}

Input-Object set- and add-Functions

The syntax of the set- and add- functions on the "input-objects" mirrors the keyword input. Each "input-object" only has set- or add- functions corresponding to the keywords that are relevant for that mode. Attempting to set a value on an "input-object" that is irrelevant for that mode will result in an error. This differs from the keyword input, where the parser simply ignores any keywords that are not relevant to the current mode. Some functions are common to all input-objects (described in the table below).

Note that setting the space group by name or number does not specify the setting. It is best to set the space group via the Hall symbol, which is unique to the full definition of the space group.

{| class="wikitable"
|-
! Keyword !!Python Set Function
|-
| ROOT filename ||i.setROOT(filename)
|-
| MUTE [ON|OFF] ||i.setMUTE(True|False)
|-
| TITLe title ||i.setTITL(title)
|-
| VERBose [ON|OFF] ||i.setVERB(True|False)
|}
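
Because the four set-functions in the table are common to all input-objects, they can be applied by one routine regardless of the mode. The following is a minimal sketch, not part of Phaser: the helper name apply_common_settings and its default values are hypothetical, and only the four set-functions from the table are called.

```python
def apply_common_settings(i, root, title, mute=False, verbose=False):
    """Apply the settings shared by every input-object and return it."""
    i.setROOT(root)      # ROOT filename
    i.setTITL(title)     # TITLe title
    i.setMUTE(mute)      # MUTE [ON|OFF]
    i.setVERB(verbose)   # VERBose [ON|OFF]
    return i
```

Mode-specific keywords are deliberately left out, since setting a value that is irrelevant for the mode results in an error.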

Results-Object get-Functions

Data are extracted from the "result-objects" with get-functions. The get-functions are mostly specific to the type of "result-object" (described in sections below), but some are common to all "result-objects" (described in table below).

Ralf Grosse-Kunstleve's scitbx::af::shared<double> array type is heavily used for passing arrays into the Phaser "input-objects" and extracting arrays from the Phaser "result-objects". This is a reference-counted array type that can be used directly in python and in C++. It is part of the Phaser installation when Phaser is installed from source. The scitbx (SCIentific ToolBoX) is part of the cctbx (Computational Crystallography ToolBoX), which is hosted by SourceForge.

{| class="wikitable"
|-
! Results Objects !!Python Get Function
|-
| Exit status "success" ||r.Success()
|-
| Exit status "failure" ||r.Failure()
|-
| Type of Error (see error table). SYNTAX errors are not thrown in python<br>as they are generated by keyword input ||r.ErrorName()
|-
| Message associated with error ||r.ErrorMessage()
|-
| Text of Summary ||r.summary()
|-
| Text of Logfile ||r.logfile()
|-
| Text of Verbose Logfile ||r.verbose()
|}

Error Handling

Exit status is indicated by Success() and Failure() functions of the "result-objects". Success indicates successful execution of Phaser, not that it has solved the structure! For molecular replacement jobs, the foundSolutions() function indicates that Phaser has found one or more potential solutions, the numSolutions() function returns how many solutions were found and the uniqueSolution() function returns True if only one solution was found. More detailed error information in the case of Failure is given by ErrorName() and ErrorMessage().

Advanced Information: All errors are thrown and caught internally by the "run-jobs", and so do not generate "Runtime Errors" in the python script. In particular "INPUT" errors are not thrown by the set- or add-functions of the "input-objects", but are stored in the "input-object" and passed to the "result-object" once the "run-job" is called. Results objects are derived from std::exception, and so can be thrown. Function what() returns ErrorName() (not the ErrorMessage()).
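
Since errors are stored in the "result-object" rather than raised in the script, the status has to be checked explicitly after each run. The following is a minimal sketch of such a check for a molecular replacement result, not part of Phaser: the helper name describe_mr_outcome and the message wording are hypothetical, and only functions named in this section are called.

```python
def describe_mr_outcome(r):
    """Return a one-line description of an MR result-object."""
    if r.Failure():
        # Errors are stored in the result-object, not raised in the script
        return "failed (%s): %s" % (r.ErrorName(), r.ErrorMessage())
    # Success only means that Phaser ran, not that the structure is solved
    if not r.foundSolutions():
        return "ran successfully, no solutions found"
    if r.uniqueSolution():
        return "ran successfully, unique solution found"
    return "ran successfully, %d solutions found" % r.numSolutions()
```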

Logfile Handling

Writing of the logfile to standard output can be silenced with the i.setMUTE(True) function. The logfile or summary text can then be printed to standard output after the job has finished, with print r.logfile() or print r.summary().
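
The muted workflow can be sketched as a helper that silences the job and hands back the text afterwards. This is a minimal sketch, not part of Phaser: the helper name run_muted is hypothetical, and returning the summary on success and the full logfile on failure is an illustrative choice rather than documented behaviour.

```python
def run_muted(i, run_job):
    """Run a job without writing to standard output and return its text."""
    i.setMUTE(True)         # silence the logfile on standard output
    r = run_job(i)
    if r.Success():
        return r.summary()  # short text is usually enough after success
    return r.logfile()      # full logfile helps when diagnosing a failure
```

The returned string can then be printed or written to a file by the calling script.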

Advanced Information: Setting i.setMUTE(True) prevents real-time viewing of the progress of a Phaser job, which may be inconvenient for users. To keep the logfile information available without sending it to standard output, the logfile text can be redirected to a python string. This is done with an alternative call to the "run-job" function that also takes an "output-object" (which controls the Phaser logging methods) on which the output stream has been set to a python string. This feature of Phaser was developed thanks to Ralf Grosse-Kunstleve.