Python Interface

From Phaserwiki

As an alternative to keyword input, Phaser can be called directly from a python script. This is how Phaser is called in Phenix, and we encourage developers of other automation pipelines to use the python scripting too. To call Phaser from python, you will need to have Phaser installed from source; see Source_Code#Building_Phaser_from_source.

Input-Objects, Run-Jobs, and Results-Objects

Using Phaser through the python interface is similar to using Phaser through the keyword interface. Each mode of operation of Phaser described above is controlled by an "input-object" (similar to the command script), has a Phaser "run-job" which runs the Phaser executable for the corresponding mode, and produces a "result-object" (which includes the logfile text). The user input is passed to the "input-object" with calls to set- or add- functions. Phaser is then run with a call to the "run-job" function, which takes the "input-object" as its controlling argument. Results are retrieved from the "result-object" with get-functions.

Functionality                   Input-Object        Run-Job            Results-Object
Anisotropy Correction           i = InputANO()      r = runANO(i)      ResultANO()
Cell Content Analysis           i = InputCCA()      r = runCCA(i)      ResultCCA()
Normal Mode Analysis            i = InputNMA()      r = runNMA(i)      ResultNMA()
Translational NCS Analysis      i = InputNCS()      r = runNCS(i)      ResultNCS()
Automated MR                    i = InputMR_AUTO()  r = runMR_AUTO(i)  ResultMR()
Rotation Function               i = InputMR_FRF()   r = runMR_FRF(i)   ResultMR_RF()
Translation Function            i = InputMR_FTF()   r = runMR_FTF(i)   ResultMR_TF()
Refinement and Phasing          i = InputMR_RNP()   r = runMR_RNP(i)   ResultMR()
Log-Likelihood Gain             i = InputMR_LLG()   r = runMR_LLG(i)   ResultMR()
Packing                         i = InputMR_PAK()   r = runMR_PAK(i)   ResultMR()
Automated Experimental Phasing  i = InputEP_AUTO()  r = runEP_AUTO(i)  ResultEP()
SAD Experimental Phasing        i = InputEP_SAD()   r = runEP_SAD(i)   ResultEP()
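
The input-object, run-job, result-object sequence in the table above can be sketched as a small helper. This is a minimal sketch, not part of Phaser: run_job and configure are hypothetical names, and the helper takes the input class and run function as arguments so that it makes no assumption about how Phaser is imported. The Fake* classes are stand-ins that only let the sketch run without Phaser installed; they are not the Phaser API.

```python
def run_job(input_class, run_function, configure):
    """Create an input-object, configure it, run the job, return the result-object."""
    i = input_class()      # e.g. InputMR_AUTO from the table above
    configure(i)           # the caller applies set-/add- functions here
    return run_function(i) # e.g. runMR_AUTO from the table above


# Stand-ins so the sketch runs without Phaser installed (NOT the Phaser API).
class FakeInput(object):
    def __init__(self):
        self.root = None
        self.mute = False

    def setROOT(self, filename):
        self.root = filename

    def setMUTE(self, flag):
        self.mute = flag


class FakeResult(object):
    def __init__(self, i):
        self._i = i

    def Success(self):
        return True

    def logfile(self):
        return "root=%s" % self._i.root


def fake_run(i):
    return FakeResult(i)


def configure(i):
    # Set-functions named in this page; the values are illustrative.
    i.setROOT("example")
    i.setMUTE(True)


r = run_job(FakeInput, fake_run, configure)
if r.Success():
    print(r.logfile())
```

With Phaser installed, the same helper would presumably be called with a real pair from the table, for example InputMR_AUTO and runMR_AUTO, in place of the stand-ins.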

The major difference between running Phaser through the keyword interface and running Phaser through the python scripting is that data reading and Phaser functionality are separated. For the Phaser "run-job" functions, the reflection data (Miller indices, Fobs and SigmaFobs) are simply arrays, the space group is given as a Hall string, and the unit cell is given as an array of 6 numbers. This is an important feature of the Phaser python scripting: the Phaser "run-job" functions are not tied to mtz file input, so the data can be read in python from any file format and then passed to Phaser.
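
Because the data arrive as plain arrays, a script that reads them from its own file format can sanity-check their shapes before handing them to Phaser. The sketch below is an illustration only: check_reflection_data is a hypothetical helper, plain Python lists stand in for the scitbx array types, and the reflection values are made-up numbers, not real data. How the checked arrays are then passed to an input-object is not shown here.

```python
def check_reflection_data(miller, fobs, sigfobs, hall, cell):
    """Check the shapes described above; return the number of reflections."""
    if not (len(miller) == len(fobs) == len(sigfobs)):
        raise ValueError("Miller, Fobs and SigmaFobs arrays differ in length")
    if any(len(hkl) != 3 for hkl in miller):
        raise ValueError("each Miller index needs three integers")
    if len(cell) != 6:
        raise ValueError("unit cell needs 6 numbers: a, b, c, alpha, beta, gamma")
    if not hall.strip():
        raise ValueError("Hall symbol is empty")
    return len(miller)


# Made-up illustrative values (not real measurements).
miller = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
fobs = [120.5, 98.2, 45.1]
sigfobs = [3.1, 2.8, 1.9]
hall = "P 2ac 2ab"  # Hall symbol for P 21 21 21
cell = [50.0, 60.0, 70.0, 90.0, 90.0, 90.0]

n = check_reflection_data(miller, fobs, sigfobs, hall, cell)
print("%d reflections passed the shape checks" % n)
```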

For the convenience of developers and users, the python scripting comes with data-reading jiffies to read data from mtz files. (These are the same mtz reading jiffies that are used internally by Phaser when calling Phaser from keyword input.)

Functionality       Input-Object        Run-Job            Result-Object
Read Data for MR    i = InputMR_DAT()   r = runMR_DAT(i)   ResultMR_DAT()
Read Data for EP    i = InputEP_DAT()   r = runEP_DAT(i)   ResultEP_DAT()

Input-Object set- and add-Functions

The syntax of the set- and add- functions on the "input-objects" mirrors the keyword input. Each "input-object" only has set- or add- functions corresponding to the keywords that are relevant for that mode. Attempting to set a value on an "input-object" that is irrelevant for that mode will result in an error. This differs from the keyword input, where the parser simply ignores any keywords that are not relevant to the current mode.
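
A script that shares one list of settings across several modes can check for each function before calling it, which reproduces the "ignore irrelevant keywords" behaviour of the keyword parser. This is a minimal sketch under one assumption: that an irrelevant set- or add- function is simply absent from the "input-object", so that getattr can detect it. apply_settings is a hypothetical helper and StandInInput is a stand-in class, not a Phaser class.

```python
def apply_settings(input_object, settings):
    """Call each named set-/add- function; return the names the object lacks."""
    skipped = []
    for name, value in settings:
        setter = getattr(input_object, name, None)
        if callable(setter):
            setter(value)
        else:
            skipped.append(name)  # not relevant for this mode
    return skipped


# Stand-in with only two setters (NOT a Phaser class).
class StandInInput(object):
    def __init__(self):
        self.values = {}

    def setROOT(self, filename):
        self.values["ROOT"] = filename

    def setMUTE(self, flag):
        self.values["MUTE"] = flag


settings = [("setROOT", "example"), ("setMUTE", True), ("addLLGC_SCAT", "S")]
i = StandInInput()
skipped = apply_settings(i, settings)
print("skipped:", skipped)
```

Returning the skipped names, rather than discarding them silently, keeps the strictness of the python interface visible to the caller.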

Note that setting the space group by name or number does not specify the setting. It is best to set the space group via the Hall symbol, which defines the space group uniquely, including its setting.

The python interface uses standard python and cctbx/scitbx variable types.

str          string
float        double precision floating point
Miller       cctbx::miller::index<int> 
dvect3       scitbx::vec3<float> 
dmat33       scitbx::mat3<float> 
type_array   scitbx::af::shared<type> arrays
     
Examples of keyword/python equivalences
Functionality                                 Keyword             Python Set Function
Set the root filename                         ROOT filename       i.setROOT(filename)
Silence logfile output to standard output     MUTE ON             i.setMUTE(True)
Add a scattering type for llg map completion  LLGC SCATTERING S   i.addLLGC_SCAT(S)

Results-Object get-Functions

Data are extracted from the "result-objects" with get-functions. The get-functions are mostly specific to the type of "result-object" (described in sections below), but some are common to all "result-objects" (described in table below).

Ralf Grosse-Kunstleve's scitbx::af::shared<double> array type is heavily used for passing arrays into the Phaser "input-objects" and extracting arrays from the Phaser "result-objects". This is a reference-counted array type that can be used directly in python and in C++. It is part of the Phaser installation when Phaser is installed from source. The scitbx (SCIentific ToolBoX) is part of the cctbx (Computational Crystallography ToolBoX), which is hosted by SourceForge.

Functions common to all output objects
Results Objects                         Python Get Function
Exit status "success"                   r.Success()
Exit status "failure"                   r.Failure()
Type of Error (see error table)         r.ErrorName()
Message associated with error           r.ErrorMessage()
Text of Summary                         r.summary()
Text of Logfile                         r.logfile()
Text of Verbose Logfile                 r.verbose()
Text of Warning messages                r.warnings()
Text of Loggraph format tables/graphs   r.loggraph()

SYNTAX errors are not thrown in python as they are generated by keyword input.

There is no documentation for the functions available from each results object. Please see the file Outputs_bpl.cpp in the boost_python directory of the phaser source code distribution.
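
As a complement to reading Outputs_bpl.cpp, the functions on a results object can be listed from within python using the built-in dir(). The sketch below is generic Python introspection rather than a Phaser feature; public_functions is a hypothetical helper and StandInResult is a stand-in class used only so the example runs without Phaser.

```python
def public_functions(obj):
    """Return the sorted names of callable attributes that do not start with '_'."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


# Stand-in with two of the common get-functions (NOT a Phaser class).
class StandInResult(object):
    def Success(self):
        return True

    def logfile(self):
        return ""


names = public_functions(StandInResult())
for name in names:
    print(name)
```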

Error Handling

Exit status is indicated by Success() and Failure() functions of the "result-objects". Success indicates successful execution of Phaser, not that it has solved the structure! For molecular replacement jobs, the foundSolutions() function indicates that Phaser has found one or more potential solutions, the numSolutions() function returns how many solutions were found and the uniqueSolution() function returns True if only one solution was found. More detailed error information in the case of Failure is given by ErrorName() and ErrorMessage().
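
The distinction between "Phaser ran" and "Phaser found a solution" can be sketched as a helper that turns a molecular replacement result into a one-line description. describe_mr_result is a hypothetical name; it calls only the functions named in the paragraph above. StandInMR is a stand-in class, and its error message text is invented for the example.

```python
def describe_mr_result(r):
    """Summarise a molecular replacement result-object in one line."""
    if r.Failure():
        return "failed: %s: %s" % (r.ErrorName(), r.ErrorMessage())
    # From here on Phaser executed successfully, solved or not.
    if not r.foundSolutions():
        return "ran successfully, no solutions found"
    if r.uniqueSolution():
        return "ran successfully, unique solution"
    return "ran successfully, %d solutions" % r.numSolutions()


# Stand-in result-object (NOT a Phaser class).
class StandInMR(object):
    def __init__(self, ok, nsol, error=("", "")):
        self._ok, self._nsol, self._error = ok, nsol, error

    def Success(self):
        return self._ok

    def Failure(self):
        return not self._ok

    def foundSolutions(self):
        return self._nsol > 0

    def numSolutions(self):
        return self._nsol

    def uniqueSolution(self):
        return self._nsol == 1

    def ErrorName(self):
        return self._error[0]

    def ErrorMessage(self):
        return self._error[1]


print(describe_mr_result(StandInMR(True, 0)))
print(describe_mr_result(StandInMR(True, 3)))
print(describe_mr_result(StandInMR(False, 0, ("INPUT", "hypothetical message"))))
```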

Advanced Information: All errors are thrown and caught internally by the "run-jobs", and so do not generate "Runtime Errors" in the python script. In particular "INPUT" errors are not thrown by the set- or add-functions of the "input-objects", but are stored in the "input-object" and passed to the "result-object" once the "run-job" is called. Results objects are derived from std::exception, and so can be thrown. Function what() returns ErrorName() (not the ErrorMessage()).
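
Since failures do not surface as exceptions in the python script, a pipeline that prefers exception-based control flow can convert a failed result into a Python exception itself. This is a sketch of one possible wrapper, not something Phaser provides: PhaserJobError and require_success are hypothetical names, and StandInFailed is a stand-in for a failed "result-object".

```python
class PhaserJobError(RuntimeError):
    """Raised by require_success when a result-object reports Failure()."""

    def __init__(self, name, message):
        super(PhaserJobError, self).__init__("%s: %s" % (name, message))
        self.name = name
        self.message = message


def require_success(r):
    """Return r unchanged on success; raise PhaserJobError on failure."""
    if r.Failure():
        raise PhaserJobError(r.ErrorName(), r.ErrorMessage())
    return r


# Stand-in for a failed result-object (NOT a Phaser class).
class StandInFailed(object):
    def Failure(self):
        return True

    def ErrorName(self):
        return "INPUT"

    def ErrorMessage(self):
        return "hypothetical message"


try:
    require_success(StandInFailed())
    caught = None
except PhaserJobError as err:
    caught = err
    print("job failed:", err)
```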

Logfile Handling

Writing of the logfile to standard output can be silenced with the i.setMUTE(True) function. The logfile or summary text can then be sent to standard output by printing the string returned by r.logfile() or r.summary().
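
Because r.logfile() and r.summary() return text, a muted job's output can equally be written to files after the run. The sketch below assumes only that these two functions return strings; write_logs, the ".log" and ".sum" suffixes, and the StandInLogs class are all choices made for this example rather than Phaser conventions.

```python
import os
import tempfile


def write_logs(r, root):
    """Write the logfile and summary text of r to root.log and root.sum."""
    paths = {}
    for suffix, text in ((".log", r.logfile()), (".sum", r.summary())):
        path = root + suffix
        with open(path, "w") as handle:
            handle.write(text)
        paths[suffix] = path
    return paths


# Stand-in result-object (NOT a Phaser class).
class StandInLogs(object):
    def logfile(self):
        return "full log text\n"

    def summary(self):
        return "summary text\n"


workdir = tempfile.mkdtemp()
paths = write_logs(StandInLogs(), os.path.join(workdir, "example"))
print(sorted(paths.values()))
```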

Advanced Information: Setting i.setMUTE(True) prevents real-time viewing of the progress of a Phaser job, which may be inconvenient for users. If you want to view the logfile information without having it go to standard output, the logfile text can be redirected to a python string. This is done with an alternative call to the "run-job" function that also passes an "output-object" (which controls the Phaser logging methods) on which the output stream has been set to a python string. This feature of Phaser was developed thanks to Ralf Grosse-Kunstleve.

Example Scripts

Copy and edit to start using Phaser