Python API for gmx_MMPBSA v1.5.x coming soon...¶
Python API for gmx_MMPBSA v1.4.3¶
The aim of the gmx_MMPBSA
API is to provide you with direct access to the raw data produced during a gmx_MMPBSA
calculation. By default, gmx_MMPBSA calculates an average, standard deviation, and standard error of the mean for all the generated data sets, but it does not support custom analyses. The API reads an _GMXMMPBSA_info file, from which it will determine what kind of calculation was performed, then automatically parse the output files and load the data into arrays.
Warning
The topology files you used in the gmx_MMPBSA
calculation must also be available in the location specified in the _GMXMMPBSA_info
file.
Using the API¶
We have derived a new API to reorganize the data so that it is arranged more hierarchically. This makes easier to transform the data into graphs in the gmx_MMPBSA_ana
. **The original and the current API only differ in the name of the callable function, the disposition of the data in Per-wise decomposition analysis and in the new 'delta' key.
The function load_gmxmmpbsa_info
takes the name of a gmx_MMPBSA
info file (typically _GMXMMPBSA_info
) and returns a populated mmpbsa_data
instance with all the parsed data. An example code snippet that creates a mmpbsa_data
instance from the information in _GMXMMPBSA_info
is shown below.
Important
Unlike MMPBSA.py, load_gmxmmpbsa_info
does not need to be located in the folder that contains the _GMXMMPBSA_info
file.
New in v1.4.0
from GMXMMPBSA import API as gmxMMPBSAapi
data = gmxMMPBSAapi.load_gmxmmpbsa_info("_GMXMMPBSA_info")
Properties of mmpbsa_data¶
The mmpbsa_data
class is a nested dictionary structure (mmpbsa_data
is actually derived from dict
). The various attributes of mmpbsa_data
are described below followed by the defined operators.
Attributes¶
If the numpy package is installed and available, all data arrays will be numpy.ndarray
instances. Otherwise, all data arrays will be array.array
instances with the ’d’ data type specifier (for a double precision float). The data is organized in an mmpbsa_data
instance in the following manner:
In this example, calc_key
is a dict
key that is paired to another dict
(mmpbsa_data_instance
is the first-level dict
, in this case) (Table 2). The keys of these second-level dict instances (system_component
) pair to another dict
(Table 3).
Table 2. List and description of calc_key
dict keys that may be present in instances of the mmpbsa_data
class.
Dictionary Key (calc_key) | Calculation Type |
---|---|
gb | Generalized Born Results |
pb | Poisson-Boltzmann Results |
rism gf | Gaussian Fluctuation 3D-RISM Results |
rism std | Standard 3D-RISM Results |
ie | Interaction Entropy Results |
nmode | Normal Mode Analysis Results |
qh | Quasi-harmonic Approximation Results |
Table 3. List and description of system_component keys that may be present in instances of the mmpbsa_data class.
Dictionary Key (system_component) | Description |
---|---|
complex | Data sets for the complex. (Stability & Binding) |
receptor | Data sets for the receptor. (Binding only) |
ligand | Data sets for the ligand. (Binding only) |
delta | Data sets for the delta. (Binding only) |
The keys of these inner-most (third-level) dict
instances are paired with the data arrays for that energy term (Table 4). The various dictionary keys are listed below for each level. If alanine scanning was performed, the mmpbsa_data_instance
also has a "mutant"
attribute that contains the same dictionary structure as mmpbsa_data
does for the normal system. If not, the mutant attribute is None. The only difference is that the data is accessed as follows:
Warning
All keys are case-sensitive, and if a space appears in the key, it must be present in your program. Also, if polar/non-polar decomposition is not performed for 3D-RISM
, then the POLAR SOLV
and APOLAR SOLV
keys are replaced with the single key ERISM
Table 4. List and description of energy_term keys that may be present in instances of the mmpbsa_data class. The allowed values of energy_term depend on the value of calc_key above in Table 2. The energy_term
keys are listed for each calc_key
enumerated above, accompanied by a description. The RISM keys are the same for both rism gf
and rism std
although the value of POLAR SOLV
and APOLAR SOLV
will differ depending on the method chosen. Those keys marked with * are specific to the CHARMM force field used through chamber. Those arrays are all 0 for normal Amber topology files.
Description | gb | pb | RISM |
---|---|---|---|
Bond energy | BOND | BOND | BOND |
Angle energy | ANGLE | ANGLE | ANGLE |
Dihedral Energy | DIHED | DIHED | DIHED |
Urey-Bradley* | UB | UB | — |
Improper Dihedrals* | IMP | IMP | — |
Correction Map* | CMAP | CMAP | — |
1-4 van der Waals energy | 1-4 VDW | 1-4 VDW | 1-4 VDW |
1-4 Electrostatic energy | 1-4 EEL | 1-4 EEL | 1-4 EEL |
van der Waals energy | VDWAALS | VDWAALS | VDWAALS |
Electrostatic energy | EEL | EEL | EEL |
Polar solvation energy | EGB | EPB | POLAR SOLV |
Non-polar solvation energy | ESURF | ENPOLAR | APOLAR SOLV |
Total solvation free energy | G solv | G solv | G solv |
Total gas phase free energy | G gas | G gas | G gas |
Total energy | TOTAL | TOTAL | TOTAL |
Table 5. Same as Table 4 for the entropy (nmode and qh) data.
Description | nmode | qh |
---|---|---|
Translational entropy | Translational | Translational |
Rotational entropy | Rotational | Rotational |
Vibrational entropy | Vibrational | Vibrational |
Total entropy | Total | Total |
Table 6. Same as Table 5 for the Interaction Entropy data.
Description | IE |
---|---|
Data per-frame | data |
Mean of the selected interval | value |
Star and End frames of the selected interval | frames |
Defined operators¶
In-place addition: It extends all the arrays that are common to both mmpbsa_data
instances. This is useful if, for instance, you run two gmx_MMPBSA
calculations, and you use -prefix _GMXMMPBSA2_
for the second gmx_MMPBSA
calculation, the following pseudo-code will generate a mmpbsa_data instance with all the data in concatenated arrays. The pseudo-code assumes GMXMMPBSA.API was imported as demonstrated below.
data = gmxMMPBSAapi.load_gmxmmpbsa_info("_GMXMMPBSA_info")
data += gmxMMPBSAapi.load_gmxmmpbsa_info("_GMXMMPBSA2_info")
Example API Usage¶
In many cases, the autocorrelation function of the energy can aid in the analysis of MM/PBSA data, since it provides a way of determining the statistical independence of your data points. For example, 1000 correlated snapshots provide less information, and therefore less statistical certainty, than 1000 uncorrelated snapshots. The standard error of the mean calculation performed by gmx_MMPBSA
assumes a completely uncorrelated set of snapshots, which means that it is a lower bound of the true standard error of the mean, and a plot of the autocorrelation function may help determine the actual value.
The example program below will calculate the autocorrelation function of the total energy (complex only for both the normal and alanine mutant systems) from a GB calculation and plot the resulting code using matplotlib.
import os
import sys
# append AMBERHOME/bin to sys.path
sys.path.append(os.path.join(os.getenv('AMBERHOME'), 'bin'))
# Now import the MMPBSA API
from GMXMMPBSA import API as gmxMMPBSAapi
import matplotlib.pyplot as plt
import numpy as np
data = gmxMMPBSAapi.load_gmxmmpbsa_info('_GMXMMPBSA_info')
total = data['gb']['complex']['TOTAL'].copy()
data = gmxMMPBSAapi.load_gmxmmpbsa_info('_GMXMMPBSA_info')
total_mut = data.mutant['gb']['complex']['TOTAL'].copy()
# Create a second copy of the data set. The np.correlate function does not
# normalize the correlation function, so we modify total and total2 to get
# that effect
total -= total.mean()
total /= total.std()
total2 = total.copy() / len(total)
acor = np.correlate(total, total2, 'full')
total_mut -= total_mut.mean()
total_mut /= total_mut.std()
total2_mut = total_mut.copy() / len(total_mut)
acor_mut = np.correlate(total_mut, total2_mut, 'full')
# Now generate the 'lag' axis
xdata = np.arange(0, len(total))
# The acor data set is symmetric about the origin, so only accept the
# positive lag times. Graph the result
plt.plot(xdata, acor[len(acor) // 2:], xdata, acor_mut[len(acor) // 2:])
plt.show()
Decomposition Data¶
When performing decomposition analysis, the various decomp data is stored in a separate tree of dicts referenced with the decomp
key. The key sequence is similar to the sequence for the normal
data described above, where decomp
is followed by the solvent model (GB
or PB
), followed by the species (complex
, receptor
, or ligand
) (additionally, we include delta
key), followed by the decomposition components (total, backbone, or sidechain), followed by the residue number (and residue pair for pairwise decomposition), finally followed by the contribution (internal
, van der Waals
, electrostatics
, etc.) The available keys are shown in Figure 1 (and each key is described afterwards).
Decomp Key Descriptions:
gb
All Generalized Born resultspb
All Poisson-Boltzmann resultscomplex
All results from the complex trajectoryreceptor
All results from the receptor trajectoryligand
All results from the ligand trajectorydelta
All results from delta total decomposition [ complex TDC - (receptor TDC + ligand TDC) ]TDC
All results from the total decompositionSDC
All results from the sidechain decompositionBDC
All results from the backbone decomposition#
All data from residue number#
in per-residue and per-wise decomposition (same residue numbering scheme as in each respective topology file)##
All interaction energies between residues##
and their respective pair#
in per-wise decomposition (same residue numbering scheme as in each respective topology file)
int
Internal energy contributions (see the idecomp variable description above)vdw
van der Waals energy contributionseel
Electrostatic energy contributionspol
Polar solvation free energy contributionssas
Non-polar solvation free energy contributionstot
Total free energy contributions (sum of previous 5).
Created: February 8, 2021 07:10:13