Protein-ligand binding free energy calculations with psf_dcd files¶

Info

This example can be found in the examples/psf_dcd/protein_ligand directory in the repository folder. If you didn't use gmx_MMPBSA_test before, use downgit to download the specific folder from gmx_MMPBSA GitHub repository.

In this example we are going to generate gmx_MMPBSA input files from psf_dcd files. It is strongly recommended to check how gmx_MMPBSA works before attempting any sort of calculations. See below the input files required for running gmx_MMPBSA.

Make sure to install gmx_MMPBSA in a dedicated conda environment as decribed here.

Requirements¶

In this case, gmx_MMPBSA requires:

Input File required	Type	Description
Input parameters file	`in`	Input file containing all the specifications regarding the type of calculation that is going to be performed
A trajectory file	`xtc` `pdb`	Final GROMACS MD trajectory, fitted and with no pbc.
A topology file	`top`	GROMACS topology file (The `* .itp` files defined in the topology must be in the same folder
The MD Structure+mass(db) file	`pdb`	Structure file containing the system coordinates
An index file	`ndx`	File containing the receptor and ligand in separated groups
Receptor and ligand group	`integers`	Group numbers in the index files
A Reference Structure file	`pdb`	Complex reference structure file (without hydrogens) with the desired assignment of chain ID and residue numbers

-> Must be defined -- -> Optional, but recommended -- -> Optional

Steps to generate gmx_MMPBSA files

1- The input file *.in2- The trajectory file3- The topology and the MD Structure+mass(db) files4- The index file

The input file *.in is already included in the tutorial folder, although it can be easily generated using --create_input command. The *.in input file, is a text file containing the following lines:

Sample input file for GB calculation
Input file generated by gmx_MMPBSA (v1.5.0.3)
Be careful with the variables you modify, some can have severe consequences on the results you obtain.

# General namelist variables
&general
  sys_name             = ""                                             # System name
  startframe           = 5                                           # FIRST FRAME TO ANALYZE
  endframe             = 15                                          # LAST FRAME TO ANALYZE
  interval             = 1                                              # Number of frames between adjacent frames analyzed
  forcefields          = "oldff/leaprc.ff99SB,leaprc.gaff"           # Force field(IGNORED SINCE WE ARE DEFINING THE TOPOLOGY)
  ions_parameters      = 1                                              # Define ions parameters to build the Amber topology
  PBRadii              = 3                                              # Define PBRadii to build amber topology from GROMACS files
  temperature          = 298.15                                         # Temperature
  qh_entropy           = 0                                              # Do quasi-harmonic calculation
  interaction_entropy  = 0                                              # Do Interaction Entropy calculation
  ie_segment           = 25                                             # Trajectory segment to calculate interaction entropy
  c2_entropy           = 0                                              # Do C2 Entropy calculation
  assign_chainID       = 0                                           # ASSIGN CHAINS ID
  exp_ki               = 0.0                                            # Experimental Ki in nM
  full_traj            = 0                                              # Print a full traj. AND the thread trajectories
  gmx_path             = ""                                             # Force to use this path to get GROMACS executable
  keep_files           = 2                                              # How many files to keep after successful completion
  netcdf               = 0                                              # Use NetCDF intermediate trajectories
  solvated_trajectory  = 1                                              # Define if it is necessary to cleanup the trajectories
  verbose              = 1                                              # How many energy terms to print in the final output
/

# (AMBER) Generalized-Born namelist variables
&gb
  igb                  = 5                                              # GB model to use
  intdiel              = 1.0                                            # Internal dielectric constant for sander
  extdiel              = 78.5                                           # External dielectric constant for sander
  saltcon              = 0.15                                        # SALT CONCENTRATION (M)
  surften              = 0.0072                                         # Surface tension
  surfoff              = 0.0                                            # Surface tension offset
  molsurf              = 0                                              # Use Connelly surface ('molsurf' program)
  msoffset             = 0.0                                            # Offset for molsurf calculation
  probe                = 1.4                                            # Solvent probe radius for surface area calc
  ifqnt                = 0                                              # Use QM on part of the system
  qm_theory            = ""                                             # Semi-empirical QM theory to use
  qm_residues          = ""                                             # Residues to treat with QM
  qmcharge_com         = 0                                              # Charge of QM region in complex
  qmcharge_lig         = 0                                              # Charge of QM region in ligand
  qmcharge_rec         = 0                                              # Charge of QM region in receptor
  qmcut                = 9999.0                                         # Cutoff in the QM region
/

See a detailed list of all the options in gmx_MMPBSA input file here as well as several examples

We are going to use cpptraj program from Amber to process the *.psf and *.dcd files:

Press enter after every command line

    cpptraj -p step3_input.psf
    >trajin traj.dcd
    >strip :POT,CLA,TIP3,SOD
    >trajout traj.xtc
    >run
    >exit

After executing these commands, there should be a new file in te folder, i.e. traj.xtc that is going to be used as the trajectory file.

Keep in mind

For some reason, when using charmm-gui to prepare the files, the toppar files that come inside the NAMD folder have the ATOM section commented with the ! symbol:

!ATOMS
!MASS  -1  H          1.00800 ! polar H
!MASS  -1  HC         1.00800 ! N-ter H
!MASS  -1  HA         1.00800 ! nonpolar H
!MASS  -1  HP         1.00800 ! aromatic H
!MASS  -1  HB1        1.00800 ! backbone H
!MASS  -1  HB2        1.00800 ! aliphatic backbone H, to CT2
!MASS  -1  HR1        1.00800 ! his he1, (+) his HG,HD2
...

This section can't be commented on for the purpose of this tutorial, thus you can either use a files with the ATOM section uncommented or just uncomment the section yourself:

ATOMS
MASS  -1  H          1.00800 ! polar H
MASS  -1  HC         1.00800 ! N-ter H
MASS  -1  HA         1.00800 ! nonpolar H
MASS  -1  HP         1.00800 ! aromatic H
MASS  -1  HB1        1.00800 ! backbone H
MASS  -1  HB2        1.00800 ! aliphatic backbone H, to CT2
MASS  -1  HR1        1.00800 ! his he1, (+) his HG,HD2
...

We are going to use ParmEd to convert the *.psf file into a GROMACS topology file. To do so, use the ParmEd script that is already included in the tutorial folder.

python script.py

Take your time to analyze step by step the process of converting .psf/.crd files intoa GROMACS topology with ParmEd.

# import ParmEd module
import parmed as pmd

# load psf file
psf = pmd.load_file('step3_input.psf')

#load coordinate file
psf.coordinates = pmd.load_file('step3_input.crd').coordinates

# strip ions and water
psf.strip(':POT, CLA, TIP3, LIT, SOD, RUB, CES, BAR')

# load Charmm Parameter Set. Make sure to include all the necessary force field files in this list
params = pmd.charmm.CharmmParameterSet('toppar/par_all36_carb.prm',
                                       'toppar/par_all36_cgenff.prm',
                                       'toppar/par_all36_lipid.prm',
                                       'toppar/par_all36m_prot.prm',
                                       'toppar/par_all36_na.prm',
                                       'toppar/par_interface.prm',
                                       'toppar/toppar_water_ions.str',
                                       'toppar/ben.prm')
psf.load_parameters(params)

# save GROMACS topology file
psf.save('gromacs.top')
psf.save('gromacs.pdb')

The last file we need to generate is the index file containing the groups with the receptor and ligand atoms. To do so, just use make_ndx from GROMACS and the MD Structure+mass(db) that was generated previously.

Press enter after every command line

    gmx make_ndx -f gromacs.pdb -o index.ndx
    >q

In this case, there is no need to generate new groups as GROMACS is able to parse the receptor and ligand atoms. The groups number 1 and 13 contain the receptor and the ligand atoms, respectively.

Once the gmx_MMPBSA files have been generated, the program can be run either in serial or in parallel:

SerialWith MPI

gmx_MMPBSA -O -i mmpbsa.in -cs gromacs.pdb -ct traj.xtc -ci index.ndx -cg 1 13 -cp gromacs.top -o FINAL_RESULTS_MMPBSA.dat -eo FINAL_RESULTS_MMPBSA.csv

mpirun -np 2 gmx_MMPBSA -O -i mmpbsa.in -cs gromacs.pdb -ct traj.xtc -ci index.ndx -cg 1 13 -cp gromacs.top -o FINAL_RESULTS_MMPBSA.dat -eo FINAL_RESULTS_MMPBSA.csv

Considerations¶

In this case, a single trajectory (ST) approximation is followed, which means the receptor and ligand structures and trajectories will be obtained from that of the complex. To do so, an MD Structure+mass(db) file (gromacs.pdb), an index file (index.ndx), a trajectory file (traj.xtc), and both the receptor and ligand group numbers in the index file (1 13) are needed. The mmpbsa.in input file will contain all the parameters needed for the MM/PB(GB)SA calculation. In this case, 11 frames are going to be used when performing the MM/PB(GB)SA calculation with the igb5 (GB-OBC2) model and a salt concentration = 0.15M.

A plain text output file with all the statistics (default: FINAL_RESULTS_MMPBSA.dat) and a CSV-format output file containing all energy terms for every frame in every calculation will be saved. The file name in '-eo' flag will be forced to end in [.csv] (FINAL_RESULTS_MMPBSA.csv in this case). This file is only written when specified on the command-line.

Note

Once the calculation is done, the results can be analyzed in gmx_MMPBSA_ana (if -nogui flag was not used in the command-line). Please, check the gmx_MMPBSA_ana section for more information

Last update: February 21, 2023 01:10:05
Created: March 4, 2022 10:09:43