ISIM relies on two software packages, MALOC (Minimal hardware Abstraction Layer for clean Object-oriented C) and APBS (Adaptive Poisson-Boltzmann Solver), so you first have to make sure that both are installed with the directory structure given in the APBS installation instructions. In particular, you need a common top-level directory and the corresponding environment variable "TOP" pointing to it. After you have downloaded the archive, simply run the following commands in a C shell (csh or tcsh):
# cd ${TOP}
# gzip -dc isim.tar.gz | tar xvf -
This decompresses and extracts the archive and creates the directory skeleton with all necessary files. You should find them within the directory "${TOP}/isim/" afterwards. Then do:
# setenv GEN_INCLUDE ${TOP}/dist/include
# setenv GEN_LIBRARY ${TOP}/dist/lib
These two variables tell the configuration program where your shared directories for header files, libraries and executables are (if you chose a different path when installing MALOC and APBS, make sure to modify the commands correspondingly). Linking against the other applications will fail if you do not specify these paths. Then do:
# cd ${TOP}/isim/
# ./configure --prefix=${TOP}/dist
The configuration program performs many checks of the system environment, not all of which may be necessary. As long as the routine finishes without abort or exit messages, everything should be fine. Finally do:
# make
# make install
After that you should find the ISIM header files as well as the library and the executable within the "${TOP}/dist/" subdirectory tree. At any time you can undo the whole procedure, either by doing the following (in which case you have to start all over again):
# cd ${TOP}
# rm -r isim/
# rm dist/lib/{bla}/libisim.*
# rm dist/bin/{bla}/isim
# rm dist/include/isim/*
Or, preferably (because then you only have to redo the installation steps), by doing:
# cd ${TOP}/isim/
# make distclean
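Since a missing or mistyped path is the most likely reason for the configuration step to fail, it can be worth checking the environment before running ./configure. The following is only a minimal sketch in Python and not part of ISIM; the function name check_environment is made up for illustration, and it tests nothing beyond the three variables named above being set and pointing to existing directories.

```python
import os

# Environment variables named in the installation steps above.
REQUIRED_VARS = ("TOP", "GEN_INCLUDE", "GEN_LIBRARY")


def check_environment(env=None):
    """Return a list of problems; an empty list means all checks passed.

    Hypothetical helper, not shipped with ISIM.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks consistent"]:
        print(line)
```

Note that GEN_INCLUDE and GEN_LIBRARY only exist as directories once MALOC and APBS have been installed, so a complaint about them usually means that this earlier step is incomplete.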
In general you should read the reference publication (once it is available) if you want to know more
about ISIM. It covers the theoretical background, presents the algorithms and should give you all the information needed to understand
the program. This document is only meant to tell you how to use it, so this part will be fairly short.
The original idea behind ISIM is as simple as it is old. Since it is a well-established fact that the behaviour of ions in solution can be described
within an ensemble concept provided by statistical mechanics, various schemes have been extensively used to simulate electrolyte
solutions. The grand canonical ensemble (subsequently abbreviated as GCE) has the fundamental advantage that particle numbers are allowed
to change (a microscopic subsystem of an electrolyte solution is clearly not well represented if the numbers of
ions present are held strictly constant, even if it is "boring bulk territory").
A Monte Carlo (MC) procedure within the GCE (then called GCMC) is normally built up by deriving probabilities of creation
and destruction for inserting or deleting particles, which are compared to a random number in order to accept
or reject the change. This decision is governed both by bulk properties (such as excess chemical potentials of solvation
or bulk concentration) and by the microscopic surroundings (via explicit pair interactions with other species).
Technically schemes can be separated into those treating particle types dependently (e.g. in electrolyte solutions
only neutral combinations would be subject to deletion or insertion in one elementary step to
inherently maintain electroneutrality) or independently, which is the way ISIM works.
Experience shows that conventional Metropolis MC moves are needed in addition to provide sufficient sampling
of the configuration space, which yields three types of elementary moves: inserting, deleting and moving particles.
There is a set of parameters determining the internal balance of the whole procedure, for which (apart from
some negative statements) no "best solution" exists. Some aspects of this discussion can be found in the
section describing just these (input) parameters.
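To make the three elementary moves more concrete, the following Python sketch shows the acceptance probabilities in their common textbook form for a single species treated independently. This is not taken from the ISIM source; the function names, the choice of units (mM, kcal/mol, cubic Angstrom) and the way the excess chemical potential enters are assumptions made for illustration. The reference publication remains the authority on what ISIM actually does.

```python
import math
import random

GAS_CONSTANT = 1.987204e-3    # kcal/(mol K)
MM_TO_PER_A3 = 6.02214076e-7  # particles per cubic Angstrom at 1 mM


def beta(temperature):
    """Inverse thermal energy in mol/kcal."""
    return 1.0 / (GAS_CONSTANT * temperature)


def insertion_probability(n, conc_mM, volume, delta_u, mu_ex, temperature):
    """Textbook GCMC insertion; n is the current particle count of the
    species and delta_u = U(n+1) - U(n) in kcal/mol."""
    expected = conc_mM * MM_TO_PER_A3 * volume  # ideal bulk particle count
    log_p = math.log(expected / (n + 1)) - beta(temperature) * (delta_u - mu_ex)
    return 1.0 if log_p >= 0.0 else math.exp(log_p)


def deletion_probability(n, conc_mM, volume, delta_u, mu_ex, temperature):
    """Textbook GCMC deletion; delta_u = U(n-1) - U(n) in kcal/mol."""
    if n == 0:
        return 0.0  # nothing to delete
    expected = conc_mM * MM_TO_PER_A3 * volume
    log_p = math.log(n / expected) - beta(temperature) * (delta_u + mu_ex)
    return 1.0 if log_p >= 0.0 else math.exp(log_p)


def displacement_probability(delta_u, temperature):
    """Conventional Metropolis criterion for moving a particle."""
    return min(1.0, math.exp(-beta(temperature) * delta_u)) if delta_u > 0.0 else 1.0


def accept(probability, rng=random):
    """Compare the probability to a uniform random number."""
    return rng.random() < probability
```

The bulk concentration and the excess chemical potential enter only the insertion and deletion steps, while the pair interactions enter all three through delta_u. This is why badly chosen excess chemical potentials shift the average particle numbers but leave the conventional moves untouched.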
There are thousands of phenomena in biochemistry which are dominated by electrostatics. Ions in solution
always play a role here: if not in a microscopic picture then at least via bulk effects (charge compensation/screening).
Structural data on biomolecules has become more and more available. ISIM's typical application is a simulation
"droplet" containing any macromolecule of interest surrounded by an arbitrarily composed electrolyte solution.
The analysis focuses on the equilibrium state/structure of the latter with quantities like pair correlation
functions, ion number densities in full spatial resolution (e.g. to monitor occupation of binding sites) or
the mean potential (e.g. to detect deviations from continuum theories) being covered.
In order to compute interactions between explicit GCMC particles and the fixed solute (i.e., its atoms with their
force field radii and charges), the electrostatic potential in pure solvent as calculated by a Poisson-Boltzmann
solver (e.g. APBS) is used as reference state. The total potential acting on a GCMC particle therefore has contributions
from all other explicit particles (soft-core steric interaction + Coulomb's law), from the calculated field
of the macromolecule mentioned before, and from hard-sphere steric interactions with the fixed solute (step
function).
The term "arbitrarily composed electrolyte solution" used above brings up the issue of parameterization. Every
GCMC procedure has some crucial parameter, which is closely related to the excess chemical potential of solvation.
It governs how closely target concentrations are matched and is therefore essential for the sanity of the
procedure (when particle types are treated independently, as ISIM does, bad parameters
might yield solutions that are non-neutral on average and therefore physically irrelevant results).
Calculating or estimating self-consistent a priori values for excess chemical
potentials is very difficult. Therefore ISIM has its own parameterization scheme, which in principle allows
unrestricted choices (of course the necessary computer time scales with complexity).
GCMC simulations are of course useful in many other problems in science. The design of the software and its coding style (a C-flavor
mimicking the object-oriented concept of C++) is chosen to allow an easy implementation of new features.
Similarly refinements of the model(s) used in ISIM should be possible without too much internal "programming hassle".
What ISIM is ...
The process should terminate abnormally with a statement like
You will always get the right format as the output of a parametrization run (see below) for the system of your interest and this is
usually the way to obtain these files, as external data is only poorly available and/or applicable for this particular purpose.
Within a normal simulation the program reads in these values only once; for off-table values, multidimensional linear interpolation
is done. Right now you cannot be sure that mixtures of only neutral particles are treated correctly (because the existence
of only N-1 independent concentration variables for N species (gross electroneutrality constraint) enters the procedure implicitly).
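For readers unfamiliar with multidimensional linear interpolation, the following Python sketch shows the idea on a regular grid of independent concentration variables. It is only an illustration and not ISIM's routine; the function name multilinear and the data layout (one sorted axis per independent variable with at least two entries, table values stored by index tuple) are assumptions.

```python
from bisect import bisect_right
from itertools import product


def multilinear(axes, values, point):
    """Interpolate linearly in every dimension of a regular grid.

    axes   -- one ascending list of grid coordinates per dimension
    values -- dict mapping index tuples (i, j, ...) to tabulated values
    point  -- coordinates at which to interpolate (must lie on the grid range)
    """
    lows, fracs = [], []
    for axis, x in zip(axes, point):
        if not axis[0] <= x <= axis[-1]:
            raise ValueError("point lies outside the tabulated range")
        # Index of the lower grid neighbour, clamped so that i + 1 exists.
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        lows.append(i)
        fracs.append((x - axis[i]) / (axis[i + 1] - axis[i]))
    result = 0.0
    # Sum over the 2^d corners of the enclosing grid cell.
    for corner in product((0, 1), repeat=len(axes)):
        weight = 1.0
        for bit, frac in zip(corner, fracs):
            weight *= frac if bit else 1.0 - frac
        index = tuple(i + bit for i, bit in zip(lows, corner))
        result += weight * values[index]
    return result
```

For N species only N-1 axes are needed, since the last concentration follows from electroneutrality. The sketch refuses to extrapolate; how ISIM handles requests outside the tabulated range is not covered here.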
When you are working in parametrization mode you have to provide two different types of input files for explicit particles.
One gives guesses for the excess chemical potential, which the parametrization algorithm (see reference publication for details)
uses as seed values. These are N files named YOURNAME_GUESS, which are simple two-column data tables (first for concentration
values in mM, second for chemical potential values in kcal/mol) without a header line. The second one is used to specify the range
of concentrations you want to parametrize. These are N-1 files for the first N-1 species you specify in the IONS section
of your input file (see below) named YOURNAME_SCHEME. The concentrations of the last particle type are assumed to be dependent, so
make sure it is not a neutral (and therefore independent) one. Depending on those files and your general input file, the parametrization run will
perform the necessary empty-box (bulk) simulations. For more details please refer to the reference publication and the input
parameters section.
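The statement that the last particle type is dependent can be illustrated with a few lines of Python. The sketch assumes nothing more than gross electroneutrality, i.e. that the sum of charge times concentration over all species vanishes; the function name is hypothetical and the code is not part of ISIM.

```python
def dependent_concentration(independent, dependent_charge):
    """Concentration of the last species that makes the mixture neutral.

    independent      -- list of (concentration_mM, charge) pairs for the
                        first N-1 species
    dependent_charge -- charge of the last (dependent) species
    """
    if dependent_charge == 0:
        raise ValueError("the dependent species must carry a charge")
    net_charge = sum(conc * charge for conc, charge in independent)
    value = -net_charge / dependent_charge
    if value < 0:
        raise ValueError("no non-negative concentration restores neutrality")
    return value
```

For example, 75 mM of a divalent cation requires 150 mM of a monovalent anion, so dependent_concentration([(75.0, 2.0)], -1.0) gives 150.0. A neutral last species triggers the first error, which is the reason for the warning above.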
For runs in normal working mode another preliminary restriction is the data format for atom-composed macromolecules, which has to be PQR.
Fortunately there is a web-based conversion routine for PDB files
(with some limitations concerning atom types and unusual contents), which you can use to overcome this difficulty. It removes water and adds
hydrogens (see references mentioned on the website), but has no rule for removing ions or crystallization artifacts, some of which you might want
to remove manually instead. Then you simply have to copy the PQR file into your job directory.
As outlined in the introductory section,
ISIM's energy terms are split into three parts. The inner electrolyte energy is
calculated as the sum of all pair potentials, each a combination of Coulombic and Lennard-Jones interactions. The steric interface energy
with the macromolecule is, in the simplest assumption, a step function (better models at this point represent a severe numerical challenge when
cutoffs are to be avoided). Finally, the electrostatic interface energy requires the electrostatic potential of the macroion
in pure solvent. This last component is provided by APBS, and therefore you will need another input file containing just that quantity
written out onto a grid in dx-format. Eventually you will want to compare Poisson-Boltzmann theory with the ISIM results (since the explicit treatment
differs in its microscopic detail and steric restrictions). In that case you will have to run another APBS calculation with implicit ions.
ISIM is able to write out deviations between both results on its own analysis grid (for more details on these refer to the description of the
input parameters).
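The first two energy terms can be sketched in a few lines of Python. Keep in mind that this is an illustration under stated assumptions and not ISIM's implementation: the 12-6 form of the Lennard-Jones term, the Lorentz-Berthelot combination rules, the units (Angstrom, elementary charges, kcal/mol) and the function names are all assumed here. The third term, the grid potential from APBS, is left out.

```python
import math

COULOMB_CONSTANT = 332.0637  # kcal * Angstrom / (mol * e^2)


def pair_energy(r, q1, q2, sigma1, sigma2, eps1, eps2, permittivity=78.36):
    """Coulomb plus 12-6 Lennard-Jones energy (kcal/mol) of two explicit
    particles at distance r (Angstrom) in a uniform dielectric medium."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    sigma = 0.5 * (sigma1 + sigma2)  # Lorentz rule (assumption)
    eps = math.sqrt(eps1 * eps2)     # Berthelot rule (assumption)
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * eps * (sr6 * sr6 - sr6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / (permittivity * r)
    return coulomb + lennard_jones


def steric_interface_energy(distance, ion_radius, atom_radius):
    """Step function: infinite if the ion overlaps a solute atom, else zero."""
    return math.inf if distance < ion_radius + atom_radius else 0.0
```

An infinite steric energy simply means that the corresponding move is always rejected, which is what makes the step function cheap to evaluate compared to smoother models.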
Once you have created all necessary files and specified their
paths in the input file you should be able to do an ISIM run, which does not terminate with a statement like the one above. The following subsections
cover aspects of the input file preparation in detail.
Summary of input file specifications (many parameters are self-explanatory, so use this only as a reference):
The various input parameters are described in the following paragraph.
In a parametrization run there are only a few relevant files. The two files for monitoring the job's progress are more important than in normal simulations,
as a parametrization will always be a series of several single simulations within the framework of a processing scheme, which creates
additional opportunities for misinterpreted input, mistakes and so on. The other ones include the following:
Most of the other files within normal simulations are essentially the results of analysis functions and only meaningful if this type of
analysis has been performed properly (if applicable). The general concept is that the user makes selections determining the output frequency.
Two underlying restrictions apply. First, all averaged quantities are flushed at the end of the equilibration steps, so some files written in
the course of a simulation might contain little or no data. Second, a standard set of files is always written in the very last step (these might
sometimes be redundant, but they make it less likely that a run is completely wasted). In detail the output includes:
The most natural program to work with is the IBM Data Explorer (DX). However, it follows the common rule that an increase
in flexibility always implies a decrease in ease of use. Consequently this section covers some of the things
to keep in mind when using the "visual programs" (working scripts in DX), which can be found in the subdirectory
"examples/visual/". All the *.general output files mentioned before allow DX to read in the
grid-based data. If you look at these configuration files you will notice a line like
To use the visual programs in the example directory, switch to the simulation directory (with all the necessary files)
and do either
Of course DX provides a lot more options and features, which are ignored here although they might be useful.
For very specific tasks it is also possible to write new modules (in C). Furthermore the visual programs are far from being
optimized and are, just like this section, meant to allow users who do not know DX to visualize their results
without having to deal too much with the difficulties of using DX.
How to make ISIM run ...
There are some things you have to take care of before you can use ISIM. Since ISIM creates a lot of output files with filenames
independent of the job name, it is useful to create a directory for the job. In this directory either create an input file (see
below) or copy one of the input files from the "examples/" subdirectory into it and make the necessary modifications.
Then you can do something like:
# ${TOP}/dist/bin/{bla}/isim yourjob.in
Files for excess chemical potentials invalid.
in the file "yourjob.ERR". You have to provide external files in the working directory named YOURNAME_EXCESS, where YOURNAME
is the name you gave an explicit particle within the IONS section (also see below). The format of these files is relatively
easy. Just give N+1 columns with the first N of them containing the concentration values (in mM) for a particular scheme
and the last one the excess chemical potential value (in kcal/mol) for the species the file is made for. Two (possibly annoying)
features are the necessity for a header line and that the interpolation procedure relies on a strict order of the
concentration values.
An example file for one species within a system of three species over a narrow concentration range might look like:
# SODIUM MAGNESIUM CHLORINE kcal/mol
120.0000 35.0000 190.0000 -0.2123558011
120.0000 40.0000 200.0000 -0.2159573955
120.0000 45.0000 210.0000 -0.2222860737
130.0000 35.0000 200.0000 -0.2116926512
130.0000 40.0000 210.0000 -0.2183961345
130.0000 45.0000 220.0000 -0.2310599633
140.0000 35.0000 210.0000 -0.2124611651
140.0000 40.0000 220.0000 -0.2185297560
140.0000 45.0000 230.0000 -0.2336664020
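Because a malformed table makes the run stop with the error message quoted above, it can save time to check such a file beforehand. The Python sketch below is not part of ISIM. It assumes that the header consists of a "#" followed by the N species names and a unit label, and it interprets "strict order" as rows sorted in ascending order column by column, which matches the example but is otherwise a guess.

```python
def read_excess_table(text):
    """Parse and check an excess chemical potential table given as a string.

    Returns (species_names, rows), where each row holds N concentrations
    in mM followed by the excess chemical potential in kcal/mol.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].lstrip().startswith("#"):
        raise ValueError("missing header line")
    # Assumption: the last header token is the unit label, not a species.
    species = lines[0].lstrip("# \t").split()[:-1]
    rows = []
    for ln in lines[1:]:
        fields = [float(token) for token in ln.split()]
        if len(fields) != len(species) + 1:
            raise ValueError(f"expected {len(species) + 1} columns: {ln!r}")
        rows.append(fields)
    # Assumption: rows must ascend strictly, compared column by column.
    keys = [row[:-1] for row in rows]
    if any(a >= b for a, b in zip(keys, keys[1:])):
        raise ValueError("concentration rows are not in strictly ascending order")
    return species, rows
```

A typical use would be read_excess_table(open("SODIUM_EXCESS").read()), where the filename follows the YOURNAME_EXCESS convention for a species called SODIUM.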
#GENERAL
FORM S
RADIUS 66.0
LENGTH 180.0 /*redundant, since 'S' is selected*/
TEMPERATURE 298.0
MEDIUM_PERMITTIVITY 78.36
GENERAL_MODE N
#ENDGENERAL
#TECHNICAL
TOTAL_STEPS 1000
MINIMIZATION_STEPS 35
CREATION_DESTRUCTION_CYCLES 10
SHUFFLING_STEPS 10
SHUFFLING_MODE N
SO_MANY_ARE_SOME 40 /*redundant, since 'N' is selected*/
MAXIMUM_DISPLACEMENT 2.0
PARAMETRIZATION_MINIMUM 50
#ENDTECHNICAL
#ANALYSIS
EQUILIBRATION_STEPS 8000
GRID_RESOLUTION 60
COORDINATES_OUTPUT 2500
NUMBER_DENSITIES_OUTPUT 2500
CHARGE_DENSITY_OUTPUT 10000
POTENTIAL_OUTPUT 2500
PAIR_CORRELATION_OUTPUT 2500
POTENTIAL_FREQUENCY 50
PAIR_CORRELATION_FREQUENCY 50
PAIR_CORRELATION_RESOLUTION 0.3
PAIR_CORRELATION_SHELLS 5
#ENDANALYSIS
#IONS
MODEL L
TYPES 2
CALCIUM 75.0 2.0 2.41 0.450
CHLORINE 150.0 -1.0 4.86 0.168
#ENDIONS
#MACROMOLECULE
TYPE_ID P
STERIC_MODE S
BORN_ION_CHARGE 30.0 /*redundant, since 'P' is selected*/
BORN_ION_RADIUS 10.0 /*redundant, since 'P' is selected*/
PQR_FILENAME somename.pqr
PQR_DX_POTENTIAL_FILENAME someothername.dx
PQR_DX_REFERENCE_POTENTIAL_FILENAME yetanothername.dx
PQR_GRID_RESOLUTION 60
PQR_OUTSIDE_MESH_MODE 1
#ENDMACROMOLECULE
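If you generate or check many input files with scripts, a small reader for this block structure can be handy. The following Python sketch is not ISIM's parser. It only assumes what is visible in the example: sections opened by #NAME and closed by #ENDNAME, one keyword followed by its values per line, and comments enclosed in /* ... */. The function name parse_input is made up.

```python
import re

COMMENT = re.compile(r"/\*.*?\*/")  # C-style comments on a single line


def parse_input(text):
    """Return {section: {keyword: [values, ...]}} with all values as strings."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = COMMENT.sub("", raw).strip()
        if not line:
            continue
        if line.startswith("#END"):
            if current is None or line[4:] != current:
                raise ValueError(f"unexpected terminator: {line}")
            current = None
        elif line.startswith("#"):
            if current is not None:
                raise ValueError(f"section {current} is not closed")
            current = line[1:]
            sections[current] = {}
        elif current is None:
            raise ValueError(f"keyword outside of a section: {line}")
        else:
            keyword, *values = line.split()
            sections[current][keyword] = values
    if current is not None:
        raise ValueError(f"section {current} is not closed")
    return sections
```

The dictionaries keep the order of the lines, so the last entry of the IONS section is the species whose concentration is treated as dependent. Values are left as strings on purpose, since their meaning depends on the keyword.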
What ISIM spits out ...
ISIM has the sometimes annoying habit of inevitably creating a lot of output files. Job monitoring files, energy files and number
files are essentials (meaning that
you cannot control their existence). In detail these files, which are visibly updated every 100th step, include for an input file named
YOURFANCYJOBNAME.IN:
How to create fancy pictures ...
The output files described in the previous section are generally in very simple formats. Therefore simple
plot programs are usually sufficient to visualize two-dimensional data. The grid-based information needs
more advanced visualization software, but there are no fundamental restrictions, since - if necessary - the data format
can always be modified (scripts) to match individual requirements.
file = GRID_POTENTIAL_0150000
which is the pointer to the
corresponding data file. By default this is the last output of its type (i.e., after the last step). If you want to
use files created earlier in the simulation, just change this line; everything else in the file should be
correct and sufficient, unless you have special needs like looking at selected parts of the data (grid).
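Changing that single line by hand is easy, but for a whole series of snapshots a script may be more convenient. The Python sketch below rewrites only the first "file = ..." line of a configuration file's text and leaves everything else untouched. The function name repoint_general and the snapshot name in the usage note are hypothetical.

```python
import re

# First line of the form "file = something" (leading blanks allowed).
FILE_LINE = re.compile(r"(?m)^([ \t]*file[ \t]*=[ \t]*).*$")


def repoint_general(text, new_data_file):
    """Return the configuration text with its 'file = ...' line pointing
    at new_data_file; all other lines are kept as they are."""
    updated, count = FILE_LINE.subn(
        lambda match: match.group(1) + new_data_file, text, count=1
    )
    if count == 0:
        raise ValueError("no 'file = ...' line found")
    return updated
```

For instance, repoint_general(old_text, "GRID_POTENTIAL_0100000") would select an earlier snapshot, provided a file of that name was actually written during your run.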
The only exception to this rule occurs if you want to add explicit coordinate representations (for example snapshots
of ion configurations or selected atoms of a macromolecule) to your pictures,
because there simply is no configuration file. In case you have the simple data format
x | y | z | val
where val is some value identifying the particle, whose position is specified by the x, y and
z values, you would create a file yourname.general like:
file = /yourlocation/yourdatafile
points = N
format = ascii
interleaving = field
field = locations, ID
structure = 3-vector, scalar
type = float, float
dependency = positions, positions
end
Here N is the number of entries in your data file (for instance the number of ions in the snapshot file).
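Since N has to match the data file exactly, it is convenient to count the entries and write the configuration file with a script. The following Python sketch just reproduces the template above; the function names are made up, and it assumes one particle per non-empty line in the data file.

```python
def count_points(data_text):
    """Number of non-empty lines, i.e. particles, in the coordinate data."""
    return sum(1 for line in data_text.splitlines() if line.strip())


def coordinate_general_header(data_path, n_points):
    """Build the text of a .general file for 'x y z val' coordinate data."""
    lines = [
        f"file = {data_path}",
        f"points = {n_points}",
        "format = ascii",
        "interleaving = field",
        "field = locations, ID",
        "structure = 3-vector, scalar",
        "type = float, float",
        "dependency = positions, positions",
        "end",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to yourname.general next to your data file then gives DX everything it needs to import the coordinates.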
cp ${TOP}/isim/examples/visual/basicisim.* .
in case you do not want to include explicit coordinate representations and
cp ${TOP}/isim/examples/visual/coordisim.* .
in case you do want to.
Then you can simply run DX in your current working directory. Select Run Visual Programs... from the menu and
choose the corresponding program in the file selection dialogue. This will start the automatic execution of the
program with the default values. Most likely you will want to select Reset Server from the Connections
menu in the master panel (this stops execution). All the selections to be made in the program's main control panel are
self-explanatory, and after choosing the desired values/modes you can just run Execute Once from the
Execute menu. It might take some time to create the images (speed is of course strongly dependent on grid resolution).
Once they are there, their appearance can be adjusted (zoom, rotation, ...) by using the View Control
panel from the Options menu.
What's not so nice about ISIM (currently) ...
Some comments about the limitations of the software and especially its underlying, theoretical assumptions certainly have
to be made (the list might be rather disordered):