It looks as though the molecular dynamics simulations are nearing an end. Think I better get a move on with the scripting business. Unfortunately, no one really writes books with code for the newer languages, so I'm stuck reading Computer Simulations of Liquids by Allen. It's a very helpful tool, but the pseudo-code they give is for FORTRAN and can be a bit unclear. This is especially the case when dealing with for loops and indexing entries in an array based on the values in the loop. You have to learn bits and pieces of nearly every programming language that's made its way to being used in computational science. Books today on languages like Python pretty much suck... I don't care how to plot y = xsin(x), or create an array with all of the y-values when x ranges from 0 to 50. No one does the job the guys of old did when they literally wrote a book that you could code your own fully functional molecular dynamics program from.
My previous coding experience was with MATLAB and so I use GNU Octave now for anything I need plots for. My main programming language is Python... I tried C for a bit and while the indexing issues became non-existent with C (it's very intuitive), debugging is an absolute pain. My first C script I spent 20 minutes debugging the compilation... it created my executable, whew I said relieved. I executed the program and ERROR. So, now I had to debug the executable with the most ambiguous errors I've ever seen. I'll take Python and its quirky indexing any day over C's debugging. We're on a tight time budget here, especially with this NSF grant we're trying to push a paper out for, and I'd rather have myself writing programs in a language that puts a comma on the line below where you have an error, estimating the EXACT spot you have the error.
So right now I'm translating a bunch of old C codes to Python for my use in analyzing our trajectories. The simulations are of 214 waters and an ion, Cl Br F Na and Li. We're using the AMOEBA polarizable forcefield in this work, which allows us to calculate the induced dipoles on every atom in the simulation. Unfortunately this calculation cannot be split into chunks, so it cannot be run in parallel... so it's MUCH slower. When I worked with proteins, I was able to simulate ~1 nanosecond per week in serial (3-4000 atoms + thousands of waters), but I'm able to get only 500 picoseconds in a week with AMOEBA on 643 atoms. After obtaining the trajectories we start running some analysis, first we center and reimage the trajectories. So we take the ion and translate it back to the center, then translate all other atoms by the same amount. Reimaging refers to the fact that the simulations we run are in periodic conditions, which means we have a cube with sides of length, L, that our system evolves in. This cube is propagated in every direction (+x, -x, +y, -y, +z, -z) and when a molecule leaves the cube, it simply re-enters the adjacent cube:
| | <--- cube wall
| c| <--- molecule leaving the cube
| | <--- cube wall
|c | <--- location molecule enters the adjacent cube
| || |
|c ||c | ---> now every image of the cube sees the
| || | molecule in this new location
However, sometimes the water doesn't get moved back correctly and a few waters diffuse away a little bit, so we just reimage them back inside the box where they SHOULD be located. You merely translate the diffused waters' X, Y, and Z coordinates by the length of the box, L. It's a very important step, especially if the ion moved its way over to the edge of the box and was coordinating them.
After reimaging we start doing some math... I have a script that calculates a radial distribution function and smooths it, obtains info on the water geometry and plots things like bond angle of the water as a function of the oxygen to ion distance, and we script the setup of 100's of Gaussian09 input files. Gaussian09 is a quantum chemical package. From the radial distribution function, you can calculate the first hydration shell coordination number (integrate the first maxima), but we also use a simple hydrogen bonding scheme as well--if the water has an H closer than ~3 - 3.3 Angstroms (depends on the ion) and the O-closer H-ion bond angle is > 130 degrees, we count it as coordinated. This allows us to input only the waters the ion is coordinating into Gaussian and treat the remaining waters as point charges (no electrons, no basis sets). We run Gaussian to compute the electron density at the MP2 level of theory with the aug-cc-pVDZ basis set. Gaussian spits out a cube file, which contains the electron density at various points on a fine grid. We actually do multiple runs on the same system (full cluster, water only, ion only) to get 3 cube files. After that we use another program that performs Bader analysis on the cube files. This allows us to compute the distribution of charge across all the molecules in the system and being to visualize a concept known as charge transfer. When an ion like Cl is hydrated by a single water, we've calculated that ~20% of the ion's excess charge is transferred to the water.
You can then visualize the crap out of this concept with electron density difference maps which actually show this excess charge having been spread over the coordinated water.
This work is looked forward to by a number of extremely important people in the business and we think we have a finding that will get some people really riled up. So we're actually doing the study AGAIN! to verify our results weren't just a fluke.
After that, I'll be moved back to my old project of calculating thermodynamic quantities of ion hydration from a quantum chemical prospective using symmetry adapted perturbation theory.