Some M.Sc. (GU) Class-Notes on Computers in Chemistry

Prepared by Rituraj Kalita, Cotton College, Guwahati (India) in 2006
 

View an allied presentation on Chemistry in Computers: from Educational Tools
to Computational Chemistry

View an allied resource material stressing on molecular modelling and computation
in actual practice
(delivered at the Computers in Chemistry workshop, June 2006, CCG)

Download (free of cost) the Computers in Chemistry Experiments Software Set (needs
Visual FoxPro 6 Runtime, and includes justification of Job's method, rotational energy-level
probabilities, molecular (Maxwell's) speed distribution, and inter-nuclear P.E. of H2+ ion -
download chem_expt.zip to C:\ and then using WinZip extract it to C:\chem_expt easily).
Elementary Programming Concepts (with Fortran and FoxPro examples)
More Fortran Programs for the G.U. 3rd Semester Computers in Chemistry Paper
Access the Related G.U. M.Sc. 4th Semester Physical-Chemistry Experiments 2005 Manual
Access the List of G.U. M.Sc. 3rd Semester Physical-Chemistry Experiments 2006-07
 

Modelling Molecules and Reactions in Computers

Computational chemistry attempts to build models or simulations of chemical species and chemical processes in the computer. Like any other branch of science, it uses such models to interpret naturally occurring chemical phenomena, and tries to predict as yet undiscovered chemical phenomena as well.

Within this branch, let us discuss the ab initio quantum mechanical modelling of small and medium-sized molecular systems in some detail, considering its current popularity. In this field, the mathematical (numerical) model of the molecule is always associated with a visual model to be viewed by the user: given the numerical model, the corresponding visual model may be obtained using graphics software (e.g., PCModel, ArgusLab, Ortep-3 etc.), while every visual model, however drawn or generated, can be saved as (transformed into) the corresponding numerical model. The visual model helps in easy understanding of the molecular structure and stereochemistry; it attempts to represent the actual molecule as closely as possible in its shape and stereochemistry. There are provisions to view the models in different styles such as stick, ball and stick, electron dot surface etc. Besides, the visual model may generally be viewed from different angles (orientations) and at different magnifications: the structural figure drawn on paper is thus a rather poor visual model, compared to those made with such chemistry-specific packages.

As a starting point, in this field we begin with molecules lying in vacuum (gas phase); it is possible to introduce corrections to this gas-phase model for the surrounding solvent medium etc. From quantum mechanics, we know that the positions of the electrons in the molecule cannot be specified: we need to specify only the number of electrons in the molecule, while the electron probability density will be obtained by solving the electronic Schrödinger equation (ESE). On the other hand, the positions of all the nuclei must be specified (in addition to the types and numbers of the nuclei), because we know from the Born-Oppenheimer approximation that the nuclear framework must be specified in order to construct the ESE. Besides, with the same set of nuclei and the same number of electrons different isomers may arise, if the relative nuclear positions are allowed to vary. Thus, the molecular model must include the nuclear coordinates, in addition to the types and numbers of nuclei and the total number of electrons. Using such a molecular model, the molecular electronic wavefunction and the electron probability density can be directly found (it’s just a matter of time) by solving the ESE, thereby arriving at a complete description of the molecule.

However, the total number of electrons may not be explicitly mentioned in the molecular model (say, in Mopac), in which case it is understood that the molecule is electro-neutral, i.e., there are just enough electrons (say 26 in ethanol) to keep the molecule uncharged. In other cases (say, for Gaussian) the charge of the molecular system needs to be specified, from which the number of electrons follows. Coming to the question of nuclear-framework specification, the nuclear framework part of the molecular model is specified mainly in two different formats. In one format, the type (say the atomic symbol, such as H or N) of each nucleus along with its Cartesian (x, y, z) coordinates (separated by spaces) is specified, one nucleus at a time. Generally, the molecular model is in the form of a text file, with one line dedicated to the type & position of each nucleus. The unit of the coordinates is, practically universally, the Angstrom (not the atomic unit). Thus, the nuclear framework specification for a water molecule may be as follows:

O          0.000000           0.127174          0.000000
H          0.758132          -0.508697          0.000000
H         -0.758132          -0.508696          0.000000
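A Cartesian specification of this kind is easy to read by program. The following is only a minimal Python sketch, not the input reader of any of the packages named here: the function names (parse_cartesian, electron_count, bond_angle) and the small table of atomic numbers are assumptions made for illustration. It reads the three lines above, counts the electrons of the neutral molecule from the nuclear charges, and recovers the O-H distances and the H-O-H angle.

```python
import math

# Atomic numbers for the few elements used in this sketch (extend as needed).
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}

# The water specification quoted in the notes (coordinates in Angstrom).
WATER = """\
O          0.000000           0.127174          0.000000
H          0.758132          -0.508697          0.000000
H         -0.758132          -0.508696          0.000000
"""

def parse_cartesian(text):
    """Return a list of (symbol, (x, y, z)) tuples, one per nucleus."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        atoms.append((fields[0], tuple(float(v) for v in fields[1:])))
    return atoms

def electron_count(atoms, charge=0):
    """Electrons = sum of nuclear charges minus the overall molecular charge."""
    return sum(ATOMIC_NUMBER[symbol] for symbol, _ in atoms) - charge

def bond_angle(p, vertex, q):
    """Angle p-vertex-q in degrees."""
    v1 = [a - b for a, b in zip(p, vertex)]
    v2 = [a - b for a, b in zip(q, vertex)]
    dot = sum(a * b for a, b in zip(v1, v2))
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

atoms = parse_cartesian(WATER)
(_, o), (_, h1), (_, h2) = atoms
print("electrons (neutral molecule):", electron_count(atoms))
print("O-H distances / Angstrom:", math.dist(o, h1), math.dist(o, h2))
print("H-O-H angle / degrees:", bond_angle(h1, o, h2))
```

Passing charge=1 or charge=-1 to electron_count mimics the case where the charge, rather than the electron count, is what the input file states.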

In the other format, called the z-matrix specification, the position of the first nucleus is kept unspecified. The position of the 2nd is expressed in terms of its distance from the 1st. The position of the 3rd is expressed in terms of its distance from the 2nd or the 1st, and in terms of the bond angle formed by the 3rd, the 2nd/ 1st, and the 1st/ 2nd nucleus. Each of the remaining nuclear positions requires specification of one bond distance, one bond angle and one dihedral angle (i.e., the angle between two planes, say between the planes containing nuclei 1-2-3 and nuclei 2-3-4). The justification of such an (initially incomplete-looking) z-matrix specification is simple: translating the whole nuclear framework to another position, or rotating it, does not lead to a different molecule! It is rather the Cartesian coordinate specification format that specifies too many coordinates (over-specified by six degrees of freedom), but there also it is understood that translation or rotation of the whole framework creates no different molecule. This second format of nuclear framework specification also has each line describing one nucleus, with the units of distance and angle being Angstrom and degrees by convention, as exemplified for H2O:

O
H         1          0.989493
H         1          0.989492           2          100.024728
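To see that such a z-matrix carries the same information as the Cartesian form, it can be converted back into coordinates. The Python sketch below is a minimal illustration and not the converter used by any particular package: the function names are hypothetical, the choice of putting the first nucleus at the origin and the second on the x-axis is an arbitrary convention, and only the first three nuclei are handled (a fourth nucleus would also need the dihedral angle). It also checks that the z-matrix holds exactly 3N - 6 numbers, which is the Cartesian count 3N less the six over-specified degrees of freedom.

```python
import math

# The water z-matrix quoted in the notes (Angstrom and degrees).
ZMATRIX = """\
O
H         1          0.989493
H         1          0.989492           2          100.024728
"""

def parse_zmatrix(text):
    """Return rows of (symbol, [(reference_nucleus, value), ...]), 1-based refs."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        refs = [(int(fields[i]), float(fields[i + 1]))
                for i in range(1, len(fields), 2)]
        rows.append((fields[0], refs))
    return rows

def zmatrix_to_cartesian(rows):
    """Place up to three nuclei in the xy-plane (assumed convention)."""
    if len(rows) > 3:
        raise ValueError("this sketch places at most three nuclei")
    coords = []
    for symbol, refs in rows:
        if len(refs) == 0:            # 1st nucleus: put at the origin
            pos = (0.0, 0.0, 0.0)
        elif len(refs) == 1:          # 2nd nucleus: along the x-axis
            pos = (refs[0][1], 0.0, 0.0)
        else:                         # 3rd nucleus: distance to a, angle with b
            (a, r), (b, theta_deg) = refs
            ax, ay, az = coords[a - 1][1]
            bx = coords[b - 1][1][0]
            ux = 1.0 if bx > ax else -1.0   # direction from a towards b
            theta = math.radians(theta_deg)
            pos = (ax + r * math.cos(theta) * ux, ay + r * math.sin(theta), az)
        coords.append((symbol, pos))
    return coords

rows = parse_zmatrix(ZMATRIX)
cartesian = zmatrix_to_cartesian(rows)
n_internal = sum(len(refs) for _, refs in rows)
print("internal coordinates:", n_internal, "expected 3N - 6 =", 3 * len(rows) - 6)
for symbol, (x, y, z) in cartesian:
    print(f"{symbol} {x:12.6f} {y:12.6f} {z:12.6f}")
```

The coordinates printed differ from the Cartesian listing for water given earlier, because the framework is placed and oriented differently, but all inter-nuclear distances are the same, which is the point of the argument above.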

In practice, one naturally starts by preparing the molecular models. Using model-drawing software packages such as PCModel or ArgusLab etc., such models may either be drawn from scratch OR be modified from pre-existing models such as those of aliphatic/ aromatic rings, amino-acids, mono-saccharides, nucleic-acid bases etc. Such drawing or modification generally involves quite user-friendly steps, as may be exemplified in the case of PCModel. In PCModel, there are onscreen buttons such as Draw, Select, Del, Update, Show/Hide-H etc. To do an operation such as adding a bond, one needs to click on Draw, then on the starting nucleus and then on the new nuclear position. (To delete an existing nucleus, one needs to click on Del, then on the nucleus.) A newly drawn nucleus is initially assumed to be a C-nucleus, after which it may be altered by invoking a periodic-table entry. The H-atoms needn't be drawn; they're just understood and may be shown/ hidden at will. Gross structural mistake(s) made in drawing or modification (e.g., creation of absurd bond-lengths etc.) may then be corrected by invoking an inbuilt, crude energy-minimisation procedure. After drawing or modification of the visual model in this way, the mathematical model may be immediately obtained in different computational chemistry (CC) file formats such as Chem-3D (one of the former, Cartesian class described above) or Mopac (one belonging to the latter, z-matrix class) etc.

The mathematical models thus formed are then fed into computational software packages such as Mopac, Gaussian or GAMESS etc.; the desired level of computational theory (say, Hückel/ Hartree-Fock/ Møller-Plesset 2nd-order etc.) is specified or left at its default, and the lengthy computational process is allowed to run. Through such extensive computations, modern computational packages such as Gaussian or GAMESS can predict many properties and reactions of molecules, such as: molecular energies & structures, transition-state energies & structures, vibrational frequencies, reaction energies, potential energy surfaces (PES) & reaction pathways etc. It may be noted here that such a computational package may perform computations in two distinctly different ways: (i) the nuclear framework may be considered as exactly fixed, so that no modification of it is attempted (called a single-point calculation) OR (ii) the nuclear framework is considered to be modifiable, so that the optimum molecular structure is sought through its modification (called a structure-optimisation calculation). It is the second way through which we can look for theoretical prediction of chemical reactions, because we know that it is a change in the relative nuclear coordinates that implies a chemical reaction (e.g., H2 (g) + I2 (g) = 2HI (g) implies that the H–H & I–I distances have increased much and the H–I distances have decreased much).
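The difference between the two kinds of calculation can be illustrated with a toy example. The Python sketch below is emphatically not an ab initio calculation: in place of solving the ESE it uses a Morse-type model energy for a diatomic molecule, with hypothetical parameters (D_E, ALPHA, R_E) chosen only for illustration, and a plain steepest-descent search stands in for the far more refined optimisers of real packages. A single-point calculation evaluates the energy once at the fixed geometry supplied; a structure optimisation keeps changing the geometry until the energy gradient nearly vanishes.

```python
import math

# Hypothetical Morse parameters for a diatomic, for illustration only.
D_E = 4.75     # well depth / eV
ALPHA = 1.94   # width parameter / Angstrom^-1
R_E = 0.741    # equilibrium distance / Angstrom

def energy(r):
    """Model energy (eV) at inter-nuclear distance r (Angstrom)."""
    return D_E * (1.0 - math.exp(-ALPHA * (r - R_E))) ** 2 - D_E

def gradient(f, r, h=1e-5):
    """Central-difference estimate of df/dr."""
    return (f(r + h) - f(r - h)) / (2.0 * h)

def single_point(r):
    """Single-point calculation: the geometry is taken as fixed."""
    return energy(r)

def optimise(r_start, step=0.01, tol=1e-6, max_iter=1000):
    """Structure optimisation by steepest descent on the single coordinate r."""
    r = r_start
    for _ in range(max_iter):
        g = gradient(energy, r)
        if abs(g) < tol:      # gradient small enough: stationary point reached
            break
        r -= step * g         # move the nuclei downhill in energy
    return r, energy(r)

r_guess = 1.0  # a deliberately stretched starting geometry
print("single point at r =", r_guess, "->", single_point(r_guess), "eV")
r_opt, e_opt = optimise(r_guess)
print("optimised r =", r_opt, "Angstrom, energy =", e_opt, "eV")
```

In a real package the energy function would be the result of an electronic-structure calculation at each geometry, and r would be replaced by all 3N - 6 internal coordinates of the nuclear framework.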

How exactly is the reaction between two or three molecules modelled? To proceed with such modelling, the model of a relevant supermolecule, which includes all the reacting molecules, has first to be constructed by joining the individual molecular models. For example, in the case of the H2-I2 reaction, the supermolecule must include the combination of an H2 molecule and an I2 molecule.
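Joining two molecular models amounts to placing their nuclei in one common coordinate frame. The following Python sketch shows one simple way of doing so, under stated assumptions: the bond lengths used for H2 and I2 are approximate textbook values, the 3.0 Angstrom separation between the two molecules is an arbitrary starting choice, and the function names are hypothetical.

```python
import math

# Individual molecular models: (symbol, (x, y, z)) in Angstrom.
# Bond lengths are approximate values, used here only for illustration.
H2 = [("H", (0.0, 0.0, 0.0)), ("H", (0.74, 0.0, 0.0))]
I2 = [("I", (0.0, 0.0, 0.0)), ("I", (2.67, 0.0, 0.0))]

def translate(atoms, shift):
    """Return a copy of the model with every nucleus moved by the same shift."""
    return [(symbol, tuple(c + d for c, d in zip(xyz, shift)))
            for symbol, xyz in atoms]

def build_supermolecule(fragment_a, fragment_b, shift_b):
    """Join two models, displacing the second so that the nuclei do not overlap."""
    return list(fragment_a) + translate(fragment_b, shift_b)

def to_text(atoms):
    """Write the model in the one-line-per-nucleus Cartesian format."""
    return "\n".join(f"{s:<2} {x:12.6f} {y:12.6f} {z:12.6f}"
                     for s, (x, y, z) in atoms)

# Place I2 parallel to H2, 3.0 Angstrom away along y (an arbitrary choice).
supermolecule = build_supermolecule(H2, I2, (0.0, 3.0, 0.0))
print(to_text(supermolecule))
```

Translating a fragment as a whole leaves its own bond lengths unchanged, so each reactant enters the supermolecule with its original structure; only the inter-molecular distances are new.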
 

Chemical Drawing, Modelling & Computation Software

Mopac: Mopac is a general-purpose semi-empirical molecular orbital program for the study of chemical structures and reactions. Developed by the research group of Professor Michael J. S. Dewar, it uses semi-empirical quantum mechanical methods which are based on Hartree-Fock theory, with some parameterized functions and empirically determined parameters replacing some sections of the complete H-F treatment. The approximations in semi-empirical theory result in more rapid single-point energy calculations, which allow much larger structures to be studied. Mopac can use the semi-empirical Hamiltonians MNDO, MINDO/3, AM1 and PM3 to obtain molecular orbitals, heats of formation and their derivatives with respect to molecular geometry. Using these results Mopac can calculate vibrational spectra, thermodynamic quantities, dipole moments, molecular orbitals and electron densities. Since Mopac is a quantum chemistry program, it recognizes electron reorganization and can thus be used to study reaction mechanisms. This is in contrast to molecular mechanics programs, which can only deal with a particular valence representation of a molecule.


ORTEP: The Oak Ridge Thermal Ellipsoid Plot (ORTEP) program is a computer program, written in Fortran, for drawing crystal structure illustrations. Ball-and-stick type illustrations of a quality suitable for publication are produced with either spheres or thermal-motion probability ellipsoids, derived from anisotropic temperature factor parameters, on the atomic sites. The program also produces stereoscopic pairs of illustrations which aid in the visualization of complex arrangements of atoms and their correlated thermal motion patterns. ORTEP-I was distributed as early as 1965, and the 1996 version, ORTEP-III, retains the old ORTEP idea, which is basically to provide a crystallographic language for writing the input data file describing the crystal structure illustration. ORTEP is noted for its versatility and high-quality drawings.


Fig: An ORTEP display diagram for CCl4

PLUTON: PLUTON is a program for interactive molecular graphics. It is designed for use in a crystallographic context. However, it can be used for the display and analysis of molecules in an orthogonal (Angstrom) coordinate system as well. Method of use: the pertinent data are read from a file (normally prepared either with an editor or by another program). The data may be in several formats, including those of the supplied test-data files. Formats that should work are: CIF, shelxl-res, PDB, Cart-3d, Fdat. A PLUTON session consists of a cyclic process of content/style/viewpoint type of modification instructions and a PLOT instruction to inspect the result of the current settings.

ChemDraw: Over the last 20 years, ChemDraw has remained the worldwide industry-standard chemical drawing and analysis package. Naturally, the new ChemDraw Ultra 9.0 is far more than just a drawing tool. The software provides a host of powerful features invaluable to modern chemists, and is designed to increase productivity in just about any scientific environment. This sophisticated package nevertheless continues to deserve its long-established reputation for accessibility and reliability. The program offers easy and quick drawing of chemical structures, suitable both for everyday use and for publication or presentation. In addition to the chemically meaningful drawing capabilities, a range of tools for building non-chemical drawings is provided, such as curves and basic geometric shapes. These all offer good control and flexibility, so that any drawing can be easily accomplished with high quality. A very useful TLC drawing tool is also present, allowing easy documentation and sharing of the ephemeral but information-rich TLC plate. ChemDraw imports and uses a large number of file types in addition to its own native format: ISIS/Draw sketches, MOL files, JPEG images etc. can be used in a drawing, and spectra produced by almost any instrument can be opened directly.

ISIS/Draw: ISIS/Draw is a chemical drawing program somewhat similar to ChemDraw, though it works in a slightly different way. When ISIS/Draw is started, a window appears with a list of tools down the left hand side of the window, and a list of templates along the top. There are two buttons towards the top left of the window: Molecule and Sketch. To draw chemical structures, make sure that Molecule is selected. If you choose Sketch, you get the opportunity to draw other shapes which may be useful to annotate and beautify diagrams.
For example, consider the benzene drawing tool. Click on this tool, then click anywhere in the drawing window. A benzene ring will be drawn, at a standard size, from a single mouse click. Click once on any atom of the benzene ring, and a new ring will appear, connected by a single bond. Click on any bond of the benzene ring, and a new ring will appear, fused to that bond. The bond drawing tool works in much the same way as in ChemDraw. Click on its icon to select this tool, then click anywhere on the screen. A bond will be drawn by this single click. If you click on an atom, you will extend the chain. Click on an atom in a benzene ring to make toluene. Click on a double bond in a benzene ring to reduce it to cyclohexadiene.
 

Accessing Chemical Resources from the Internet

The Internet offers an unbelievably huge and largely systematic collection of chemical information and other chemical resources, a large and significant part of which is free for anyone. However, accessing it properly requires some knowledge on the part of the user. We may classify the web-available chemical resources into the following types: (i) background conceptual knowledge, generally of the undergraduate and master's degree level (ii) papers in journals & seminar-proceedings describing original research work (iii) web-databases specifying and describing specific molecules & macromolecules (iv) software-packages such as ORTEP, Protein Explorer, ArgusLab or GAMESS etc. that help in visualization, drawing, computation or even virtual experimentation in chemistry.

The web-storehouse of background knowledge in chemistry (similar to chemistry textbooks) is mostly free and is already quite huge, yet it is growing fast. This is happening thanks to contributions from individual authors and via the philanthropic initiatives of web-encyclopedias (e.g., encyclopedia.com, wikipedia.org, britannica.com), science-organizations (e.g., scienceworld.wolfram.com) & university departments (e.g., chem.ox.ac.uk, science.uwaterloo.ca, wou.edu). To search for knowledge of a chemistry topic such as ionic solids, chlorofluorocarbons or liquid crystals, one may visit one of these websites. The opening screen of the site provides a space to type in the required topic to search for. After entering a topic such as any of the above three, links to several articles are shown, each article ranging from half a page to a few pages, the first of which is generally the most relevant. Special mention must be made of the book-storing initiative (books.google.com) by Google, which is making available even pages from printed, priced books to anyone interested, anywhere in the world, free of cost. An alternative way of locating such knowledge is to use a free search engine, the best of which include google.com, altavista.com and rediff.com, and similarly search for the topic therein. This would, however, lead to links to information from almost everywhere on the Internet, many of which may not be actually relevant. After entering the required topic in the form of liquid+crystals or organic+reaction+mechanism etc. (the plus sign between two words ensures the existence of both words in the found sites), one comes across several, sometimes too many, links to sites of organizations, universities or individuals that deal with the required topics, in elementary or in advanced ways.

Coming to scholarly chemistry journals and proceedings, one may locate sites of different chemistry journals by searching for chemistry+journal in any good search engine. Of highest relevance are the sites www.rsc.org/Publishing/Journals and pubs.acs.org/journals, leading to the multitude of very good chemistry journals published respectively by the Royal Society of Chemistry in Europe and the American Chemical Society in the USA. One may even search for keywords in the title/abstract (e.g. potential+energy+dimer) or for the author (e.g. Timothy+Young) of papers in these journals, by entering the words in the space provided for advanced searching within title, abstract or author names. To search within most of the important journals in the world in one go, the best way is, however, to use a science-specific, academic-specific or chemistry-specific search engine (in contrast to general search engines, which do not locate journals and proceedings) such as scirus.com, scholar.google.com or chemweb.com. Everywhere, the advanced search options are generally more helpful, and one might need (in some cases) to become a member, mostly free of cost, to search in some of these search engines. A characteristic feature of these sites (e.g., scirus.com for science-research) is the free availability of even the abstracts, in addition to the titles, author-names, journal-details etc. After getting the name of an author of two or more papers on a subject, a good idea is to search for other papers written by that author, by entering his/her name as a keyword. [This helps in locating papers that actually deal with the subject mentioned in the keywords, but do not explicitly contain the keywords in the title/abstract of the paper.]

There are web-databases, mostly free, that systematically store the (common & chemical) names, structures, properties and reactions etc. of chemical molecules (i.e., substances). Foremost among them are CrossFire Beilstein (www.beilstein.com), PubChem (pubchem.ncbi.nlm.nih.gov) at NCBI (USA), and the NIST (USA) Chemistry WebBook (webbook.nist.gov/chemistry). For macromolecules of biological origin, including proteins, nucleic acids, carbohydrates etc., there are the Protein Data Banks (PDB-s) which store and freely disseminate the 3-D structural and other relevant data of such macromolecules. The original three such data-banks in the USA (called RCSB PDB, rcsb.org/pdb), Europe (called MSD EBI, ebi.ac.uk/msd) and Japan (called PDBj, pdbj.org) are now collaborating to form a single, combined store of bio-macromolecular data, known as the Worldwide Protein Data Bank (at www.wwpdb.org). Both of these classes (chemical and bio-chemical) of web-databases may be searched by using common names, chemical names or codes of the molecule/ macromolecule.

The aforesaid software-packages can be obtained free of cost from the individual or institutional websites of the developer scientists. Molecular or macromolecular structures obtained from the above web-databases will hardly be meaningful if a visualization package such as ORTEP-3 (chem.gla.ac.uk/~louis/software/ortep3) or Protein Explorer (umass.edu/microbio/chime/explorer) is lacking. Any chemistry educator or student should also be armed with a 3-D molecular model-making package, say ArgusLab (planaria-software.com), a molecule & reaction drawing package, say ISIS/Draw (mdli.com), and a computational package, say PC-GAMESS (classic.chem.msu.su/gran/gamess). All these resources can simply be downloaded from the respective websites free of cost. Here mention should also be made of the virtual chemical experimentation packages available from the web, including even a virtual NMR spectrometer!