Cluster Molecule Tutorial

Introduction
Pitfalls – it’s not your usual molecular code
Examples

Introduction

The purpose of this Tutorial is to introduce the user to the effective use of the code for finite molecular/cluster (dimension=0) problems.

Quest is different, fundamentally different, from conventional molecular quantum chemistry codes. The key difference: treatment of periodic boundary conditions was designed into the code from the outset, so as to efficiently handle bulk and slab periodic systems. To handle all system dimensionalities from molecular (dim=0) to bulk (dim=3) on an equal footing, it is necessary to be careful in problem assembly to be sure that the necessary boundary conditions are applied properly, and it is this concern that distinguishes Quest from quantum chemistry codes. Conventional quantum chemistry codes explicitly assume finite boundary conditions. In Quest, the finite boundary conditions must be specifically selected and imposed, and the user must select this in problem assembly.

Quest is a supercell code. However, Quest differs from most traditional supercell codes (e.g. plane-wave methods) in its treatment of finite molecules. Supercell codes typically suffer from the problem that artificial periodic images in non-periodic directions generate spurious Coulomb fields that corrupt the local potential around a molecule. In essence, supercell codes do molecular calculations (very expensively, usually!) as molecular crystals. Quest, on the other hand, does a very different calculation for a molecule (dim=0) vs. a molecular crystal (dim=3), even with an identical molecular configuration and supercell! The boundary conditions are different. Quest automatically and exactly eliminates spurious interactions up through the dipole term using the Local Moment CounterCharge (LMCC) method described in:
“Local electrostatic moments and periodic boundary conditions”,
P.A. Schultz, Phys. Rev. B 60, 1551 (1999).
But to set up the finite boundary conditions requires that the input be constructed within certain guidelines.

The central consideration in constructing a viable molecular (dimension=0) calculation is the construction of the supercell (actually, better to think of it as an integration box), and the alignment of the molecule within the supercell so that the molecule and its electron density are entirely contained within the integration box. The code cannot do this alignment for the user: the user must position the molecule in the integration box so that the molecule and its electron density do not overstep the box boundary.

Units

The default units for energy are Rydberg (1 Rydberg = 0.5 Hartree = 13.605 eV), and for distances are bohr (1 bohr = 0.52918 Ang = 0.052918 nm). All input coordinates will be in these units, though one can use the scaling functions in the code to aid in the conversions.
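As a convenience when preparing coordinates from another code, the conversions above can be sketched in a few lines of Python. This is only an illustrative helper outside of Quest (the function names are made up for this example); it uses the conversion factors quoted in this tutorial.

```python
# Conversion factors as quoted in this tutorial (not taken from Quest itself).
RYDBERG_IN_EV = 13.605
RYDBERG_IN_HARTREE = 0.5
BOHR_IN_ANGSTROM = 0.529177


def angstrom_to_bohr(length_ang):
    """Convert a length from Angstrom to bohr."""
    return length_ang / BOHR_IN_ANGSTROM


def bohr_to_angstrom(length_bohr):
    """Convert a length from bohr to Angstrom."""
    return length_bohr * BOHR_IN_ANGSTROM


def rydberg_to_ev(energy_ry):
    """Convert an energy from Rydberg to eV."""
    return energy_ry * RYDBERG_IN_EV


def rydberg_to_hartree(energy_ry):
    """Convert an energy from Rydberg to Hartree."""
    return energy_ry * RYDBERG_IN_HARTREE


if __name__ == "__main__":
    # A bond length given in Angstrom, converted to the bohr that the input expects.
    print(f"0.74 Ang = {angstrom_to_bohr(0.74):.4f} bohr")
```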

Input files

The code uses one main input file that the user must construct. Atom (potential+basis) files, which must be listed in the input file, can be obtained from atom libraries. The input file is a text-rich, keyword-driven document that, once constructed, is self-documenting. The input is a sequence of keyword lines, each followed by its data on the following lines. Only the first few (6 or 8) characters of a keyword line are significant; the remainder of the line can be used for additional documentation. The data itself is, in general, free-format. The setup data section is highly structured and requires its input in a very specific order. Other sections take their data in any order. If you miss putting something in the right place in the input, not to worry. The code will very happily process the input file, and stop when it cannot find required input in the right place. It echoes the input file as it reads it, and will politely tell you what it was looking for when it runs into a problem. Interspersed among the required inputs are a variety of optional input sections, which are usually invisible because the code supplies defaults for them. These defaults can be overridden in the input file, and the use of many optional inputs will be illustrated in the tutorial.

Pitfalls – it’s not your usual molecular code

For a molecular calculation, a few items are critical enough, and different enough from standard molecular quantum chemistry codes, to bear repeating:

Distance unit is Bohr, NOT Angstrom
Many if not most molecular quantum chemistry codes, and some periodic codes, use Angstrom units for distances. Quest uses the atomic unit of distance, the Bohr: 1 Bohr = 0.529177 Angstrom
Quest is a supercell code
This means that construction of the integration box (supercell) is central to constructing a viable molecular calculation. Many of the examples below, in addition to discussing the specific feature in the example, will include a discussion of the supercell/molecule construction.
Default is LDA, spin=0
The code does NOT automatically infer that an odd number of electrons means a radical. E.g., it will treat a hydrogen atom as spin-zero, half an electron spin-up and half spin-down, unless you specify otherwise.
Performance issues – do not panic!
For small molecules, the code can be very slow in comparison to conventional molecular quantum chemistry codes. Do not despair! While this is a very slow way to solve a hydrogen molecule (though far better than the truly wretched performance of a plane wave calculation for the same!), the favorable scaling of the method means that the code will quickly begin to outperform standard molecular codes as you increase the number of atoms in your system.

Example: A basic input file – the H atom

Illustrated in this first example is a very basic file for doing the simplest possible finite calculation: an isolated hydrogen atom.

Command options

The input file begins with a sequence of instructions to the code about the sort of calculation it will be doing. In this example, we will “do” a “setup” (the iteration independent integrals), and “do iters” (self-consistency), and “no force”. The “setup data” is the instruction to the code to read the data that describes the system. Please do not try to “do iters” with “no setup”!

Setup phase data

This section begins with a title line(s). The keyword “notes” signals that the next line is a text line that describes the nature of the problem. The keyword “end_notes” signals the end of the notes section. The notes keyword and title lines are optional, but highly encouraged. Since the entire input file will be echoed into the output listing file, this provides a useful output record of the nature of the calculation.

The first required data is the “dimension” statement.
We are doing a molecular problem with finite boundary conditions: the dimension is “0”. If we wanted to do a slab calculation, we would enter a “2”; if we were doing a bulk (or a molecular crystal!) calculation with full periodic boundary conditions, we would enter a “3” here.

The next required input is “primitive lattice”.
This is the integration box (the supercell) for the calculation, and must be large enough to contain the molecule and all its electron density. Conversely, every atom in the molecule must be far enough away from every box boundary that its density does not significantly overlap the boundary. A typical rule of thumb: density around “hard” atoms extends about 9 bohr from an atom, around main group atoms about 10-11 bohr, and around metals about 11-12 bohr for the basis sets that Quest typically uses. Shorter ranges can be used, but if an atom is placed outside the box, or so close to a box boundary that the code estimates that the calculation fidelity is compromised, the code will stop the calculation and report a problem.

Hydrogen is a “hard” atom, and therefore we need an integration box that extends about 9 bohr in every direction from the hydrogen. Here we use a cube 18 bohr on a side. Note: the cubic supercell is only a convenience, not a requirement. For a dimension=0 molecular calculation, we can make the supercell whatever shape we want.
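The box-sizing rule of thumb can be written down as a small calculation. The sketch below is not part of Quest; it assumes an orthorhombic box centered on the origin, and the function name and the per-atom range list are hypothetical inputs that the user chooses from the guidelines above.

```python
def box_edges(positions, ranges):
    """Smallest orthorhombic box (edge lengths in bohr), centered on the
    origin, that keeps every atom at least its density range away from
    each face.

    positions: list of (x, y, z) Cartesian coordinates in bohr
    ranges:    list of density ranges in bohr, one per atom
               (e.g. about 9 for a "hard" atom, per the rule of thumb)
    """
    edges = []
    for axis in range(3):
        # Farthest reach of any atom plus its density along this axis.
        half_width = max(abs(pos[axis]) + rng
                         for pos, rng in zip(positions, ranges))
        edges.append(2.0 * half_width)
    return edges


if __name__ == "__main__":
    # Single hydrogen atom at the origin with a 9 bohr range.
    print(box_edges([(0.0, 0.0, 0.0)], [9.0]))
```

For the hydrogen atom at the origin this reproduces the 18 bohr cube used in this example. For a molecule, the same calculation lengthens the box only along the directions in which the molecule extends.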

The next required input is “grid dimensions”.
The code evaluates many of its integrals on a regular space (fft) grid. This regular grid is defined by taking the primitive lattice vectors, and dividing them into grid intervals. Here we use a 60³ grid, for a grid spacing of 0.300 bohr between points, and 216000 total grid points on the integration grid. The more grid intervals, the more accurately the integrals are evaluated, but, also, the more expensive the calculation. A general rule of thumb: you want spacing between points of 0.30-0.40 bohr for most systems, and 0.20-0.30 bohr for finer accuracy in systems containing “hard” atoms. Hard atoms include late first row atoms like N, O, F, or late 3d atoms.
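The grid arithmetic is easy to check by hand, or with a short script such as the sketch below. It is an illustration only and assumes an orthorhombic box, so that each edge length is divided directly by its number of grid intervals. The helper names are hypothetical, and the script does not account for any preferences the code may have about grid dimensions.

```python
import math


def grid_spacing(edge_lengths, grid_dims):
    """Spacing (bohr) between grid points along each lattice vector."""
    return [edge / n for edge, n in zip(edge_lengths, grid_dims)]


def total_grid_points(grid_dims):
    """Total number of points on the integration grid."""
    return math.prod(grid_dims)


def min_grid_dims(edge_lengths, target_spacing):
    """Fewest intervals per axis whose spacing does not exceed the target."""
    return [math.ceil(edge / target_spacing) for edge in edge_lengths]


if __name__ == "__main__":
    edges = [18.0, 18.0, 18.0]   # box from this example, in bohr
    dims = [60, 60, 60]          # grid dimensions from this example
    print(grid_spacing(edges, dims), total_grid_points(dims))
    # Intervals needed for a coarser 0.35 bohr target spacing.
    print(min_grid_dims(edges, 0.35))
```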

The next required input is the number of “atom types”.
Right now we only have a single atom, a hydrogen atom, and specify that it will read one atom type from a file called “h.atm”.

Next, the input requires the number of atoms (just one), and then a list of each “atom, type, and position”. The arguments for an atom input are the atom index (the order of the atom in the list; we only have atom #1 here), which type (using the order in which the types were listed above, just type #1), and the position of the atom in Cartesian coordinates (in bohr).

Important note: The integration box by default (this default can be altered, as we will see later) is centered around the coordinate origin. I.e., (0,0,0) is the center of the integration box, and that is where we put our H atom to keep it as far as possible from all vacuum boundaries. The code detects if atoms or their density cross the cell boundary, giving a warning, or even stopping the calculation with an error message if the atoms are much too close to a vacuum boundary.
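Before running, it can be useful to check how far each atom sits from the nearest box face. The sketch below does this for an orthorhombic box centered on the origin. It is not the test Quest itself performs; the function name is hypothetical, and how the result is compared against a required range is the user's own choice.

```python
def min_clearance(position, edges):
    """Distance (bohr) from an atom to the nearest face of an orthorhombic
    box centered on the origin. A negative value means the atom lies
    outside the box.

    position: (x, y, z) in bohr
    edges:    (a, b, c) box edge lengths in bohr
    """
    return min(edge / 2.0 - abs(coord)
               for coord, edge in zip(position, edges))


if __name__ == "__main__":
    box = (18.0, 18.0, 18.0)
    # Atom at the box center: as far as possible from every face.
    print(min_clearance((0.0, 0.0, 0.0), box))
    # Atom displaced along z: clearance shrinks by the displacement.
    print(min_clearance((0.0, 0.0, 8.0), box))
```

An atom at the origin of the 18 bohr cube has 9 bohr of clearance in every direction, which is why the H atom is placed there.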

For a molecular (dimension=0) calculation, this is the final required input in the setup section. For a periodic calculation (if, for example, we had selected dimension=3 for the system and done a calculation of a very diffuse molecular crystal of hydrogen atoms instead of an isolated hydrogen atom), we would have been required to input a Brillouin Zone sampling. However, for molecular calculations this is unnecessary – one has only real integrals, or the “gamma-point”, to worry about.

The final, required statement in the setup phase section is the “end setup phase” line.

Run phase data

The next section begins with the line “run phase” and ends with the line “end run phase”. Input data in this section allows you to modify parameters that affect the run (SCF) phase of the calculation and beyond. In this example, we modify the convergence criterion to be 0.0005. This criterion does NOT refer to the convergence of the energy in the SCF cycle, but rather refers to the maximum change in a Hamiltonian matrix element (integrals between basis functions over potentials). The value specified here will converge the total energy to roughly 1.d-6 Ry for this problem, but this will vary between different systems (e.g., a covalent system with a big energy gap will converge differently than a metallic system). Note that the convergence criterion is an optional parameter, and that the code will provide a default if you do not override it with this input.
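To make the meaning of the criterion concrete, the following sketch tests convergence on the largest change in any Hamiltonian matrix element between two successive SCF iterations. It is only an illustration of the idea described above, not Quest's implementation. The function names, the use of nested lists for the matrices, and the strict "less than" comparison are all assumptions.

```python
def max_element_change(h_old, h_new):
    """Largest absolute change in any matrix element between iterations."""
    return max(abs(new - old)
               for row_old, row_new in zip(h_old, h_new)
               for old, new in zip(row_old, row_new))


def is_converged(h_old, h_new, criterion=0.0005):
    """True when no matrix element changed by as much as the criterion."""
    return max_element_change(h_old, h_new) < criterion


if __name__ == "__main__":
    # Two small, made-up Hamiltonian matrices from successive iterations.
    h_prev = [[1.0000, 0.2000], [0.2000, -0.5000]]
    h_next = [[1.0003, 0.2001], [0.2001, -0.5002]]
    print(max_element_change(h_prev, h_next), is_converged(h_prev, h_next))
```

Note that the test is on matrix elements rather than on the total energy, so the resulting energy convergence depends on the system, as described above.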

do setup
do iters
no force
setup data
notes
 H atom, LDA no-spin
end_notes
dimension of system (0=cluster ... 3=bulk)
 0
primitive lattice vectors
  18.00   0.00   0.00
   0.00  18.00   0.00
   0.00   0.00  18.00
grid dimensions
  60  60  60
atom types
   1
atom file
 h.atm
number of atoms in unit cell
   1
atom, type, position vector (cleaved bulk positions)
   1     1    0.0   0.0   0.0
end setup phase data
run phase input data
convergence criterion
 0.00050
end run phase data
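When running a series of calculations (for example, to test how the result depends on box size or grid density), it can be convenient to generate the input file from a script. The sketch below reproduces the H atom input above, with the cube edge, grid dimension, and convergence criterion as parameters. The keyword lines are copied from this example; the function name and its arguments are hypothetical, and the output should be checked against the code's echo of the input.

```python
def h_atom_input(edge=18.0, grid=60, criterion=0.0005):
    """Return the text of a dimension=0 H atom input file with a cubic
    integration box of the given edge (bohr) and a grid**3 grid."""
    lines = [
        "do setup",
        "do iters",
        "no force",
        "setup data",
        "notes",
        " H atom, LDA no-spin",
        "end_notes",
        "dimension of system (0=cluster ... 3=bulk)",
        " 0",
        "primitive lattice vectors",
        f"  {edge:6.2f}   0.00   0.00",
        f"   0.00  {edge:6.2f}   0.00",
        f"   0.00   0.00  {edge:6.2f}",
        "grid dimensions",
        f"  {grid}  {grid}  {grid}",
        "atom types",
        "   1",
        "atom file",
        " h.atm",
        "number of atoms in unit cell",
        "   1",
        "atom, type, position vector (cleaved bulk positions)",
        "   1     1    0.0   0.0   0.0",
        "end setup phase data",
        "run phase input data",
        "convergence criterion",
        f" {criterion:.5f}",
        "end run phase data",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the default input, matching the example above.
    print(h_atom_input(), end="")
```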

Cluster Molecule Tutorial

Introduction

Units

Input files

Pitfalls – it’s not your usual molecular code

Example: A basic input file – the H atom

Picking a density functional (and spin) – H atom again

Supercells – H2 molecule

Multiple atoms and types – C6H6

Non-orthogonal supercells – C6H6

Scaling coordinates – C4H4

Aliasing atom types – O(CH3)2

Shifting the origin – O(CH3)2

SCF parameters – C4H4

Geometry: structural energy minimization – CH4

Geometry relaxation parameters – O(CH3)2

Symmetry: Point groups

Symmetry: O(CH3)2 — C2v

Symmetry: C6H6 — D6h

Symmetry: CH4 — Td

Symmetry: CO — linear molecules

Dipoles and supercell approximation – OH radical

Charged molecular ions – OH[-]

Labeling atoms

External electric field
