Warning: This page uses Cascading Style Sheets. We recommend you upgrade to a current version of your favourite browser.

CrystalPredictor 2.4.3cposs logo
Project background DMACRYS licencing Publications Computational notes

Louise's notes on CrystalPredictor 2.4.3

A note on flexible searches

If your search is flexible, you need to have DIFFERENT ranges for the search and for the grid of LAMS. If, for example, you want a particular degree of freedom to vary between 0 and 120 degrees (as that is the range that you have determined in a torsion angle scan to be applicable for your purpose), then that is the range for the SEARCH, and is what you input in input.in. However, each point on your LAM grid will be valid across a range of angles. Suppose you elect each point to be valid for a range of 30 degrees, the first point in this particular independent degree of freedom will be valid from 0 to 30 degrees, and hence is centred on 15 degrees. So in your molecule.in file, you would have the range for the grid of LAMS running from 15 to 105 degrees, at 30 degree intervals.

Versions available on cposs

We have a few versions available, but you must select the correct version for the work you are doing.

  • Version 2.4.3 works as described below.
  • Version 2.4.3.2_torgrp works for torsion groups. Mostly this is the same as described below, but additional features are described in Torsion Groups.
  • Version 2.4.3.2_torgrpcon works for torsion groups with constraints on additional angles. Get your head around constraints and torsion groups first, and then you should see why this is more complicated.

You can run any version with one flexible and one rigid molecule, but you currently cannot run one molecule with torsion groups and one flexible but without torsion groups. If you find yourself in this situation, see if you can add a torsion group to the second molecule. It will be more computationally expensive, but will enable you to run!

Input files

Any set of input files set up for CrystalPredictor 2.something should work. However, additional lines and sections are included in some later ones.

input.in

Example input.in files can be copied from /home/cposs/CrystalPredictor/example_inputs

***************************************************************************
* CrystalPredictor study on Blind Test molecule XXXI *
***************************************************************************
* Louise Price *
* 19 November 2020 *
***************************************************************************

This is free text, and you can put anything you like.

MOLECULAR TYPES 1
----------------------------------------------------------------------------------------------------------

This tells the code how many types of molecule there are.

TYPE XXXI
1 !Number of this type in asymmetric unit cell
32 !Number of atoms in the molecule
btxxxi_lam_intra !LAM database file
3 !N flexible degrees of freedom (<= those in mol file)
dih7 60 320 !N lines in the format: degree name - lower bound - upper bound
dih8 30 330
dih9 55 135

For each molecule, you need to include this section. You need to give each molecule a name (XXXI in the example), and say how many molecules of this type there are in the asymmetric unit (this is where you would specify a Z'=2 search). Then is the number of atoms in the molecule, and the file with the molecule's information (this used to be called molecule.in, but now we are free to call it whatever we want). Then the number of degrees of freedom and their ranges IN THE SEARCH are given. For a rigid search, put zero here and leave out the lines with the independent degrees of freedom.

For a multicomponent search, you would have two or more sections like this. You can have a line of dashes separating the sections.

----------------------------------------------------------------------------------------------------------
SPACE_GROUPS

P1 P-1 P21 P21/C P21212 P212121 PNA21 PCA21 PBCA PBCN
C2/C CC C2 PC CM P21/M C2/M P2/C C2221 PMN21
CMC21 ABA2 FDD2 IBA2 PNNA PCCN PBCM PNNM PMMN PNMA
CMCM CMCA FDDD IBAM P41 P43 I-4 P4/N P42/N I4/M
I41/A P41212 P43212 P-421C I-42D P31 P32 R3 P-3 R-3
P3121 P3221 R3C R-3C P61 P63 P63/M P213 PA-3 P2221
PBA2

END_SPACE_GROUPS
----------------------------------------------------------------------------------------------------------

This section defines the spacegroups you want to search in. There are some quirks with some versions of the code, where having both P21/C and P21/N makes things go wrong, and if spacegroup number 14 (P21/C) is after position 14 in the list things go wrong.

The example to the left will generate crystal structures in the same proportions as they appear in the CSD, i.e. about 25% in P21/c. If you add UNIFORM after the SPACE_GROUPS keyword (with a space, so "SPACE_GROUPS UNIFORM") then the spacegroups you have selected will be sampled uniformly. Sometimes, we used to search the most common 12 or so spacegroups with the CSD distribution, and then all the less common spacegroups with a uniform distrubution, but sampling fewer crystal structures.

SIMULATION
15.00d0 !rcut in A
12.0d0 !rcut_quin_spl in A
12.0d0 !rcut_elec in A
1.d-10 !Ewald_accur_minim relative accuracy in Ewald summation in minimisation
1.0E-5 ! pressure in Pa
no !stand_hyd_dist =yes standardise hydrogen distances, =no leave unaltered (neutron)
no !short_hyd_dis =yes foreshorten H positions by 0.1A (after standardisation)

This section gives details to CrystalPredictor's minimizer. You shouldn't need to change anything, unless you want to search with pressure or are using the Williams potential.

SEARCH
50.d0 50.d0 50.d0 !ANG_MIN_SRCH in ^o, minimum cell angle
130.d0 130.d0 130.d0 !ANG_MAX_SRCH in ^o, maximum cell angle
3.d0 3.d0 3.d0 !LEN_MIN_SRCH in A, minimum cell length
40.d0 40.d0 40.d0 !LEN_MAX_SRCH in A, maximum cell length
300.d0 !DENSITY_SRCH in kg/m3, minimum cell density
20.0d0 !U_INTER_SRCH in kJ/mol, maximum intermolecular energy
10.d0 !U_INTRA_SRCH in kJ/mol, maximum intramolecular energy
0.15d0 !W_SRCH minimum deformation parameter
2000000 !NO_MINS_MAX maximum number of minimisations
30.d0 !polym_region in kJ/mol region of polymorphism, structures outside rejected

This section gives instructions to the structure generator and makes the decision on whether to keep structures at the end.

The first four rows give the minimum and maximum cell lengths and angles. You could put different values into each angle, but this only determines the random unit cell generated by the Sobol sequence. After the energy minimization, the unit cell is allowed to be outside these ranges. The density is similarly just for the initial unit cell generation, and a greater valiety of densities is achieved after minimization.

Some work can be done on optimizing the cell parameters.

The maximum intermolecular and intramolecular energy, and the polymorphic region determine whether the crystal structure afterminimization should be written to crystals_stable.out, or only to crystals.out.

The maximum number of minimizations is just that - CrystalPredictor will minimize the energy of that many crystal structures, after rejecting any generated structures on density grounds (see above). Set this to 100 or 1000 for initial tests, and then 1,000,000 or whatever for your actual searches.

SMOOTH 0.4

This needs to be included if you want to smooth the steps between LAM points. By default, it will use all points in the LAM database and weight them according to distance, where 0.4 is the weighting factor. Isaac has recommended 0.4.

DOING_CUTOFF 100

This may be added so that the smoothing function only averages over the 100 closest points. Useful if you have a very big LAM file, as this averaging can take a very long time.

NON_UNIFORM

This used to be included if you had a non-uniform LAM. It has been replaced by SMOOTH.

Uintra_cap 4 -416.50670389

Uintra_cap tells the program about how to treat the intramolecular energy. See the instruction manual.

  • Uintra_cap 0 - Uintra is zero at the lowest energy LAM point, and can go negative.
  • Uintra_cap 1 - Uintra is zero at the lowest energy LAM point, and anything below this is also zero.
  • Uintra_cap 2 xxx - xxx is the gas phase energy for this level of theory, and anything down to this is allowed, but anything below is set to zero.

TRACK_MIN

This may be added to input.in when you are running Minimize. It outputs extra information and extra files for each step of the minimization.

potential.in

Example potential.in files can be copied from /home/cposs/CrystalPredictor/example_inputs If you need to add additional terms, you don't have to add cross-terms, as the code generates these itself.

molecule.in

You need an input file for each molecule TYPE, and this must have the name specified in input.in.

Intramolecular energy/gradients/hessian/charges for:

BTXXXI

The heading. This title doesn't even need to match that in input.in.

Generated at level of theory:
      
PBE1PBE/6-31G(d,p)

This is where you specify the level of theory for the calculation of the LAM grid.

Across 3 dimensional grid:

start  interval finish
dih7  80.00    40.00   300.00
dih8  50.00    40.00   310.00
dih9  75.00    40.00   115.00

This is the grid used to calculate the LAMs. It is NOT the same as the grid used in the search. See the note at the top of this page for an explanation.

Constraints: 5
dih16      115.4524
dih37      184.349
dih40     -177.6456
dih43      171.5106
dih46       33.5054

You can add any constraints here, such as methyl rotations. (Nb. methyl rotations take a long time to optimize in GAUSSIAN, so it is much more computationally efficient to constrain them)

From starting Z-matrix:
C1     1   0.312699
C1     2  -0.245655     1 cc2
C1     3   0.623878     2 cc3         1 ccc3
O1     4  -0.217881     3 oc4         2 occ4          1 dih4
N1     5  -0.275660     1 nc5         2 ncc5          3 dih5
S1     6   0.674069     1 sc6         2 scc6          3 dih6
C1     7   0.175987     6 cs7         1 csc7          2 dih7
C1     8   0.047070     7 cc8         6 ccs8          1 dih8
C1     9   0.229882     8 cc9         7 ccc9          6 dih9
C1    10  -0.196865     9 cc10        8 ccc10         7 dih10

Then follows the Z-matrix.

The first column is the potential type, and must match one of the types in the potential.in file.

The second column is the symmetry. This is used later in Analyse to determine which crystal structures are the same. If you have a molecule with little to no flexibility, just give every atom a unique number. If you are using benzene, for example, then each of the carbon atoms would have the SAME number in this column, and each of the hydrogen atoms would have the SAME number.

The third column is the charge on the atom in the gas phase optimized conformation. Each LAM point also includes the point charges, and it is not clear where the actual charges come from. In the past, there was the option of using fixed charges defined in the molecule.in file or variable charges from the LAM points (in version 1.6, this was strictly the intra_1.pot file). Fixed charges was preferred, because the interpolation from the input grid was not correct and large discontinuities were seen. Many recent example files have been provided by IC with all charges in the molecule.in set to 0.0. However, experience in the UCL group has shown that doing this gives very strange results. You should calculate the atomic point charges for the gas phase optimized molecular conformation with the same computational model as is used in the LAM databases, and put those here. We believe that there is a weighting for all the points in the LAM database depending in how far they are from the current conformation, but these charges are also included in the weighted value.

Then follows the Z-matrix. For a flexible search, care needs to be taken when constructing the Z-matrix around the independent degrees of freedom. If a dependent degree of freedom is defined with the same three atoms as one of the independant degrees of freedom, then as the atom of independant degree of freedom is moved through its range to each LAM point, it could come very close to the atom defined by the same three atoms. Therefore, it is necessary to define atoms near the independent degree of freedom by improper dihedrals.

Consider a methyl group, with the independant degree of freedom defined by H1_C3_C2_C1. If H2 is defined as H2_C3_C2_C1, then when H1_C3_C2_C1 is close to the limits of the LAM, it will be very close to the H2 atom. If H2 is defined as H2_C3_C2_H1, then wherever H1 moves to, H2 will follow it. For sp3 hybridization, the other degrees of freedom will be 120 and -120 degrees.

 cc2         1.494525
cc3         1.537641
ccc3         99.165
oc4         1.479437
occ4        103.395
.
dih7         91.332
dih8       -169.680
dih9         85.322

Then follows the starting points for the variables, WITHOUT a blank line between the Z-matrix definition and the variables. The independant degrees of freedom are the last few variables. If you had any Constrained angles, they would come last of all.

####################################################

ENTRY 00001 dih8_   -0.00   dih9_   40.00   dih10_  -65.00

Molecular energy at expansion:  -1174.251402 H.

Charge model at expansion point:

1  C   -0.232754
2  C    0.010318

Then come the LAM points. When you start, just have a blank line after the degrees of freedom section. Follow the instructions in the section below, and this will put all the LAM points in for you.

LAM generation

In the past, LAM generation has been carried out with a variety of codes. First, min_input was used to define the molecule, and min_setup was used to set up the GAUSSIAN runs. Then make_lam was used to make the lam database, and this was then appended to molecule.in. Then scan could be run to analyse the LAM database, and output suitable for plotting in Excel could be generated. Then Cycle (or Cycle_v2.0) could be run to analyse the LAM database, and see where non-uniform LAMs were required. Then more GAUSSIAN jobs could be run, and the final points iterated until you were happy with the uniformity of the LAMs.

Now everything is done with LAM_GENERATOR.

Step 1a

Run LAM_GENERATOR, with ./LAM_GENERATOR

Cycle v2.0, manual input. Answers can be fed in using:
'./cycle < cycle_input'
but successive runs require different answers.
Pop Keyword? (CHELPG, hly, hlygat)
CHELPG
Do you want to use int=ultrafine SCF=QC in the GAUSSIAN job? Y/N
n
Charge? (usually 0)
0
Multiplicity? (usually 1)
1
Number of processors?
4
Accuracy cutoff? (use a decimal point so fortran knows its a float. Usually 1.0)
1.0
Can we assume the U_INTRA_SRCH ( 20.00) is the high energy cutoff? (Usually 20.0) Y/N
y
Would you like to use your own runGAUSSIAN file?
n
Would you like to use the pqchemeng queue?
n
is this the first uniform run? Y/N
y
read in existing LAMS: 112
do you want to set up the directories initially?
y
SETTING UP UNIFORM JOBS. Number of LAMs: 112

The first couple of lines refer to the old name for the program - Cycle. The first run of LAM_GENERATOR creates a file called Default_cycle_answers which successive runs reads in before asking the user for answers to later questions.

At UCL:

  • we usually use CHELPG,
  • we don't use int=untrafine SCF=QC,
  • Charge is usually zero
  • Multiplicity is usually 1
  • Running across a binary number of processors is more efficient for GAUSSIAN

Accuracy cutoff.

This is only relevant for generation of LAM points after the uniform run. As each LAM includes gradients, the energy at areas can be evaluated from each of the surrounding LAMs. For any point where neighbouring LAM predict the energy to differ by more than this Accuracy cutoff, a new LAM point will be generated. This script suggests 1.0 kJ/mol, and this is what we used to use. Isaac suggested 5.0 was adequate if we were using smoothed potentials. Louise is suggesting 1.0 at the moment while she is doing further tests.

High energy cutoff

If subsequent runs of LAM_GENERATOR reveal areas where more LAMs are necessary, but the energy is higher than this cutoff by all estimates so far, they will not be generated. The suggestion is that it is linked to the value you will use in the search, specified in your input.in file. You can use different values at different stages.

The runGAUSSIAN file and the pqchemeng queue questions are not relevant. Answer "n" to both. At UCL we run the jobs in a different manner.

Yes, this will be your first uniform run, and you do want to set up directories.

Step 1b

Now you will have directories with the GAUSSIAN input files for your uniform LAMs. You will notice in each there is a runGAUSSIAN.csh file - this is setup for Imperial College and will not work.

Copy LAMGEN_gauss_Batch.sh from /home/cposs/CrystalPredictor/unix_executable/ and edit it. The 6th line should match the number of processors you want to run each GAUSSIAN job on, as specified to LAM_GENERATOR in the previous part. The 7th line is the range of the new LAM jobs to be calculated. Run this with qsub LAMGEN_gauss_Batch.sh and it will submit as many jobs as there are free nodes with sufficient cores.

DO NOT DO THE NEXT STEP UNTIL ALL THESE JOBS HAVE FINISHED COMPLETELY.

Step 1c

Run LAM_GENERATOR again.

Cycle v2.0, manual input. Answers can be fed in using:
'./cycle < cycle_input'
but successive runs require different answers.
reading Default_cycle_answers
pop:
CHELPG
int=ultra
n
charge: 0
multiplicity: 1
nprocs 4
acutoff: 1.00000000000000
high energy cutoff: 20.0000000000000
not using own runGAUSSIAN.csh
pqchemeng: F
is this the first uniform run? Y/N
y
read in existing LAMS: 112
do you want to set up the directories initially?
n
DOING SUBMISSIONS
What is the amount of jobs you can handle on your system?
10
new_lam_00001/hessian/hess.log exists, doing nothing
new_lam_00002/hessian/hess.log exists, doing nothing
.
new_lam_00111/hessian/hess.log exists, doing nothing
new_lam_00112/hessian/hess.log exists, doing nothing
out of uniform loop
flexible dihedrals called: 1 dih7
flexible dihedrals called: 2 dih8
flexible dihedrals called: 3 dih9
using G09_d01 formatting!
90
.
using G09_d01 formatting!
90
read file list: 112
first_line:
new_lam_00001
last line
new_lam_00112
There are now two LAM databases: btxxxi_lam_intra and new_lam_intra
Renaming btxxxi_lam_intra to old_btxxxi_lam_intra
Catenating old_btxxxi_lam_intra and new_lam_intra to btxxxi_lam_intra

This step will check to see whether any jobs need to be run (i.e. whether there is a hess.fchk file), and if necessary will submit them with the wrong runGAUSSIAN.csh file. If all jobs have completed, and there is a hess.fchk file for each, it will generate the LAM database for you, and append it to your molecule.in file.

Step 2a

Run LAM_GENERATOR again.

Cycle v2.0, manual input. Answers can be fed in using:
'./cycle < cycle_input'
but successive runs require different answers.
reading Default_cycle_answers
pop:
CHELPG
int=ultra
n
charge: 0
multiplicity: 1
nprocs 4
acutoff: 1.00000000000000
high energy cutoff: 20.0000000000000
not using own runGAUSSIAN.csh
pqchemeng: F
is this the first uniform run? Y/N
n
read in existing LAMS: 112
do you want to generate new adaptive LAM directories and gaussian
input files (answer N if these exist and you want to submit jobs)?
y
IT IS! are you sure? it will delete any completed gaussian jobs you have done
y
Would you like to generate LAMs at the edges of the search space if lower than
the cutoff? Only do this once
n
do you want to skip creating new_lams, and just evaluate the egdes of the search space? y/n
n
filename,nprocs,num,pqchemeng
new_lam_00113.com
4 113 F
.
filename,nprocs,num,pqchemeng
new_lam_00419.com
4 419 F
considered 110 LAMS
low energy found at the edges of the search space, consider extending the LAM grid (ignore this warning if you have a 360 degree rotation around this particular d.o.f
These values: 70.00 20.01 75.00 18.41 1 key one 2

This time, you should say that it is NOT the first uniform run. It will check the grid of LAM points to see which additional lam points are required, and make the input files for them. It will also make a file called new_NON_UNIFORM_LAM_RELEVANCE, which you should rename to NON_UNIFORM_LAM_RELEVANCE. At this point, you also need to make sure the NON_UNIFORM keyword is at the end of the input.in file, or the codes will not pick up this NON_UNIFORM_LAM_RELEVANCE file.

Step 2b

Submit the jobs

Edit the LAM point numbers in LAMGEN_gauss_Batch.sh to cover just the newly generated points, and submit these jobs. Wait until they have finished.

Step 2c

Run LAM_GENERATOR again.

Cycle v2.0, manual input. Answers can be fed in using:
'./cycle < cycle_input'
but successive runs require different answers.
reading Default_cycle_answers
pop:
CHELPG
int=ultra
n
charge: 0
multiplicity: 1
nprocs 4
acutoff: 1.00000000000000
high energy cutoff: 20.0000000000000
not using own runGAUSSIAN.csh
pqchemeng: F
is this the first uniform run? Y/N
n
read in existing LAMS: 112
do you want to generate new adaptive LAM directories and gaussian
input files (answer N if these exist and you want to submit jobs)?
n
IT ISN'T!; you entered n
what is the amount of jobs you can handle on your system?
10
new_lam_00001/hessian/hess.log exists, doing nothing
new_lam_00002/hessian/hess.log exists, doing nothing
.
new_lam_00418/hessian/hess.log exists, doing nothing
new_lam_00419/hessian/hess.log exists, doing nothing
out of non-uniform submissions
flexible dihedrals called: 1 dih7
flexible dihedrals called: 2 dih8
flexible dihedrals called: 3 dih9
using G09_d01 formatting!
90
.
using G09_d01 formatting!
90
read file list: 419
first_line:
new_lam_00001

last line
new_lam_00419

There are now two LAM databases: btxxxi_lam_intra and new_lam_intra
Renaming btxxxi_lam_intra to old_btxxxi_lam_intra
Catenating old_btxxxi_lam_intra and new_lam_intra to btxxxi_lam_intra

This will generate the updated lam_intra, and update your molecule.in file. Unfortunately, it will append the full set of LAMs to the file that included the previous LAMs, and so you will need to edit it to fix this. Consider having a master molecule.in file, and just cat this and the new_lam_intra files onto your molecule.in that is being used.

At this point you also need to copy new_NON_UNIFORM_LAM_RELEVANCE to NON_UNIFORM_LAM_RELEVANCE and make sure NON_UNIFORM appears as a keyword at the end of your input.in file.

mv new_NON_UNIFORM_LAM_RELEVANCE NON_UNIFORM_LAM_RELEVANCE

cat master_mol.in new_lam_intra > btxxxi_lam_intra

Step 3 onwards

Now you can just repeat the last three steps until you are happy that your grid of LAMs is as extensive and smooth as necessary. Part a sets up the new points; part b runs the jobs; part c pulls the new points together and creates the new LAM database. As a subsection of part c, you need to check the molecule.in file is correct, and copy the new_NON_UNIFORM_LAM_RELEVANCE.

It is suggested that once during each LAM setup you analyse the points at the edge of the grid. I don't know whether this should be the first step, the last step, or somewhere in between.

Testing CrystalPredictor

Once you have all the input files you need, copy the runscript for the version of CrystalPredictor you require from /home/cposs/CrystalPredictor/runscripts/ Edit the name of the job and the number of processors. Edit your input.in file to request a small number (1000 is suggested) crystal structures. Submit this job.

The files that are produced are:

  • CrystPred_log.out which has a lot of information on the job you are running, and can be used to help troubleshoot if things fail
  • crystals.out which contains ALL the crystal structures generated
  • crystals_stable.out which contains only the successfully minimized structures that are within the energy range of polymorphism as you specified in your input.in, relative to the lowest energy structure generated so far
  • crystals_IFAIL.out which contains some low energy structures (within the energy range of polymorphism) that were not minimized successfully. This can be appended to crystals_stable.out if there are a lot.
  • starting_crystals.out which contains the structures as generated by CrystalPredictor, before the minimizer is run
  • current_restart.in and restart.in which are used if you continue your search after finding it is not complete. They are useful, as they tell you which spacegroups have been considered how many times.
  • global_statistics.out which is useful to tell you how many structures have been minimized so far, and what the lowest energy structure is

Once CrystalPredictor has finished (use wc crystals.out to see how many structures have been generated so far), copy the corresponding runscript for Analyse, and submit that. This will generated Analyse_log.out (described below, since it is not particularly useful at this test stage) and unique_pool with a folder for each crystal structure. Check unique_pool/1-1/1.res to make sure everything is as it should be (particularly molecular connectivity!).

Running CrystalPredictor

Clean up the files produced in the test, and increase the number of crystal structures requested in input.in. Submit the job and run Analyse afterwards as previously.

Checking the output

After you have run Analyse, you will have a file called Analyse_log.out. This gives a lot of information on the crystal structures generated. Open that file, and make the following checks.

  • Search for GLOBAL. Around this point in the file it will tell you how many stable structures were generated (that is structures below the threshold you specified in input.in). If this is far short of the number of structures you requested, think about whether something has gone wrong.
    • Did you have a large number of IFAIL=6 errors? These are written out to crystals_ifail6.out, which can be appended to crystals_stable.out before rerunning Analyse.
    • Was your threshold for polymorphism too small for the system? Salts have far higher intermolecular energies because of the electrostatic contribution, for example. Read Troubleshooting CrystalPredictor to see how to edit your crystals.out to make a crystals_stable.out file and rerun Analyse.
    • Were there any other problems with crystal structures? Again, Troubleshooting CrystalPredictor is a good place to start. Look at torsion angle boundaries, density (consider adding pressure), and structures which crossed too many LAM boundaries. You can edit crystals_stable.out to include these sorts of problems.
  • Shortly after GLOBAL, you will see how many unique structures were found within different energy cutoffs.
  • For each crystal structure ("Cluster") you will have details of the energy, number of times it was found, intermolecular contacts, closest matching structure and details of two of the structures which were included in this cluster.
    • Check that all low energy structures were found a reasonable number of times. We would suggest that the 20 lowest energy structures were found at least 20 times. However, triclinic structures are frequently not matched properly, so if you have triclinic structures which were only found once but have the same energy and have a closest matching cluster with RMS ~1E-002, then you can consider them to be the same cluster.

Running Minimise

Minimise is useful to find where a structure of interest (such as an experimentally observed structure) would have come in your search, so you can find it in the search output. It is probably the most difficult step of running CrystalPredictor, so keep your wits about you, drink plenty of tea, and be prepared to fail. When you fail, check everything again and again, and then rerun it. Then when you fail for the second time, recheck everything again and again and try again. If you are intent on running it yourself, please do not ask Louise until you have tried about 4 times. She will only do it all from scratch anyway.

You will need an empty directory with the following files

  • input.in
  • molecule.in
  • potential.in
  • expcrys.pdb

    Follow these steps exactly for the best results

    • Open two windows with Mercury running in each, side-by-side if you can
    • In one window, put the structure you are interested in, and in the other window put unique_pool/1-1/1.res since you know the atomic numbering on this file is consistent with your molecule.in file.
    • Turn on atom labels in both windows
    • In the window with the experimental crystal structure, use Edit > Edit structure... to open the Edit structure window.
    • Click on the "Set atom label" button (in the middle of the window).
    • Relabel each atom in the experimental crystal structure to be exactly the same as the structure generated in the search.
    • Save the file as something_renumbered.res (This can be quite useful at many points in your work!)
    • In Notepad or Vi, edit something_renumbered.res so that the atoms are in the same order as your molecule.in (they currently have the correct atom labels, but the atoms will not be in order, and your Z-matrix definition will probably have hetero atoms mixed in with the carbons). Hydrogens are less important (we think), unless they are in key groups such as flexible hydroxyl.
    • Reopen something_renumbered.res in Mercury and save it as expcrys.pbd
  • Minimise executable corresponding to the version of the code you are using. (If you have a runscript, look at the path to CrystPred in that.)

You will also need access to the NAG libraries. If you have a runscript, there will be a line that looks like

export NAG_KUSARI_FILE=/opt/nag/nll6i27dbl/license/license.lic

which you should copy and paste. You may also need to load the nag module with module load nag although you can also add this line to your .bashrc file.

When you have done all that, run ./Minimise > Minimize.log

If everything has gone well, the end of Minimize.log will say Exit e04uf - Optimal solution found.. In this case, your Minimisation.log file will have a lot of useful information. You will find:

  • Lots of information about the CrystalPredictor run, similar to CrystPred_log.out.
  • The torsion angles in the expcrys.pdb structure
  • The RMS OVERLAP ACHIEVED
  • The distances between experimental and pasted atoms, so you can quickly see if you have done anything wrong.
  • The initial Energy and Density
  • Any problems with minimisation
  • The final Energy and Density
  • Overlap between starting and final crystal structures, and the distances moved by each atom
  • The line that would have appeared in crystals_stable.out, if this structure was included.

Some of the problems are that angles start outside the boundaries, or hit boundaries during the minimization. Sometimes, the minimizer gets upset, and you get an IFAIL 6 error. Sometimes it says that torsion angles have reached the bounds when they have not, and the minimizer has stopped working (IFAIL 6).

As well as knowing the final energy and having the line that would be in crystals_stable.out (which will help you find something with the correct energy), you will also have minimise_final_structure.pdb which has the minimized structure in P1. You can open this in Mercury, save to .res, and inspect with Platon to get the full structure including symmetry.

© UCL Chemistry Department 2022. This page was last updated on 17 August, 2022. If you have any problems with this page please email the WebMaster