Control and Prediction of the Organic Solid State: 2 Proposed Research
As organic molecules are being developed for novel materials and pharmaceuticals, the characterisation of their solid state properties is revealing a paradigm shift in our understanding of their crystallisation. Instead of a molecule having a unique crystal structure, it often appears as if the number of known solid forms (polymorphs) is proportional to the time and money spent investigating the compound. Polymorphism appears to be far more prevalent than expected from the decades when the structure of the first sample that could be prepared suitable for X-ray diffraction was viewed as the crystal structure, and the observation of polymorphism was an occasional curiosity. The unexpected appearance of novel crystalline forms can be a scientific, industrial or commercial disaster, which has been likened, as in the case of ritonavir (Abbott Laboratories' HIV drug Norvir), to a hurricane in its unpredictability. In contrast, the controlled use of a metastable form can offer considerable benefits when it has advantageous physical properties such as improved solubility. Exploitation of specific polymorphs requires a predictive model of how the kinetic aspects of solvents and crystallisation conditions determine which of the thermodynamically feasible crystalline forms are actually observed, built and validated by additional experimentation. The vision of this project is to establish the technology to allow routine prediction and control of the organic solid state.
Aims and approach
This proposal addresses the challenge of controlling polymorphism by building the basic technology for surveying the potential polymorphism of organic molecules both computationally and experimentally. It is ultimately aimed at producing computational methods for predicting the polymorphs of a molecule before it has been synthesised, which can guide the control of which crystal structure is produced, for example, by solvent or salt selection. The methods will be derived from a combined theoretical and experimental approach for an initial set of model compounds, for which we find all possible polymorphs and solid state forms (amorphous/solvates), and their key properties, and use this information to develop and validate computational models for their prediction.
The approach recognises a spectrum of polymorphic behaviour. At one extreme is the traditional scenario in which one crystal structure is sufficiently thermodynamically favoured and kinetically accessible that it is the only one that will be observed. Such behaviour should be readily predicted by the extension of current theoretical methods of searching for the lowest energy crystal structure. In the middle of the spectrum, the search will identify a limited number of thermodynamically feasible crystal structures, and so the kinetics of nucleation and crystal growth in a variety of solvents along with solution-mediated and solid state transformations determine which metastable structures are observed. In this situation, which applies to the majority of molecules subject to lattice energy searches to date, polymorphism should be predictable and ultimately controllable with the application of appropriate computational models for the kinetic constraints developed in this project. At the far extreme, molecular structure and intermolecular forces combine to provide energetically equivalent packing motifs, such as hydrogen-bonded layers capable of stacking in different yet equi-energetic ways, allowing an almost infinite variety of polymorphs to be formed. We will establish that this form of polymorphism is exceptional, and predictable, providing a rational basis for the avoidance of such forms during product development.
Developing such an over-arching predictive computational model requires a detailed atomistic understanding of polymorphism in a wide range of model molecules. This has been previously unavailable, as it requires a co-ordinated study combining the development of novel practical approaches and the utilisation of state-of-the-art experimental techniques. The model compounds used in the development of the computational techniques will be selected to be representative of typical pharmaceuticals, non-linear optical materials and pigments, developing to the entire range of speciality organic products and so ensuring wide application of the resulting technology. Thus, the impact of the proposed technology will be to control the unwanted appearance of polymorphism and to exploit its benefits in the development, manufacture and processing of new molecular materials. The closely inter-linked objectives are:
The core objective of the proposal is the development of an organic polymorph database as an expanding searchable repository of theoretical and experimental data relating to the thermodynamic and kinetic aspects of nucleation, crystallisation and transformations of hundreds of organic molecules. We expect to develop this database, so that by the end of the project, the computational procedure applied to a new molecule will predict which polymorphs are likely to be observed, and their properties.
Polymorph screening and characterisation
Crystallisation laboratories with expertise in different types of molecules will develop innovative crystallisation approaches, to produce a completeness of polymorph screening which far exceeds the pragmatic range of conditions typically used in industrial screening. Pharmaceutical Sciences, Strathclyde (Florence and Shankland) will realise a novel integrated polymorph screening and physical analysis system, which will overcome the data-quality limitations of currently fashionable "fingerprinting" approaches to solid-state combinatorial screening. This will be more inclusive, systematic and reproducible than the prohibitively time-consuming "manual" means of exploring critical crystallisation parameters such as solvent choice, supersaturation, agitation, temperature and rate of cooling. The screening system will include:
(a) High-throughput automated screening based on a Gilson robotic platform to include a bespoke combination of elements capable of producing over 140 polycrystalline samples per week via controlled cooling, solvent quenching or evaporation.
(b) High-throughput, high-resolution X-ray powder diffraction, specified to measure 20 samples a day to allow rapid identification of all physical forms and mixtures emerging from the crystallisation robot, and to provide data of sufficient resolution that structure determination should be possible when the data resolution is not limited by sample effects.
(c) Simultaneous thermal analysis (differential scanning calorimetry and thermal gravimetric analysis), multi-temperature diffuse reflectance FT-IR spectroscopy, multi-temperature XRPD and hot-stage polarising microscopy to determine the structural and thermodynamic relationships between the different physical forms. This step will provide preliminary data for the neutron-diffraction studies of thermal transformations (David).
(d) An optimisation and scale-up platform for polymorph growth to include continuous solubility determination and a state-of-the-art particle sizing (FBRM) probe to monitor the crystallisation in situ and track the appearance and disappearance of metastable forms with respect to time, temperature, solvent and supersaturation. This will be invaluable in informing the NMR experiments (Harris).
The experimental parameters and results from stages (a) to (d) will be deposited directly in the database, as a means of optimal dissemination within the project.
In parallel with these developments, UCL (Tocher and Price) will co-ordinate the study of each molecule, producing the theoretical description of the computed polymorphs and the single crystal samples for structure determination, including the relatively large crystals required for single crystal neutron diffraction studies (Wilson). In producing the isotopically substituted samples required for some neutron powder diffraction (David) and solid-state NMR work (Harris), observations will be made on whether there is a detectable effect of the different mass on the crystallisation and morphology. Single crystal diffraction using our state-of-the-art SMART APEX CCD diffractometer and low temperature facility (100-300K) will be used in structure determination, and low resolution electron microscopy for morphologies. A laminar flow cabinet that provides a clean facility and high purity materials will be used as necessary. By the rapid passing of information but not nucleation seeds between Strathclyde and UCL, seeding and impurity effects can be detected and investigated, thereby illuminating the problems of irreproducibility and "disappearing" polymorphs which have plagued this field.
Solid state transformations and dynamics
Definitive and complete characterisation of many crystal structures, including precise hydrogen atom location from hydrogenous samples, through the range 4-300K and pressures up to 5 kbar, will be provided by single crystal neutron diffraction (Wilson). Recent dramatic enhancements to the facilities in this area at CCLRC-ISIS (SXD-11) and at the ILL, Grenoble (VIVALDI), together with recent advances in pressure cells, will allow the determination of proton positions and dynamics very rapidly for crystals of 1 mm³, and permit study of smaller crystals. We will scan an even wider range of temperatures and pressures to seek and characterise transformations between polymorphs using high-resolution neutron powder diffraction (David), utilising the high-pressure beam-line and the world-leading suite of neutron powder diffractometers at CCLRC-ISIS. David has already used this facility to watch transformations in MoO2 and C60 occur, allowing kinetic studies, and he will pioneer such studies of organic transformations. The neutron powder studies require samples with only a small proportion of 1H atoms, and we will carefully monitor the effect of deuteration on the polymorphs and transition conditions found, which will provide unique insight into kinetic isotope effects. This precise atomistic definition of the structures and transformations will be essential to develop the computer modelling of crystal dynamics (Leslie & Price) so that the likely transformations or metastability of crystal structures can be predicted. The location of hydrogen atoms, and the determination of unusual hydrogen motions, from neutron diffraction is essential, as the predicted hydrogen bonding and hydrophobic interactions are very sensitive to these positions, yet they are only approximately known from conventional X-ray studies.
When the diffraction studies suggest that there is disorder in the hydrogen bonding, solid state 2H NMR (Harris) will be used to establish the rates and mechanisms of the motion in dynamic hydrogen bonding arrangements (c.f. ). For such studies, materials selectively deuterated in the hydrogen bonding sites are required. Thus, the time-averaged structural properties seen in diffraction studies can be interpreted as time-dependent (dynamic) properties by 2H NMR.
The teams' internationally leading expertise in structural characterisation from single crystal and powder diffraction will allow unprecedented accuracy in the determination of crystal structures. However, the discovery of polymorphs that are beyond our ability to characterise, through complexity, metastability or inability to produce suitable crystals, is a possibility, and the discovery of solid forms which are beyond industrial characterisation a near certainty. Hence, a student under the supervision of Tremayne will develop tools to use the theoretically predicted structures to allow structure determination by refinement of unindexable powder data, or the interpretation of more complex structures.
Tera-scale computing will be used by Catlow to simulate the formation of nuclei of the organic solid from the solution by long-timescale Molecular Dynamics simulations. Tera-scale computing will allow simulations of 10 ns, or of larger simulation boxes, allowing the detailed mechanism of nucleation to be identified, as has recently been achieved in the RI group in studying the nucleation of ZnS. Advanced visualisation techniques, available in the new virtual reality centre in the Chemistry Department of UCL, will also be of value in interpreting the results. Closely coupled to these expensive full simulations will be shorter MD simulations probing the stability of hypothetical nuclei, constructed from the known and hypothetical structures. These, in turn, will be used to develop methods of predicting gross relative nucleation rates as a function of supersaturation for different (hypothetical) polymorphs. The use of computer simulations to generate the interfacial energy between the differing crystal faces and the solvent, combined with classical nucleation theory, will provide data pertinent to nucleation rates that can be readily calculated and included in the database.
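As an illustration of how a simulated interfacial energy could feed a classical nucleation theory estimate, the following is a minimal sketch rather than the project's method. It assumes a spherical nucleus with a single effective interfacial energy, and equal pre-exponential factors when comparing two forms; the function names and all numerical inputs are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def nucleation_barrier(gamma, molecular_volume, temperature, supersaturation):
    """Classical nucleation theory barrier (J) for a spherical nucleus.

    gamma: effective crystal/solution interfacial energy, J/m^2
    molecular_volume: volume per molecule in the crystal, m^3
    supersaturation: ratio S = c / c_sat, must exceed 1
    """
    if supersaturation <= 1.0:
        raise ValueError("nucleation requires supersaturation > 1")
    # Thermodynamic driving force per molecule, kT ln S.
    driving_force = K_B * temperature * math.log(supersaturation)
    return (16.0 * math.pi * gamma**3 * molecular_volume**2
            / (3.0 * driving_force**2))


def relative_rate(barrier_a, barrier_b, temperature):
    """Ratio J_a / J_b, ASSUMING equal pre-exponential factors."""
    return math.exp(-(barrier_a - barrier_b) / (K_B * temperature))


# Hypothetical inputs: the metastable form is more soluble, so at the same
# concentration it sees a lower supersaturation, but it is given a lower
# interfacial energy here.
T = 298.0
barrier_stable = nucleation_barrier(0.020, 2.0e-28, T, 3.0)
barrier_metastable = nucleation_barrier(0.015, 2.0e-28, T, 2.0)
print(relative_rate(barrier_metastable, barrier_stable, T))
```

Because the barrier scales with the cube of the interfacial energy and the inverse square of ln S, small changes in either quantity shift the relative rate by orders of magnitude, which is why face-specific interfacial energies from simulation would be a sensitive input.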
We will pioneer an in situ experimental approach, employing solid state NMR, for studying crystal nucleation processes. The difference between the relaxation times of a solute molecule in solution and in a solid particle will provide an opportunity to observe selectively the first solid particles produced during the crystallisation process, even though only a small amount of the solute species may actually have crystallised out of solution.
An important aspect of the NMR analysis will be to establish correlations between the solid state NMR spectra and features of the crystal structures of the polymorphs, allowing us subsequently to derive information on the structural features of the solid particles produced during nucleation. Thus, in addition to in situ studies of the crystal nucleation processes, comprehensive solid state NMR characterisation of all known polymorphs of the molecules of interest will be carried out (on samples provided by UCL) to establish the required correlations between spectral and structural features. We will initially study systems where there are very different hydrogen bonding patterns in the polymorphs formed from different solvents, to maximise the expected variations in the NMR spectra. The full structural understanding of the nucleation entities will require combined interpretation of the NMR results with computer simulations of nucleation in the same solute/solvent systems.
Computational Prediction of Possible Polymorphs
Computational methods of finding the energetically feasible crystal structures of a rigid organic molecule will be developed to apply to a wider range of molecules, including salts and solvates, and conformational polymorphism, and to consider a wider range of types of crystal structure. Computer science developments, and eventually Grid technology, will be used to overcome current computational limitations on the range of the search and the analysis of the results, to enable these programming (Leslie) and scientific developments (Price). The new accurate atomic locations in the crystal structures and the data on thermal expansion will be exploited to increase the accuracy of the modelling, by combining Molecular Dynamics simulations of the crystalline phases and lattice energy minimisation results.
Searches for the global minimum in the lattice energy often show that there are many more energetically feasible structures (i.e. within the energy range of possible polymorphism of ~ 10 kJ/mol) than known polymorphs. We already have examples in pyridine, paracetamol and chlorothalonil, where such predictions have led to the discovery of new polymorphs, demonstrating the necessity for an extensive polymorph screen to establish which metastable structures can exist under what conditions. However, when there are apparently thermodynamically plausible crystal structures that cannot be observed, we need to develop the computational model to reflect the kinetic and subtler thermodynamic factors that determine which structures will be observed polymorphs. We have already observed that the predicted mechanical properties and morphologies of some of these hypothetical structures have eliminated some as possible polymorphs. Indeed, crude estimates of the relative growth rates of the crystallites in vapour seem to favour the observed structures. Nevertheless, it seems probable that the kinetics of nucleation and growth in different solvents, and the dynamics of solid-state transformations, will play a major role in polymorph control, and hence the project will develop computational models for these effects (Catlow, Price), requiring the experimental results for validation. Thus, this project will considerably expand the range and accuracy of calculated properties of the hypothetical crystal structures, covering kinetic and solvent effects, for each molecule in the database. This should ensure that the data-mining analysis can find the most effective correlation between the observed polymorphs and the predicted thermodynamic and kinetic properties.
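The thermodynamic first step described above, retaining every computed structure within about 10 kJ/mol of the global minimum, can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, the energies are invented, and the 10 kJ/mol default simply mirrors the approximate range quoted in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateStructure:
    label: str                    # hypothetical identifier from the search
    lattice_energy_kj_mol: float  # more negative means more stable


def feasible_structures(candidates: List[CandidateStructure],
                        window_kj_mol: float = 10.0) -> List[CandidateStructure]:
    """Return candidates within `window_kj_mol` of the global minimum,
    ordered from most to least stable."""
    if not candidates:
        return []
    e_min = min(c.lattice_energy_kj_mol for c in candidates)
    kept = [c for c in candidates
            if c.lattice_energy_kj_mol - e_min <= window_kj_mol]
    return sorted(kept, key=lambda c: c.lattice_energy_kj_mol)


# Invented energies, for illustration only.
search_output = [
    CandidateStructure("calc-3", -89.9),
    CandidateStructure("calc-1", -100.0),
    CandidateStructure("calc-4", -80.0),
    CandidateStructure("calc-2", -95.0),
]
for structure in feasible_structures(search_output):
    print(structure.label, structure.lattice_energy_kj_mol)
```

Such a filter captures only the thermodynamic criterion; the kinetic and solvent-dependent factors discussed in the text would act as further filters on the list it returns.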
Programme of research
The project will develop a database that will contain well over one hundred molecules, including some examples of conformational polymorphs, molecular salts and complexes (including stoichiometric hydrates), covering the range of functional groups found in industrially important organic molecular materials. For each molecule, the database will contain the energetically feasible crystal structures found in an extensive computational search, along with the associated predictions of thermodynamic and mechanical properties, spectra, morphologies, relative nucleation and growth rates along with solvent dependent parameters. For the majority of these molecules, the linked experimental data will be obtained from collaborators and the literature, and will comprise the known polymorphs, and their known physicochemical properties and growth conditions. The most suitable experimental data in the literature will be the heavily studied systems, such as potential non-linear optical materials (e.g. nitroanilines) and organic conductors (tetrathiafulvalene complexes), where a wide range of crystallisation conditions will have been investigated in the search for optimal properties.
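One possible shape for a database entry linking computed structures to experimental observations is sketched below. It is an assumption for illustration, not the proposed schema: every class, field and method name is hypothetical, and only a few of the properties listed above are represented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PredictedStructure:
    label: str
    space_group: str
    lattice_energy_kj_mol: float
    observed_as: Optional[str] = None  # name of the matching known polymorph, if any


@dataclass
class CrystallisationRecord:
    solvent: str
    temperature_k: float
    method: str    # e.g. "cooling", "quenching", "evaporation"
    outcome: str   # polymorph name, "mixture", or "none" for a null result


@dataclass
class MoleculeEntry:
    name: str
    predicted: List[PredictedStructure] = field(default_factory=list)
    experiments: List[CrystallisationRecord] = field(default_factory=list)

    def observed_forms(self) -> List[str]:
        """Known polymorphs that have been matched to a computed structure."""
        return sorted({s.observed_as for s in self.predicted if s.observed_as})

    def unobserved_predictions(self) -> List[PredictedStructure]:
        """Computed structures with no experimental counterpart so far."""
        return [s for s in self.predicted if s.observed_as is None]

    def solvents_yielding(self, form: str) -> List[str]:
        """Solvents from which a given form has been obtained."""
        return sorted({r.solvent for r in self.experiments if r.outcome == form})


# Invented example entry.
entry = MoleculeEntry(name="example molecule")
entry.predicted.append(PredictedStructure("calc-1", "P2_1/c", -100.0, observed_as="form I"))
entry.predicted.append(PredictedStructure("calc-2", "P-1", -97.5))
entry.experiments.append(CrystallisationRecord("ethanol", 293.0, "evaporation", "form I"))
entry.experiments.append(CrystallisationRecord("toluene", 278.0, "cooling", "none"))
print(entry.observed_forms(), [s.label for s in entry.unobserved_predictions()])
```

Keeping predicted structures and crystallisation records as separate lists under one molecule lets null results and unobserved predictions be stored explicitly, which is the information a later data-mining step would need.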
The complete polymorphic screening and characterisation studies will be carried out on ca. 50 molecules, including nucleic acid bases, amino acids, quinacridone derivatives, cinnamic acids and a wide range of small organic model compounds containing important functional groups. The database will contain all polymorphs found and, most importantly, the range of solvents and thermodynamic conditions under which each was formed (noting conditions that lead to mixtures) and the resultant morphologies. This data will be essential for the development of methods of predicting the relative nucleation and growth rates (hence morphologies) of the hypothetical crystal structures as a function of solvent. The monitoring of in situ particle sizes, and the evolution of metastable phases during growth, will provide the preliminary evidence to determine the systems most amenable to NMR monitoring of nucleation, and therefore simple fluorinated compounds will be the first targets. The lab thermal analyses and optical microscopy will provide details on thermal transformations as a prelude to the determination of structures and temperature/pressure-dependent transitions at the central facilities. The FT-IR spectra and PXRD patterns for each polymorph will be included in the database.
The definitive determination of structures as a function of pressure and temperature by neutron diffraction will cover up to ten molecules, though quick neutron powder scans on hydrogenous samples will be done to fully characterise all temperature dependent transformations seen in the lab thermal analyses. The detailed study of these solid state transformations will be coupled with the Molecular Dynamics simulations, and estimated phonon frequencies, to derive a method for estimating which hypothetical structures would thermally transform to a more stable structure and are unlikely to be observed as long-lived metastable polymorphs. For most of the project, molecules will be studied both computationally and by crystallisation screening, prior to choosing representatives suitable for the NMR and neutron diffraction experiments. The latter will cover a range of organic compounds and observed types of polymorphic behaviour, with indigo, caffeine and o-acetamidobenzamide included as early candidates.
The database will be developed and updated on an ongoing basis, with a hierarchy of versions, co-ordinated by Price. Each entry will comprise the theoretical data generated by the predictive models (produced by the London research workers), alongside either external experimental data or the extensive and systematic range of experimental and structural data generated for all of the project's experimental targets. There will be updates, as the theoretical data is improved in the light of more accurate structural data, and as the experimental observations of new polymorphs accumulate. For example, the solution of some structures from difficult diffraction data may only be possible once the methodology for using the predicted structures has been developed (Tremayne). A key milestone, at the end of year three, will be the validation of the computational models for determining which crystal structures are unlikely to be observed, as the outcome of the experimental and theoretical work on phase transformations and solvent effects on kinetics. These models will then be used to calculate the required parameters for all the known and hypothetical structures in the database. In year four, data-mining will determine how effectively these parameters can be used to predict which thermodynamically accessible polymorphs are kinetically favoured and therefore likely to be observed polymorphs.
Justification of Resources and Management
It is essential for the development of any reliable technology for polymorph prediction to know all the possible polymorphs of a sufficient range of molecules, and for polymorph and morphology control, to know the effects of a range of crystallisation conditions. Existing experimental studies are either compound- or technique-based and the required information is just not available: reports usually include only the optimal growth conditions for new polymorphs, contain no null results, and some property measurements are even unclear as to which polymorph has been used. Hence the need for such a large-scale coordinated project, to provide the new approach to solving the problem of polymorphism. The proposed polymorph screening facility will be unique in academia, and its full-time operation is essential to carry out sufficient crystallisation experiments and the required analyses on a sufficient number of diverse molecules to provide an experimental reference within the time-frame of the project. Its location at Pharmaceutical Sciences, Strathclyde ensures the viability of the proposed approach based on the group's expertise in polymorphism and crystallisation which has been developed over ten years, and will enable the definition of new standards relative to the polymorph screens currently being developed in the pharmaceutical industry. Since the polymorph screening research spans the interests of the Research Councils, we are clarifying the VAT position with HM Customs (who have not been able to give a definitive ruling to date), and have included VAT as a precautionary measure. The definitive studies of the structures and transformations will require approximately 200 days of neutron beam time, and our requirements are known at the ISIS facility and will be applied for through the usual access mechanisms. We note that Wilson and David have an exceptionally high (>95%) success rate in beam-time applications and are substantial users of ISIS and ILL.
The computer simulations will exploit the power of the HPC(X) facility for large, long-timescale simulations. A substantial allocation of computer resources will be necessary which we estimate as 3.5M PE based on our recent experience.
The main requirement is for manpower, with postdoctoral assistants allocated to the various sub-projects as indicated. An experienced PDRA in facilities diffraction, Dr Broder, is available for the project, and hence the appropriate salary is requested. The sub-projects that provide excellent postgraduate training, and require less specific expertise, will be done by postgraduate students. The tracking of the progress on the model systems, exchange of samples, database entry and correspondence with collaborators would be performed by a Research Officer, who would therefore be working as an administrative assistant to the project manager Price. Price's Head of Department has agreed that if this project is funded, her teaching and administrative loads would be sufficiently reduced that the project would be her main activity.
Relevance to Beneficiaries
This research would benefit the development and manufacture of all commercial molecular materials because of the opportunities and risks posed by the potential polymorphism of a product. Hence the collaboration from the pharmaceutical (AstraZeneca, GSK) and speciality chemicals industries (Avecia), and letter of support from QinetiQ (MOD) for the relevance to energetic materials. The provision of data, samples and expertise by our collaborators as an indirect contribution cannot be meaningfully valued. However, the value of the experimental polymorph screening data that we will generate and that our collaborators will provide, should be seen in the light of the estimated $100,000-$200,000 cost of a polymorph screen for a single compound being quoted by the emerging polymorph screening consultancies.
The database technology will be a significant novel research tool, independent of the increased reliability of polymorph prediction. The public availability of polymorph screening data, and computed possible structures and properties, would be used for a variety of academic research, including the characterisation of new crystalline forms from independently inadequate experimental data. This is based on the considerable external use that has already been made of Price's predicted crystal structures. Since each facet of the project advances the current "state of the art" in its application to molecular materials, the research will also advance the fields of diffraction physics, nuclear magnetic resonance and atomistic computer simulation, as well as the "art" of crystallisation.
One major outcome would be the expanding database as an academic research facility, somewhat along the lines of the establishment of the Cambridge Structural Database. It will be closely linked to the computational suite of programs, and the IPR of this would be protected to allow academic use and commercial industrial use for the secure generation of the theoretical data on developmental target molecules. It is anticipated that the novel methods or processes developed during the research will be protected to allow commercialisation, with the IPR residing with those who generated it. UCL has a dedicated unit (UCL Ventures) which handles the protection of developed IPR, and would liaise with the IPR offices at Strathclyde and Birmingham. The new experimental methods will be reported through academic journals and conferences, and there is the potential that individual techniques and facilities will become established for industrial use in polymorph screening. The dissemination would be through publications, meetings, the European Polymorphism Network, the CCLRC Centre for Molecular Structure and Dynamics, and through industrial collaboration. The completed technology would be a computational facility for predicting the likely polymorphic forms and properties of molecular materials, and novel techniques for high-throughput screening and characterisation of polymorphs, which will be of wide-ranging utility in academia and in product development and control in industry. It would provide a basic technology for the organic solid state, analogous to the way in which ab initio electronic structure calculations are now the basic technology for predicting the structure and properties of isolated molecules, prior to expensive experimental studies on molecules of technological promise.