Quantum Chemistry Interface

Back to Main Menu

Quantum Chemistry Interface

Quantics is interfaced to a number of quantum chemistry (QC) programs to provide the potential energy surfaces (PES) during direct dynamics simulations. The general scheme is centred on the database (DB) of points, energies and gradients used to provide the PES at any geometry via interpolation. The scheme to provide the potential expansion up to second order at a point Q (e.g. the centre of a Gaussian wavepacket) is as follows
  1. Quantics inquires whether there are suitable points in the DB, i.e. at least 1 point is within a specified distance from Q .
  2. If a suitable point found use DB data
    • Interpolation carried out and potential expansion returned
  3. If NO suitable point found call QC to calculate new point
    • Pass geometry and set up input file
    • Run Quantum Chemistry program
    • Read output files and add new data to DB
    • Interpolation using DB carried out and potential expansion returned
All the code relating to the QC interface is in opfuncs/funcqchemmod.F90 . The main routines are:
defqchem Define the code to be used along with type of calculation and any options
vqchem calculate the potential energy at a point using the DB
lhaqchem call a QC program and calculate the LHA (local harmonic approximate), i.e. the energy, gradient and Hessian. Calls the relevant ddXXX routine.
ddXXX there are a set of routines that control the QC calculation. This is done by calls from ddXXX to wrYYY and rdYYY routines that write the input files and read the output files for a particular code and method. For example to run a Gaussian calculation ddgaussian calls wrgaussian followed by rdgaussian (for HF and DFT) or rdgaussiancas (for CAS)

Adding a New Code

The following changes need to be made in funcqchemmod.f90
  1. In defqchem add a new entry to define ifunc for the new program
          if (qcprog(1:ilbl) .eq. 'gaussian') then
             ifunc = 1
          else if (qcprog(1:ilbl) .eq. 'qommma') then
             ifunc = 2
     .....
          else if (qcprog(1:ilbl) .eq. 'newprog') then
             ifunc = X
          endif
    
    where newprog is the keyword to be given in the dirdyn-section of the input file as an arguent to qcprogram=newprog and X is the next available number.
  2. In lhaqchem add a call to the routines for the new interface
          select case ( hopilab1)
    
          case default
    
             routine = 'lhaqchem'
             write(message,'(a,i5)') &
                  'Quantum chemistry program not found :',hopilab1
             call errormsg
    
          case (1)
             call ddgaussian(time,xgpoint,v,deriv1,deriv2,dipole,mocoeff,nact,&
                  hopipar1,lfail,outfile,nthreads)
          case (2)
             call ddqommma(xgpoint,v,deriv1,deriv2,hopipar1,nddstate,lfail)
    ...
          case (X)
             call ddnewprog(xgpoint,v,deriv1,deriv2,hopipar1,nddstate,lfail)
    
          end select
    
  3. copy ddgaussian, wrgaussian and rdgaussian to new routines appropriate to writing the input, running the program and reading the output.

The SQL Database

Motivation

The database holds the data for a direct dynamics run. Each 'row' in the database holds one geometry and one of each of the present quantities, such as gradient, energy, molecular orbital coefficients. Previously the data was stored in a variety of binary Fortran files, such as geo.db, pes.db, and so on. This presented two main problems. If multiple invocations of the code were using the database and one was writing to it then corruption would eventually occur. The other issue was that to find a particular record in the database, each preceding record had to be read and discarded. Quite a bit of the database code (the in-memory databases, etc) exists to try to mitigate the second problem.

The concurrency problem can only be solved with some form of locking. The lookup problem can be solved with indexes. Rather than trying to write safe locking or indexing in Fortran, it is much better to use an existing database solution. The simplest that meets these needs is SQLite. SQLite is a library that stores data in a database file and provides the functionality of an SQL engine to all processes accessing that file. We don't (yet?) need the full functionality of SQL just the implicit locking and indexing. It runs on Mac, Windows, and Linux. It is public domain licensed, and is usually installed by default on all Linux distributions.

Implementation

As SQLite is a C code, a wrapper library is needed. This was taken from "flibs", which provides a Fortran interface to the C code. It also requires the sqlite3.h file to be installed. This is available from the "sqlite webpage", or can be be installed via a package manager, for example apt-get install libsqlite3-dev on debian, yum install libsqlite3x-devel on Redhat or zypper install sqlite3-devel on Suse.

In quantics, where possible, the non-DB code has been left unaltered. The initial goal was to get the SQL DB working, before moving to refactoring the rest of the code to use the DB more efficiently. There's a new library in lib/db_sql which contains the C and Fortran code interfacing to libsqlite, and an assortment of routines which are intended to be the only means of accessing the DB.

The routines in analyse/ that use the DB have been altered to use the SQL format. The assumption is that the "at-rest" state of the database should probably be the SQL format. Its file format is platform independent (32 vs 64 bit, big-endian vs little-endian); also, if more than one instance of quantics is reading the same database, one of them finishing and converting the DB to ascii will interrupt the others.

Reading and Writing DB Quantities

The DB quantities are written and read with dbrdXXX and dbwrXXX subroutines (XXX might be geo, pes, etc). The call looks like: call dbgetXXX( db, id, ...) where db is the (already opened) database structure (use sqlite_types) and id is the row number of the entry to be retrieved. All retrievals by id are constant time (independent of the number of rows in the database).

Writing a set of quantities into the DB is a little more involved. First, a transaction must be started. Transactions are processed by databases atomically—that is they either happen or they don't. Failure during a transaction causes the transaction to be rolled back. If a new row is being created then a call to dbwrentry( db, id, ...) must first be done, with id set to zero. This will return the new row number in id. Then write the necessary data with dbwrXXX routines. Finally, commit the transaction (or abort it, rolling the database back to how it was before the transaction started). The transaction should be open for as short as time as possible, as the database will be locked to other processes until it finishes. Writing a quantity takes (pretty much) constant time.

Analysis Routines

In analyse/ two new conversion programs, convdbtosql and convdbfromsql are provided to convert the SQL DB into and from ASCII format. The existing convdb program remains in case users have old binary format DBs they wish to convert. It should be noted that converting the DB to ASCII at the end of a quantics run could interfere with any other quantics invocations that are using that DB. Given the nature and ease of use of SQL, keeping the DB in the SQL format and only converting to ASCII when needed is a good plan.

SQL schema versions

Currently there are two SQL schema versions. The original (1) has one SQL schema for Quantics as a whole. This has been updated to version 2. The main difference is that in version 2 the schema changes based on the number of atoms and electronic states in the caculation. This has the advantage of having each row in the data tables been a single entry, simplifying access. The change has been made so that the differences between versions are contained in the dbrd and dbwr routines. Therefore old and new databases can all be accessed with the same routines.

Database Tables

The following tables could be found in the SQL database. Note there are some differences in layout between database versions, however, the murky details of this are abstracted within the functions used to read and write to the tables.

The Routines

The following routines are used to setup and examine the database. They might work differently depending on SQL schema version but will return the same results

Adding a new Table

If a new set of data is to be added to the DB, a new table is required. To add this, the following steps must be done. We will add a table newtable which contains the real arrays data(dim1,dim2). The logical lnewdb is set when this data is being included and dbnewdata(dim1,dim2,irec) is the memory array holding the data. dbnewdata and must be set up in include/dirdyn.f90.

Schema Version 2

  1. Edit dbgetentries (lib/db_sql/db_sql.F90)
  2. Add a new entry (change "newtable" and "lnewdb". Give db name in place of "NEW TABLE").
      test_string = 'newtable'
      call checktable(test_string ,table_list,lnewdb,"NEW TABLE",iunit)
    

  3. Create a new create subroutine
  4.  
        subroutine crenewtable(db,sql_vers,cmd)
             type(SQLITE_DATABASE), intent(inout)        :: db
             integer(long), intent(in)                   :: sql_vers  !SQL database version
             character(len=1000), intent(out)            :: cmd
    
             if (sql_vers .gt. 1) then
             !make pes table
             !Contains the PES energies for each DB entry
             !Size, a row for DB entry
             !Column number changes based on number of electronic states
             !Have a column for each state and the coupling between them
             !eng_i_j -   the energy for states i & j (only i & j stored not j & i)
             cmd = "CREATE TABLE newtable (&
                         &id INTEGER NOT NULL &
                         &REFERENCES dbentry(id)&
                         &ON UPDATE CASCADE ON DELETE CASCADE,&
         &dimension1 INTEGER NOT NULL,&       ! change dimension1 for e.g. ndof
         &dimension2 INTEGER NOT NULL,&       ! change dimension2 or delete
         &value DOUBLE NOT NULL)'             ! change DOUBLE if not a real
         
             cmd = trim(cmd) // ")"
    
          elseif (sql_vers .eq. 1) then
          !include this to make sure table is only created in correct version
          endif
          call sqlite3_do( db, cmd )
       

  5. Edit dbinit (lib/db_sql/db_sql.F90).
  6. Add call to the crenetwble subroutine. Use the flagarray to make sure it is only created if needed. Use infoarray to pass any size parameters to table creation

Schema Version 1

1. Edit schema.sql (lib/db_sql)

Add an entry such as (editing for correct names / dimensions / type).

/* Map of symmetry replicas */
CREATE TABLE newtable (id INTEGER NOT NULL REFERENCES dbentry(id)ON UPDATE 
CASCADE ON DELETE CASCADE,dim1 INTEGER NOT NULL,dim2 INTEGER NOT NULL, value DOUBLE NOT NULL);

2. Edit dbgetentries (lib/db_sql/db_sql.F90)

Add a new entry (change "newtable" and "lnewdb". Give db name in place of "NEW TABLE").
      call sqlite3_prepare( db, 'SELECT count(distinct(id)) FROM newtable', stmt,&
                            cols_p)
      call sqlite3_next_row( stmt, cols_p, finished )
      lnewdb = cols_p(1)%int_value > 0                        
      if (db%error.ne.SQLITE_OK) then
         write (iunit,'(a20,a15)') ' NEW TABLE      ','Not Present    '
      else if (lnewdb) then
         write (iunit,'(a20,a15)') ' NEW TABLE      ','Included       '
      else if (.not.lnewdb) then
         write (iunit,'(a20,a15)') ' NEW TABLE      ','Not Included   '
      endif

3. Edit dbinit (lib/db_sql/db_sql.F90)

Add a new entry (change "newtable" for table name)

      cmd = 'CREATE TABLE newtable (id INTEGER NOT NULL REFERENCES dbentry(id)&
     &ON UPDATE CASCADE ON DELETE CASCADE,&
     &dimension1 INTEGER NOT NULL,&       ! change dimension1 for e.g. ndof
     &dimension2 INTEGER NOT NULL,&       ! change dimension2 or delete
     &value DOUBLE NOT NULL)'             ! change DOUBLE if not a real
      call sqlite3_do( db, cmd )
      cmd = 'CREATE INDEX newidx ON newtable(id)'   ! change "newidx" and "newtable"
      call sqlite3_do( db, cmd )

4. Edit lib/db_sql/db_sql.F90 ***** New read / write routines ****

Add dbrdnewtable to public statement in header

New subroutine to read data dbrdnewtable

change "newtable" and name of "data"
!----------------
      subroutine dbrdnewtable( db, id, data )
      implicit none
      type(SQLITE_DATABASE)                    :: db
      integer(long), intent(in)                :: id
      real(dop), dimension(:)                  :: data

      type(SQLITE_COLUMN), dimension(:), pointer  :: cols_p
      type(SQLITE_STATEMENT)                   :: stmt
      character(len=c1)                        :: whereclause
      logical(kind=4)                          :: finished
      integer(long)                            :: n

      call sqlite3_query_table( db, 'newtable', cols_p )
      write(whereclause, '(a,i0)') 'WHERE id=', id
      call sqlite3_prepare_select(db, 'newtable', cols_p, stmt, &
                                  trim(whereclause))
      do
         call sqlite3_next_row(stmt, cols_p, finished)
         if (finished) exit
         call sqlite3_get_column( cols_p(2), n )
         call sqlite3_get_column( cols_p(3), data(n)) )
      enddo
      deallocate (cols_p)
      call sqlite3_finalize( stmt )

      end subroutine dbrdnewtable

New subroutine to write data dbwrnewtable

change "newtable" and names / dimensions for "dim" and "data"
      subroutine dbwrnewtable(db, irecno, dim, data)
!     For this (and all dbwr* routines) you should be inside a transaction.
!     Use dbstarttrans, dbwr*, ..., dbcommittrans

      type(SQLITE_DATABASE)                    :: db
      integer(long), intent(in)                :: dim, irecno
      real(dop), dimension(:), intent(in)      :: data
      type(SQLITE_COLUMN), dimension(:), pointer  :: cols_p
      integer(long)                            :: n
      character(len=250)                       :: cmd

      write(cmd, '(a, i0)') 'DELETE FROM newtable WHERE id=', irecno
      call sqlite3_do(db, cmd)
      call sqlite3_query_table( db, 'newtable', cols_p )
      call sqlite3_set_column( cols_p(1), irecno )
      do n = 1, dim
         call sqlite3_set_column( cols_p(2), n )
         call sqlite3_set_column( cols_p(3), data(n) )
         call sqlite3_insert( db, 'newtable', cols_p )
      enddo

      deallocate (cols_p)

      end subroutine dbwrnewtable

5. In preparedb (opfuncs/dd_db.f90)

Add calls to dbrdnewtable for all modus (if appropriate) under

!-----------------------------------------------------------------------
! load DB into memory if ldbsave
!-----------------------------------------------------------------------

         do irec=1,dbnrec
            call dbrdentry( the_db, irec, dbinterp(irec) )
            call dbrdgeo( the_db, irec, dbgeo(:,irec) )
      .......
            if (lnewdb) call dbnewtable( the_db, irec, dbnew(:,irec) )
Also update call to getdbentries

6. In dddb_wr1 (opfuncs/dd_db.f90)

Add call to dbwrnewtable (change: lnewdb, newtable, dim, data)

      if (lnewdb) call dbwrnewtable( the_db, rowid, dim, data )
Also add an entry in dddb_wrmem to store data in the dbnewdata array.

7. Ensure Backwards Compatibility

So that the SQL DB can be transformed into a flat file DB for examination,
  1. add a new channel to channels.inc, and filenames for ascii / binary DB files to include/dirdyn.f90.
  2. add open / close to opendbold, closedb, chkdb (opfuncs/dd_db.f90)
  3. update convdbfromsql, convtosql, mergedb