Capturing Weak Interactions in Surface Adsorbate Systems at Coupled Cluster Accuracy: A Graph-Theoretic Molecular Fragmentation Approach Improved through Machine Learning

J Chem Theory Comput. 2023 Dec 12;19(23):8541-8556. doi: 10.1021/acs.jctc.3c00955. Epub 2023 Nov 29.

Abstract

The accurate and efficient study of the interactions of organic matter with the surface of water is critical to a wide range of applications. For example, environmental studies have found that acidic polyfluorinated alkyl substances, especially perfluorooctanoic acid (PFOA), have spread throughout the environment and bioaccumulate in human populations residing near contaminated watersheds, leading to many systemic maladies. Thus, the study of the interactions of PFOA with water surfaces has become important for mitigating the activity of these substances as pollutants and threats to public health. However, theoretical study of the interactions of such organic adsorbates on the surface of water, and of their bulk concerted properties, often necessitates the use of ab initio methods to properly incorporate the long-range electronic properties that govern these extended systems. Notable theoretical treatments of "on-water" reactions thus far have employed hybrid DFT and semilocal DFT, but the interactions involved are weak interactions that may be best described using post-Hartree-Fock theory. Here, we aim to demonstrate the utility of a graph-theoretic approach to molecular fragmentation that accurately captures the critical "weak" interactions while maintaining an efficient ab initio treatment of the long-range periodic interactions that underpin the physics of extended systems. We apply this graph-theoretic treatment to PFOA on the surface of water as a model system for the weak interactions seen in a wide range of surface interactions and reactions. The approach divides a system into a set of vertices that are then connected through edges, faces, and higher-order graph-theoretic objects known as simplexes, to represent a collection of locally interacting subsystems. These subsystems are then used to construct ab initio molecular dynamics simulations and to compute multidimensional potential energy surfaces.
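The decomposition into vertices, edges, and faces can be illustrated with a minimal sketch. The abstract does not give the energy expression, so the code below assumes a generic two-level (ONIOM-style) scheme: a low-level energy for the full system plus inclusion-exclusion increments of the high-minus-low difference over each simplex. All function names, the distance cutoff, and the toy energy models are hypothetical and stand in for real electronic structure calculations.

```python
from itertools import combinations
import math

def build_simplexes(coords, cutoff, max_rank=2):
    """Enumerate simplexes (tuples of fragment indices) up to max_rank.

    rank 0 = vertex, rank 1 = edge, rank 2 = face. A set of vertices forms
    a simplex when every pair in it lies within `cutoff` (an assumption).
    """
    def close(i, j):
        return math.dist(coords[i], coords[j]) <= cutoff

    simplexes = []
    for rank in range(max_rank + 1):
        for combo in combinations(range(len(coords)), rank + 1):
            if all(close(i, j) for i, j in combinations(combo, 2)):
                simplexes.append(combo)
    return simplexes

def simplex_increment(simplex, energy):
    """Inclusion-exclusion increment of `energy` over all sub-simplexes."""
    total = 0.0
    for k in range(1, len(simplex) + 1):
        sign = (-1) ** (len(simplex) - k)
        for sub in combinations(simplex, k):
            total += sign * energy(sub)
    return total

def fragment_energy(coords, cutoff, energy_high, energy_low, max_rank=2):
    """Low-level full-system energy plus high-minus-low simplex corrections."""
    def delta(sub):
        return energy_high(sub) - energy_low(sub)

    full = tuple(range(len(coords)))
    correction = sum(simplex_increment(s, delta)
                     for s in build_simplexes(coords, cutoff, max_rank))
    return energy_low(full) + correction

# Toy system (hypothetical): four fragments on a line, 1 length unit apart.
COORDS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
CUTOFF = 1.5

def energy_low(sub):
    # Stand-in for a cheap method: one-body terms only.
    return -0.9 * len(sub)

def energy_high(sub):
    # Stand-in for an expensive method: one-body terms plus a short-range
    # pair attraction between fragments closer than CUTOFF.
    pairs = sum(1 for i, j in combinations(sub, 2)
                if math.dist(COORDS[i], COORDS[j]) <= CUTOFF)
    return -1.0 * len(sub) - 0.1 * pairs

if __name__ == "__main__":
    print(fragment_energy(COORDS, CUTOFF, energy_high, energy_low))
    print(energy_high(tuple(range(len(COORDS)))))
```

In this toy model the fragment-based result coincides with the high-level energy of the full system, because the model interactions are strictly pairwise and vanish beyond the cutoff. Real systems carry longer-range and many-body contributions, which is why the low-level treatment of the full system is retained.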
To further improve the computational efficiency of our graph-theoretic fragmentation method, we use a recently developed transfer learning protocol to construct the full-system potential energy from a family of neural networks, each designed to accurately model the behavior of an individual simplex. We use a unique multidimensional clustering algorithm, based on the k-means clustering methodology, to define the training space for each separate simplex. These models are used to extrapolate the energies for molecular dynamics trajectories at PFOA-water interfaces at less than one-tenth the cost of a regular molecular fragmentation-based dynamics calculation, in excellent agreement with coupled-cluster-level full-system potential energies.
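The idea of using clustering to define a training space can be sketched as follows. The abstract does not describe the authors' multidimensional algorithm, so this example substitutes plain Lloyd's k-means and simply picks, for each cluster, the sampled geometry nearest the centroid as a training point. The descriptor vectors and function names are hypothetical.

```python
import math
import random

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]

    def assign(cents):
        # Index of the nearest centroid for every point.
        return [min(range(k), key=lambda c: math.dist(p, cents[c]))
                for p in points]

    for _ in range(n_iter):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members)
                                     for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)

def select_training_indices(points, k, seed=0):
    """Return the index of the sample closest to each cluster centroid."""
    centroids, labels = kmeans(points, k, seed=seed)
    chosen = []
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.append(min(members,
                              key=lambda i: math.dist(points[i], centroid)))
    return sorted(chosen)

# Hypothetical 2D descriptors for six sampled geometries of one simplex,
# forming two well-separated groups.
DESCRIPTORS = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
               (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

if __name__ == "__main__":
    print(select_training_indices(DESCRIPTORS, k=2))
```

Only the selected geometries would then need high-level reference energies for training a given simplex's network; the remaining geometries would be evaluated by the trained model.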