Abstract:
© 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
The Generative Topographic Mapping (GTM) approach was successfully used to visualize, analyze and model the equilibrium constants (KT) of tautomeric transformations as a function of both structure and experimental conditions. The modeling set contained 695 entries corresponding to 350 unique transformations of 10 tautomeric types, for which KT values were measured in different solvents and at different temperatures. Two types of GTM-based classification models were trained: first, a “structural” approach focused on separating tautomeric classes irrespective of reaction conditions, then a “general” approach accounting for both structure and conditions. In both cases, the cross-validated Balanced Accuracy was close to 1, and the clusters grouping equilibria of particular classes were well separated in the 2-dimensional GTM latent space. Data points corresponding to similar transformations measured under different experimental conditions are well separated on the maps. Additionally, the predictive performance of GTM-driven regression models was found to depend on how the local fragment descriptors involving specially marked atoms (proton donors or acceptors) were selected. The use of local descriptors significantly improves model performance in 5-fold cross-validation: RMSE = 0.63 logKT units with local descriptors versus 0.82 logKT units without them. The same trend was observed in Support Vector Regression (SVR) calculations performed for comparison.
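For readers unfamiliar with the two figures of merit quoted above, the following minimal sketch shows how Balanced Accuracy (the mean of per-class recall) and RMSE are conventionally computed. It is an illustration only, not the authors' code: the function names and the toy labels and values are hypothetical, and no data from the study are reproduced.

```python
import math
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)    # correctly predicted entries per true class
    totals = defaultdict(int)  # number of entries per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of y (e.g. logKT units)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data, chosen only to exercise the functions.
toy_classes_true = ["keto-enol", "keto-enol", "keto-enol", "imine-enamine"]
toy_classes_pred = ["keto-enol", "keto-enol", "imine-enamine", "imine-enamine"]
toy_logkt_true = [1.0, 2.0]
toy_logkt_pred = [1.0, 4.0]

ba = balanced_accuracy(toy_classes_true, toy_classes_pred)
err = rmse(toy_logkt_true, toy_logkt_pred)
print(f"Balanced Accuracy = {ba:.3f}, RMSE = {err:.3f}")
```

Because Balanced Accuracy averages recall over classes rather than over entries, a value close to 1 indicates that small tautomeric classes are recovered as well as large ones.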