Abstract:
© 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This study focused on the interpretation of QSAR models. The goal was to develop a workflow for identifying molecular fragments whose importance for the modelled property depends on their molecular context. Using a previously established approach, structural and physicochemical interpretation of QSAR models (SPCI), fragment contributions were calculated and their relative influence on the compounds’ properties was characterised. The distributions of these contributions were analysed by Gaussian mixture modelling to identify groups of compounds (clusters) that contain the same fragment but in which that fragment contributes substantially differently to the property studied. SMARTSminer was used to detect patterns that discriminate these groups from each other; where it did not yield a useful pattern, the groups were inspected visually. The approach was applied to the toxicity of 1984 compounds to Tetrahymena pyriformis, measured as 40-hour growth inhibition. The clustering technique correctly identified known toxicophoric patterns: it detected groups of compounds in which a fragment's specific molecular context makes it contribute substantially more to toxicity. The results show that QSAR model interpretation can retrieve reasonable patterns even from data sets containing compounds with different mechanisms of action, which is difficult to achieve with conventional pattern or data mining approaches.
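
The Gaussian mixture step described in the abstract can be illustrated with a minimal sketch. It is not the SPCI implementation: the function name `fit_two_gaussians`, the fixed choice of two components, the initialisation and the contribution values are all assumptions made for illustration. The idea is only that the contributions of one fragment across many compounds are treated as a one-dimensional sample, and a mixture fit separates compounds in which the fragment contributes little from those in which it contributes a lot.

```python
import math


def fit_two_gaussians(values, n_iter=100, var_floor=1e-6):
    """Fit a two-component 1D Gaussian mixture by expectation-maximisation.

    Illustrative only: two components are assumed, whereas a real workflow
    would normally choose the number of components from the data.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]  # start the two components at the extremes
    init_var = max(((hi - lo) / 4) ** 2, var_floor)
    variances = [init_var, init_var]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each contribution
        resp = []
        for x in values:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weight, mean and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, var_floor)
    # Assign each compound to the component with the larger responsibility
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, variances, weights, labels


# Hypothetical contributions of one fragment in eleven compounds (made-up numbers)
contributions = [0.05, 0.10, 0.12, 0.08, 0.15, 0.11, 0.95, 1.05, 1.10, 0.98, 1.02]

means, variances, weights, labels = fit_two_gaussians(contributions)
for value, label in zip(contributions, labels):
    print(f"contribution={value:.2f} cluster={label}")
```

In such a sketch, the compounds assigned to the high-contribution cluster would be the ones passed on to pattern detection (SMARTSminer in the study) to look for the molecular context that distinguishes them from the low-contribution cluster.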