Cytochrome P450 aromatase (CYP19A1) plays a key role in the development of estrogen dependent breast cancer, and aromatase inhibitors have been at the front line of treatment for the past three decades. The development of potent, selective and safer inhibitors is ongoing with in silico screening methods playing a more prominent role in the search for promising lead compounds in bioactivity-relevant chemical space. Here we present a set of comprehensive binding affinity prediction models for CYP19A1 using our automated Linear Interaction Energy (LIE) based workflow on a set of 132 putative and structurally diverse aromatase inhibitors obtained from a typical industrial screening study. We extended the workflow with machine learning methods to automatically cluster training and test compounds in order to maximize the number of explained compounds in one or more predictive LIE models. The method uses protein-ligand interaction profiles obtained from Molecular Dynamics (MD) trajectories to help model search and define the applicability domain of the resolved models. Our method was successful in accounting for 86% of the data set in 3 robust models that show high correlation between calculated and observed values for ligand-binding free energies (RMSE < 2.5 kJ mol-1), with good cross-validation statistics.