Machine learning provides predictive analysis into silver nanoparticle protein corona formation from physicochemical properties

Environ Sci Nano. 2018 Jan 1;5(1):64-71. doi: 10.1039/C7EN00466D. Epub 2017 Nov 1.

Abstract

Proteins encountered in biological and environmental systems bind to engineered nanomaterials (ENMs) to form a protein corona (PC) that alters the surface chemistry, reactivity, and fate of the ENMs. Complexities such as the diversity of the PC and its variation with ENM properties and reaction conditions make the PC population difficult to predict. Here, we support the development of predictive models for PC populations by relating biophysicochemical characteristics of proteins, ENMs, and solution conditions to PC formation using random forest classification. The resulting model predicts the population of PC proteins in Ag ENM systems of various ENM sizes and surface coatings. With an area under the receiver operating characteristic curve of 0.83 and an F1-score of 0.81, the model, constructed from experimental data, shows strong performance. The weighted contribution of each variable provides recommendations for mechanistic models based upon protein enrichment classification results. Protein biophysical properties such as isoelectric point (pI) and weight are weighted heavily, yet ENM size, surface charge, and solution ionic strength also proved essential to an accurate model. The model can be readily modified and applied to other ENM PC populations. The model presented here represents the first step toward robust predictions of PC fingerprints.
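
For readers unfamiliar with the two performance figures quoted above, the following is a minimal sketch of how an F1-score and an area under the receiver operating characteristic curve (AUROC) can be computed for a binary enriched/not-enriched classifier. It assumes the standard binary definitions of both metrics; the abstract does not state how the authors computed them. The function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1: harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = protein enriched in the corona, 0 = not enriched.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
# Hypothetical classifier scores (e.g. the fraction of trees voting "enriched").
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(f"F1 = {f1_score(labels, predicted):.2f}")
print(f"AUROC = {auroc(labels, scores):.2f}")
```

The F1-score depends on the chosen decision threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.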