Machine learning for the New York City power grid

Cynthia Rudin; David Waltz; Roger N Anderson; Albert Boulanger; Ansaf Salleb-Aouissi; Maggie Chow; Haimonti Dutta; Philip N Gross; Bert Huang; Steve Ierome; Delfina F Isaac; Arthur Kressner; Rebecca J Passonneau; Axinia Radeva; Leon Wu

doi:10.1109/TPAMI.2011.108

IEEE Trans Pattern Anal Mach Intell. 2012 Feb;34(2):328-45. doi: 10.1109/TPAMI.2011.108.

Authors

Cynthia Rudin¹, David Waltz, Roger N Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Philip N Gross, Bert Huang, Steve Ierome, Delfina F Isaac, Arthur Kressner, Rebecca J Passonneau, Axinia Radeva, Leon Wu

Affiliation

¹ MIT Sloan School of Management, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. rudin@mit.edu

PMID: 21576741

Abstract

Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce 1) feeder failure rankings, 2) cable, joint, terminator, and transformer rankings, 3) feeder Mean Time Between Failure (MTBF) estimates, and 4) manhole event vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time; incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF); and includes an evaluation of results via cross-validation and blind test. Beyond the ranked lists and MTBF estimates, business management interfaces allow the prediction capability to be integrated directly into corporate planning and decision support. Such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City’s electrical grid.
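To make the ideas of an MTBF estimate, a ranked list, and a blind test concrete, here is a minimal Python sketch. It is not the method described in the paper, which uses supervised ranking and other models on real utility data. It assumes the simplest empirical estimate (observed time divided by failure count) and scores the resulting ranking on a held-out period with a pairwise AUC. The feeder names, failure days, and function names are all hypothetical and chosen only for illustration.

```python
def empirical_mtbf(failures_by_feeder, observation_days):
    """Naive MTBF per feeder: observed days / number of failures.

    Feeders with no recorded failures get infinity (assumption made
    here for simplicity, not a recommendation from the paper).
    """
    return {
        feeder: (observation_days / len(days) if days else float("inf"))
        for feeder, days in failures_by_feeder.items()
    }


def rank_by_risk(mtbf):
    """Order feeders from most to least vulnerable (lowest MTBF first)."""
    return sorted(mtbf, key=lambda feeder: mtbf[feeder])


def pairwise_auc(ranking, failed_in_test):
    """Fraction of (failed, not-failed) feeder pairs in which the feeder
    that failed during the blind-test period was ranked as riskier."""
    position = {feeder: i for i, feeder in enumerate(ranking)}
    failed = [f for f in ranking if f in failed_in_test]
    healthy = [f for f in ranking if f not in failed_in_test]
    if not failed or not healthy:
        raise ValueError("need at least one failed and one healthy feeder")
    wins = sum(1 for a in failed for b in healthy if position[a] < position[b])
    return wins / (len(failed) * len(healthy))


# Hypothetical training period: failure days within a 365-day window.
training_failures = {
    "A": [10, 120, 300],
    "B": [200],
    "C": [],
    "D": [50, 250],
}
mtbf = empirical_mtbf(training_failures, observation_days=365)
ranking = rank_by_risk(mtbf)

# Hypothetical blind-test period: feeders that failed after training ended.
test_failures = {"A", "B"}
auc = pairwise_auc(ranking, test_failures)

print(ranking)  # ['A', 'D', 'B', 'C']
print(auc)      # 0.75
```

The blind-test split is by time: the ranking is built only from the earlier period and scored only against failures in the later one, which mirrors how such a list would be used in practice.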