A three-year dataset supporting research on building energy management and occupancy analytics

Sci Data. 2022 Apr 5;9(1):156. doi: 10.1038/s41597-022-01257-x.


This paper presents the curation of a monitored dataset from an office building constructed in 2015 in Berkeley, California. The dataset includes whole-building and end-use energy consumption, HVAC system operating conditions, indoor and outdoor environmental parameters, and occupant counts. The data were collected over a period of three years from more than 300 sensors and meters on two office floors (each 2,325 m²) of the building. A three-step data curation strategy is applied to transform the raw data into research-grade data: (1) cleaning the raw data to detect and adjust outlier values and fill data gaps; (2) creating the metadata model of the building systems and data points using the Brick schema; and (3) representing the metadata of the dataset using a semantic JSON schema. This dataset can be used in various applications, including building energy benchmarking, load shape analysis, energy prediction, occupancy prediction and analytics, and HVAC controls, to improve the understanding and efficiency of building operations for reducing energy use, energy costs, and carbon emissions.
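
As an illustration of the first curation step (detecting outliers and filling data gaps), the following is a minimal sketch only. The abstract does not state which detection rule or gap-filling method the authors used, so the median/MAD threshold, the linear interpolation, the function names `flag_outliers` and `fill_gaps`, the parameters `k` and `max_gap`, and the sample temperature readings are all assumptions made for this example.

```python
from statistics import median


def flag_outliers(values, k=3.0):
    """Replace values far from the median with None (hypothetical rule).

    Uses the median absolute deviation (MAD); this is an assumed choice,
    not necessarily the rule applied to the published dataset.
    """
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)
    if mad == 0:
        return list(values)  # no spread to judge against; leave data as-is
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    limit = k * 1.4826 * mad
    return [v if v is not None and abs(v - med) <= limit else None
            for v in values]


def fill_gaps(values, max_gap=3):
    """Linearly interpolate interior runs of None up to max_gap samples long."""
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first valid sample after the gap
        gap_len = end - start
        # Leave leading/trailing gaps and long gaps untouched
        if start == 0 or end == n or gap_len > max_gap:
            continue
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap_len + 1)
        for j in range(gap_len):
            filled[start + j] = left + step * (j + 1)
    return filled


# Made-up zone temperature readings (degrees C): one missing, one spurious
readings = [21.0, 21.5, None, 22.5, 500.0, 23.5, 24.0]
cleaned = fill_gaps(flag_outliers(readings))
print(cleaned)  # [21.0, 21.5, 22.0, 22.5, 23.0, 23.5, 24.0]
```

Flagging outliers as missing first and then filling all gaps in one pass keeps the two concerns separate. The `max_gap` limit reflects a common design choice: long outages are left empty rather than replaced with interpolated values that could mislead later analysis.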