Computer Science Colloquia
Tuesday, September 22, 2015
Advisor: Kamin Whitehouse
Attending Faculty: Marty Humphrey (Committee Chair), Yanjun Qi, and Alfred Weaver.
9:30 AM, Rice Hall, Rm. 404
PhD Qualifying Exam Presentation
The Building Adapter: Towards Quickly Applying Building Analytics at Scale
Commercial and industrial buildings account for a considerable fraction of all the energy consumed in the U.S., and reducing this energy consumption has become a national grand challenge. Taking advantage of the large deployments of sensors in modern commercial buildings, many organizations are applying data analytics to the thousands of sensing and control points to detect wasteful, incorrect, and inefficient operations and thereby save energy. Scaling this approach is challenging, however, because the metadata about these sensing and control points is inconsistent between buildings, or even missing altogether. As a result, an analytics engine cannot be applied to a new building without first addressing the issue of mapping: creating a match between the sensor streams and the inputs of the analytics engine. The mapping process requires significant integration effort and, anecdotally, can take a week or longer for each commercial building. Thus, metadata mapping is a major obstacle to scaling up building analytics.
In this work, we demonstrate first steps towards an automatic metadata mapping solution that requires minimal human intervention. We develop two techniques, fully automated mapping and semi-automated mapping, to differentiate sensors in buildings by type, e.g., temperature vs. humidity. The first technique performs the mapping without any manual intervention. It builds on and improves techniques from transfer learning: it learns a set of statistical classifiers from the metadata of a labeled building and adaptively applies those models to another, unlabeled building, even if the two buildings have very different metadata conventions. The second technique involves iterative manual labeling: a clustering-based active learning algorithm exploits the cluster structure of the data to acquire human labels for informative instances, and then propagates those labels to nearby unlabeled neighbors to accelerate the learning process.
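The second technique can be illustrated with a minimal sketch. This is not the presented algorithm: the abstract does not specify the clustering method, the features, or the rule for choosing informative instances, so everything below is an assumption made for illustration. The sketch uses plain k-means, a hypothetical two-number feature per sensor stream (mean and standard deviation of its readings), queries a simulated human (the hypothetical `oracle` callback) for the stream nearest each cluster centroid, and copies that label to every other stream in the cluster.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (evenly spaced input points)."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assign


def cluster_and_propagate(points, k, oracle):
    """Ask the oracle for one label per cluster and propagate it to the cluster.

    `oracle(i)` stands in for a human labeling stream i (an assumption).
    Returns the propagated labels and the number of human queries used.
    """
    centroids, assign = kmeans(points, k)
    labels = [None] * len(points)
    queries = 0
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if not members:
            continue
        # Treat the member closest to the centroid as the informative instance.
        rep = min(members, key=lambda i: math.dist(points[i], centroids[c]))
        label = oracle(rep)
        queries += 1
        for i in members:
            labels[i] = label
    return labels, queries


# Toy data: (mean, standard deviation) of readings per stream, values invented.
points = [
    (70.0, 2.0), (71.0, 2.5), (69.5, 1.8),   # temperature-like streams
    (40.0, 5.0), (42.0, 5.5), (39.0, 4.5),   # humidity-like streams
]
truth = ["temperature"] * 3 + ["humidity"] * 3

labels, queries = cluster_and_propagate(points, k=2, oracle=lambda i: truth[i])
print(queries, labels)
```

In this toy case two human labels are enough to label all six streams. A real deployment would presumably iterate, refining clusters whose members disagree, which this sketch omits.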
We perform a comprehensive study on a data set covering over 20 different sensor types and 2,500 sensor streams in three commercial buildings on two campuses. The transfer-learning-based solution can automatically label at least 36% of the points with more than 85% accuracy, while the best baseline achieves only 63% labeling accuracy on average. Our active-learning-based technique achieves more than 92% accuracy for type classification with far fewer labeled instances than the baselines. As a proof of concept, we also demonstrate a typical analytics application enabled by the normalized metadata. These techniques represent a first step towards technology that would enable any new building analytics engine to scale quickly to the tens of millions of commercial buildings across the globe, with minimal need for manual mapping on a per-building basis.