Probabilistic Modelling
We will develop methods to convert the stream of sensor data into a description of the household's energy-using behaviours that can be used for feedback. The available data will include time series of electricity and gas demand; room temperature, humidity and light level; and the temperature of key hot-water pipes, together with additional data such as weather and number of occupants. The information sent to the Multimodal Feedback component will include standard household consumption statistics, such as energy used (kWh), carbon footprint (kg CO2) and imputed cost (£), aggregated over the preceding day, week and month. The key output, however, will be inferences about behaviour, including:
- Thermostat setting and, where thermostatic radiator valves are fitted, the setting for each room, with changes inferred using change-point detection.
- Whether each room is in use and whether its occupants are awake or asleep, inferred from survey and sensor data (e.g., lights tend to be off when people are sleeping; humidity and temperature tend to rise slightly when a room is occupied).
- Hot-water behaviours such as washing-up, showers and cooking, which can be inferred from the hot-water pipe sensors and the room temperature and humidity sensors.
- Electricity-use behaviours, such as the use of major appliances (refrigerator, dishwasher, etc.), disaggregated from the load profile.
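To illustrate how a change in thermostat setting might be inferred, the sketch below applies a two-sided CUSUM test to a series of setpoint estimates. This is a minimal illustration, not the method we commit to: CUSUM stands in for whichever change-point model is finally chosen, and the function name, the parameters (`drift`, `threshold`, `warmup`) and the input series are all assumptions made for the example.

```python
from statistics import mean


def cusum_change_points(values, drift=0.25, threshold=3.0, warmup=5):
    """Return indices at which the level of `values` appears to have shifted.

    Two-sided CUSUM: accumulate deviations from a reference level and
    report a change when either cumulative sum exceeds `threshold`.
    The reference level is re-estimated after each detected change.
    """
    change_points = []
    start = 0                      # first index of the current segment
    ref = mean(values[:warmup])    # reference level for the current segment
    pos = neg = 0.0                # cumulative upward / downward evidence
    for i, x in enumerate(values):
        if i < start + warmup:
            continue               # these samples were used to estimate `ref`
        pos = max(0.0, pos + (x - ref - drift))
        neg = max(0.0, neg + (ref - x - drift))
        if pos > threshold or neg > threshold:
            change_points.append(i)
            start = i
            ref = mean(values[i:i + warmup])
            pos = neg = 0.0
    return change_points


# Hypothetical setpoint estimates (deg C): 20 for 30 intervals, then 22.
setpoints = [20.0] * 30 + [22.0] * 30
print(cusum_change_points(setpoints))  # [31]
```

The reported index is the time of detection, which lags the true change (index 30 here) by the number of samples needed to accumulate enough evidence. A lower `threshold` reduces this lag at the cost of more false alarms.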
Intuitively, the models will segment the sensor stream at points where it changes sharply, for example when gas demand and bathroom humidity both increase. Crucially, this segmentation can be done in a minimally supervised fashion: we do not expect that training the models will require us to collect a detailed log of when each behaviour occurred. Instead, we will develop unsupervised models that simultaneously segment the stream and cluster segments with similar profiles. We will then manually label the resulting clusters (e.g., shower, refrigerator) by combining our own expert knowledge with the survey data about the households.
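The idea can be sketched on a single channel as follows. Note that this is a deliberate simplification: the proposed models would segment and cluster jointly, whereas the sketch does so in two separate stages (split at sharp jumps, then run k-means on each segment's mean level). The function names, the `jump` threshold and the synthetic power readings are assumptions made purely for illustration.

```python
from statistics import mean


def segment_at_jumps(signal, jump=0.5):
    """Split `signal` into (start, end) index pairs wherever consecutive
    readings differ by more than `jump` (end is exclusive)."""
    bounds = [0]
    for i in range(1, len(signal)):
        if abs(signal[i] - signal[i - 1]) > jump:
            bounds.append(i)
    bounds.append(len(signal))
    return list(zip(bounds[:-1], bounds[1:]))


def kmeans_1d(points, k, iters=20):
    """Plain k-means on scalars, with centres initialised evenly
    between the smallest and largest point."""
    lo, hi = min(points), max(points)
    centres = [lo + (hi - lo) * j / max(k - 1, 1) for j in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for each point.
        labels = [min(range(k), key=lambda c: abs(p - centres[c])) for p in points]
        # Update step: move each non-empty centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = mean(members)
    return labels, centres


# Synthetic electricity demand (kW): a low baseline with three bursts of use.
demand = ([0.2] * 10 + [3.0] * 5 + [0.2] * 10 + [8.5] * 6
          + [0.2] * 10 + [3.1] * 4 + [0.2] * 5)
segments = segment_at_jumps(demand)
levels = [mean(demand[a:b]) for a, b in segments]
labels, centres = kmeans_1d(levels, k=3)
print(labels)  # [0, 1, 0, 2, 0, 1, 0]
```

In this toy run the two mid-level bursts fall into the same cluster although they occur at different times. Attaching a human-readable name to each cluster index is the manual labelling step and is not automated here.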
This component will address two additional practical challenges. The first is the need to keep up with the continuing data stream; here we will use approximate inference techniques suited to streaming data, such as online variational inference and sequential Monte Carlo. The second is handling anomalies in the sensor data (e.g., due to sensor failure) and long-term changes (e.g., new household occupants). We will deal with anomalies using a combination of heuristics (e.g., ignoring sensors that stop reporting) and probabilistic approaches (e.g., flagging individual sensor readings that are extremely unlikely under our model), and with long-term changes using change-point detection combined with information from the six-monthly surveys.
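The two kinds of anomaly handling can be sketched as below. A running Gaussian, updated one reading at a time with Welford's algorithm, stands in for the full probabilistic model. It is neither online variational inference nor sequential Monte Carlo, only the simplest streaming model that makes the "extremely unlikely reading" test concrete. All names and thresholds (`max_gap_s`, `z_max`, `min_samples`, `min_std`) are assumptions for illustration.

```python
import math


class OnlineGaussian:
    """Running mean and variance via Welford's algorithm (O(1) memory)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else float("inf")

    def is_unlikely(self, x, z_max=4.0, min_samples=10, min_std=1e-6):
        """Probabilistic check: is `x` more than `z_max` standard deviations
        from the running mean? Always False until enough data has been seen."""
        if self.n < min_samples:
            return False
        return abs(x - self.mean) / max(self.std(), min_std) > z_max


def is_stale(last_seen_s, now_s, max_gap_s=900.0):
    """Heuristic check: has the sensor been silent for longer than `max_gap_s`?"""
    return (now_s - last_seen_s) > max_gap_s


def flag_anomalies(values):
    """Return indices of unlikely readings, processing `values` in stream order.
    Flagged readings are not used to update the model."""
    model = OnlineGaussian()
    flagged = []
    for i, x in enumerate(values):
        if model.is_unlikely(x):
            flagged.append(i)
        else:
            model.update(x)
    return flagged


# Hypothetical room temperatures (deg C) with one spurious spike at index 30.
temps = [19.9, 20.0, 20.1] * 10 + [35.0, 20.0]
print(flag_anomalies(temps))  # [30]
```

Excluding flagged readings from the update keeps a single faulty value from inflating the variance. The drawback is that a genuine long-term shift would be flagged indefinitely, which is one reason long-term changes are handled separately by change-point detection.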
