[Some of] my research explained
Succinctly and in plain English for the impatient layperson...
Learning Spatio-Temporal Aggregations for Large-Scale Capacity Expansion Problems
Long-term planning for energy systems (say, 30 years from now) involves solving optimization models. The outputs of these planning models are investment decisions (e.g., construct a natural gas-fired power plant here, invest in solar energy generation there) that
minimize the combined cost of upfront investment and hour-to-hour operations,
meet projected energy demands, and
respect decarbonization goals that place limits on carbon emissions.
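For readers who prefer symbols, the three requirements above can be written as a generic optimization problem. This is only a simplified sketch: the symbols ($x$ for investment decisions, $y_t$ for operations in hour $t$, $c$ and $q_t$ for costs, $d_t$ for demand, $e$ for emissions per unit of operation, $E$ for the emissions limit, and the matrices $A_t$ and $B$) are illustrative assumptions rather than the notation of the actual model, and the capacity constraint is added here only to link operations to investments.

```latex
\begin{aligned}
\min_{x,\; y_1, \dots, y_T} \quad & c^\top x + \sum_{t=1}^{T} q_t^\top y_t
  && \text{(investment plus operating cost)} \\
\text{subject to} \quad & A_t\, y_t \ge d_t, \quad t = 1, \dots, T
  && \text{(meet projected demand)} \\
& y_t \le B\, x, \quad t = 1, \dots, T
  && \text{(operate within built capacity)} \\
& \sum_{t=1}^{T} e^\top y_t \le E
  && \text{(emissions limit)} \\
& x \ge 0, \quad y_t \ge 0 .
\end{aligned}
```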
Ideally, we would solve these optimization models at a high resolution in both space (at the county level) and time (hour by hour for a full year). However, this would take days or weeks of non-stop computing. One workaround is to aggregate the problem both spatially, by creating a smaller network to represent the full network, and temporally, by including a subset of time periods rather than every hour of a full year. Solving this aggregated problem takes only a few minutes, and the hope is that it yields decisions similar to those from the original problem. However, an automated method for aggregating the problem needs to discover and exploit spatial and temporal patterns in the high-dimensional data that enters the optimization problem.
Solving the aggregated optimization problem yields generation decisions at a low spatial resolution.
We can disaggregate the generation decisions to retrieve decisions at the original spatial resolution.
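To make the aggregate-then-disaggregate idea concrete, here is a minimal Python sketch. It assumes a hypothetical assignment of network nodes to zones, and it splits each zone-level decision back across the zone's nodes in proportion to their total demand. The proportional rule, the function names, and the numbers are all assumptions for illustration, not the method used in this work.

```python
import numpy as np

def aggregate_demand(node_demand, zone_of_node, n_zones):
    """Sum hourly demand over the nodes assigned to each zone.

    node_demand:  array of shape (n_nodes, n_hours)
    zone_of_node: integer array of shape (n_nodes,), the zone index of each node
    """
    zone_demand = np.zeros((n_zones, node_demand.shape[1]))
    # Unbuffered add, so several nodes in the same zone accumulate correctly.
    np.add.at(zone_demand, zone_of_node, node_demand)
    return zone_demand

def disaggregate(zone_decision, node_demand, zone_of_node):
    """Split each zone-level decision across its nodes by demand share.

    Assumes every zone has strictly positive total demand (hypothetical rule).
    """
    node_total = node_demand.sum(axis=1)
    zone_total = np.bincount(
        zone_of_node, weights=node_total, minlength=len(zone_decision)
    )
    share = node_total / zone_total[zone_of_node]
    return zone_decision[zone_of_node] * share

# Toy example: three nodes, two hours, two zones (made-up numbers).
node_demand = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
zone_of_node = np.array([0, 0, 1])
zone_demand = aggregate_demand(node_demand, zone_of_node, n_zones=2)
node_decision = disaggregate(np.array([100.0, 50.0]), node_demand, zone_of_node)
```

In this toy setup the first two nodes share one zone, so the zone's decision is divided between them according to how much demand each contributes, while the third node inherits its zone's decision in full.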
In recent years, machine learning researchers have developed graph convolutional neural networks, which can recognize patterns in high-dimensional data on network structures. These have been applied with resounding success in molecular biology and traffic modeling, among other fields. In our work, we apply these graph convolutional modeling methods to automatically identify patterns in the high-dimensional data that enters our optimization model (i.e., time- and location-specific power demands, natural gas demands, and renewable generation capacities). This allows us to compress the high-dimensional data while preserving the most important spatial and temporal patterns, which gives us a natural choice of aggregation for the problem. We compare our approach to benchmark aggregation methods and find that the investment decisions produced by our aggregation methods meet energy demands and decarbonization goals for the New England power and natural gas network at a lower cost!
Interpretable Machine Learning Models for Modal Split Prediction in Transportation Systems
Traffic system operators and public transit authorities benefit from accurate short-term demand forecasts for different modes of transportation. These predictions allow them to anticipate changes in traffic congestion or public transit ridership and respond accordingly. However, accurately predicting the modal split, or how the total travel demand at any time will be distributed across multiple modes of transportation (e.g., driving, public transit, etc.), is difficult. One reason is that the modal split reacts differently to changes in travel times on different parts of the road network. For instance, if there is an accident on a freeway, some fraction of the travelers in the system will react by deciding to take public transit to avoid traffic. However, a different fraction will change their travel mode if an accident occurs on a different part of the same freeway. Moreover, transportation authorities do not typically have access to historical data at the individual level. Instead, they can only observe aggregate behavior, using aggregate transit ridership data and sensor readings on the freeway network.
Increased congestion on the freeway network yields less-than-typical demand for driving and causes a rise in demand for public transit.
To address this challenge, we estimate a predictive model for the fraction of total travelers choosing one mode of transportation over another, using a high-dimensional dataset of freeway travel times. This model takes inspiration from classical discrete choice modeling, a longstanding field that borrows methods from statistics and economics to model decision-making behavior. It also leverages more recent developments in machine learning and high-dimensional statistics to limit overfitting, an adverse effect that occurs when data is high-dimensional but relatively limited in availability. Our models not only achieve high predictive accuracy, but also yield interesting behavioral interpretations that give us some insight into how averse travelers are to traffic congestion across different parts of the freeway network!
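As a rough illustration of this modeling idea, the Python sketch below fits a binary logit model for the share of travelers choosing transit, with an L1 penalty that pushes the weights of uninformative freeway segments to exactly zero. The two-mode setup, the proximal gradient fitting routine, the synthetic data, and all names are assumptions made for illustration; the actual model may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(w, t):
    """Proximal step for the L1 penalty: shrink toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_sparse_logit(X, share, l1=0.01, lr=0.1, n_iter=2000):
    """Fit share ~ sigmoid(b + X @ w) with an L1 penalty on w.

    X:     (n_periods, n_segments) standardized freeway travel times
    share: (n_periods,) observed fraction of travelers choosing transit
    """
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_iter):
        # Gradient of the cross-entropy loss with respect to the linear predictor.
        resid = sigmoid(b + X @ w) - share
        # Gradient step on w followed by soft-thresholding (proximal gradient).
        w = soft_threshold(w - lr * (X.T @ resid) / n, lr * l1)
        # The intercept is not penalized.
        b -= lr * resid.mean()
    return w, b

# Synthetic example: only segments 0 and 3 actually influence the transit share.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
true_w = np.zeros(20)
true_w[0], true_w[3] = 1.0, -0.5
share = sigmoid(0.2 + X @ true_w)
w_hat, b_hat = fit_sparse_logit(X, share)
```

The fitted weights can be read behaviorally: a large positive weight on a segment means that slower travel there shifts more travelers toward transit, and a weight of zero means the model found no evidence that the segment matters.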