Workshop

Time Series Workshop

Yian Ma, Ehi Nosakhare, Yuyang Wang, Scott Yang, Rose Yu

Abstract:

Time series is one of the fastest growing and richest types of data. In a variety of domains including dynamical systems, healthcare, climate science and economics, there have been increasing amounts of complex dynamic data due to a shift away from parsimonious, infrequent measurements to nearly continuous real-time monitoring and recording. This burgeoning amount of new data calls for novel theoretical and algorithmic tools and insights.

The goals of our workshop are to: (1) highlight the fundamental challenges that underpin learning from time series data (e.g. covariate shift, causal inference, uncertainty quantification), (2) discuss recent developments in theory and algorithms for tackling these problems, and (3) explore new frontiers in time series analysis and their connections with emerging fields such as causal discovery and machine learning for science. In light of the recent COVID-19 outbreak, we also plan to have a special emphasis on non-stationary dynamics, causal inference, and their applications to public health at our workshop.

Time series modeling has a long tradition of inviting novel approaches from many disciplines including statistics, dynamical systems, and the physical sciences. This has led to broad impact and a diverse range of applications, making it an ideal topic for the rapid dissemination of new ideas that takes place at ICML. We hope that the diversity and expertise of our speakers and attendees will help uncover new approaches and break new ground for these challenging and important settings. Our previous workshops have been very popular at ICML, and we envision that this workshop will continue to appeal to the ICML audience and stimulate many interdisciplinary discussions.

Gather.Town:
Morning Poster: https://eventhosts.gather.town/app/4H5nUSQOWXc9pC43/tsw-poster-room-1
Afternoon Poster: https://eventhosts.gather.town/app/zIejrcEqf10T4UKT/tsw-poster-room-2

Schedule

Sat 8:45 a.m. - 9:00 a.m.
Opening Remarks (Introduction)
Sat 9:00 a.m. - 9:45 a.m.
Mihaela van der Schaar: Time-series in healthcare: challenges and solutions (Invited Talk)
Mihaela van der Schaar
Sat 9:45 a.m. - 10:30 a.m.
  

Bayesian multiscale models exploit variants of the “decouple/recouple” concept to enable advances in forecasting and monitoring of increasingly large-scale time series. Recent and current applications include financial and commercial forecasting, as well as dynamic network studies. I overview some recent developments via examples from applications in large-scale consumer demand and sales forecasting with intersecting marketing-related goals. Two coupled applied settings involve (a) models for forecasting daily sales of each of many items in every supermarket of a large national chain, and (b) models for understanding and forecasting customer/household-specific purchasing behavior to inform decisions about personalized pricing and promotions on a continuing basis. The multiscale concept is applied in each setting to define new classes of hierarchical Bayesian state-space models customized to the application. In each area, micro-level, individual time series are represented via customized model forms that also involve aggregate-level factors, the latter being modelled and forecast separately. The implied conditional decoupling of many time series enables computational scalability, while the effects of shared multiscale factors define recoupling to appropriately reflect cross-series dependencies. The ideas are of course relevant to other applied settings involving large-scale, hierarchically structured time series.

Sat 10:30 a.m. - 10:45 a.m.
Morning Coffee Break (Break)
Sat 10:45 a.m. - 11:00 a.m.
Contributed Talk: JKOnet: Proximal Optimal Transport Modeling of Population Dynamics (Contributed Talk)   
Charlotte Bunne
Sat 11:00 a.m. - 11:40 a.m.
  

Quantification of causal influence is a non-trivial conceptual problem. Well-known concepts like Granger causality and transfer entropy are arguably correct for detecting the presence of causal influence (subject to assumptions like causal sufficiency and positive probability density), but following [2] I argue that taking them as a measure of the strength of causal influence is conceptually flawed. To discuss this, I consider the more general question of quantifying the strength of an edge (or a set of edges) in a causal DAG. I describe a few postulates that we [1] would expect a measure of causal influence to satisfy and describe the information-theoretic causal strength that we proposed in [1]. References: [1] D. Janzing, D. Balduzzi, M. Grosse-Wentrup, B. Schölkopf: Quantifying causal influences. Annals of Statistics, 2013. [2] N. Ay and D. Polani: Information flow in causal networks, 2008.

Dominik Janzing
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D0 in Virtual World ]

Changepoint detection methods aim to find locations where a time series shows abrupt changes in properties, such as level and trend, which persist with time. Traditional parametric approaches assume specific generative models for each segment of the time series, but often, the complexities of real time series data are hard to capture with such models. To address these issues, in this paper, we propose VAE-CP, which uses a variational autoencoder with self-supervised loss functions to learn informative latent representations of time series segments. We use traditional hypothesis-test-based and Bayesian changepoint methods in this latent space of normally distributed latent variables, thus combining the strength of self-supervised representation learning with parametric changepoint modeling. This proposed approach outperforms traditional and previous deep-learning-based changepoint detection methods on synthetic and real datasets containing trend changes.

Sourav Chatterjee
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C6 in Virtual World ]

We derive tight probabilistic bounds on the first hitting time of general classes of contractive nonlinear time series models that can be linked to mean reverting processes. As an application to finance, we translate our results to a pairs trading strategy with probabilistic guarantees on its returns.

Julien Huang
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C5 in Virtual World ]
We consider the framework of non-stationary Online Convex Optimization where a learner seeks to control its \emph{dynamic regret} against an \emph{arbitrary} sequence of comparators. When the loss functions are strongly convex or exp-concave, we demonstrate that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret in terms of \emph{path variation} $V_T$ \emph{of the comparator sequence}. Specifically, we show that SA algorithms enjoy $\tilde O(\sqrt{TV_T} \vee \log T)$ and $\tilde O(\sqrt{dTV_T} \vee d\log T)$ dynamic regret for strongly convex and exp-concave losses respectively \emph{without} a priori knowledge of $V_T$, thus answering an open question in \cite{zhang2018dynamic}. The versatility of the principled approach is further demonstrated by the novel results in the setting of learning against bounded linear predictors and online regression with Gaussian kernels.
Dheeraj Baby
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C4 in Virtual World ]

We propose an extended Hawkes process model where the self-effects are of both excitatory and inhibitory type and follow a Gaussian Process. Whereas previous work either relies on a less flexible parameterization of the model, or requires a large amount of data, our formulation allows for both a flexible model and learning when data are scarce. Efficient approximate Bayesian inference is achieved via data augmentation, and we describe a mean-field variational inference approach to learn the model parameters. To demonstrate the flexibility of the model we apply our methodology on data from two different domains and compare it to previously reported results.

Noa Malem-Shinitski
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C3 in Virtual World ]

Here, we propose a general method for probabilistic time series forecasting. We combine an autoregressive recurrent neural network to model temporal dynamics with Implicit Quantile Networks to learn a large class of distributions over a time-series target. When compared to other probabilistic neural forecasting models on real and simulated data, our approach performs favorably both in point-wise prediction accuracy and in estimating the underlying temporal distribution.

Adele Gouttes
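
As an illustration of the quantile-based forecasting idea in the abstract above: quantile networks are commonly trained with the quantile (pinball) loss. The sketch below is a minimal pure-Python version of that loss. The function name `pinball_loss` and its arguments are hypothetical, and whether the authors use exactly this form is an assumption.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at quantile level tau in (0, 1).

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff > 0) is weighted by tau,
        # over-prediction (diff < 0) by (1 - tau).
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)
```

With `tau = 0.5` the loss reduces to half the mean absolute error, so minimizing it targets the median. Other values of `tau` penalize over- and under-prediction asymmetrically.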
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C2 in Virtual World ]

Inspired by the demands of real-time time-series forecasting, we develop and analyze optimistic online learning algorithms under delayed feedback. We present a novel "delay as optimism" analysis that reduces online learning under delay to optimistic online learning. This reduction enables optimal regret bounds for delayed online learning and exposes how side-information or optimistic "hints" can be used to combat the effects of delay. We use these theoretical tools to develop the first optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delay. These algorithms (DORM, DORM+, and AdaHedgeD) are robust and practical choices for real-world time-series forecasting. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models.

Genevieve Flaspohler
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C1 in Virtual World ]

Graph Gaussian Processes (GGPs) provide a data-efficient solution on graph-structured domains. Existing approaches have focused on static structures, which limits the applications of GGPs because many real graph data represent a dynamic structure. To overcome this we propose evolving-Graph Gaussian Processes (e-GGPs). The proposed method is capable of learning the transition function of graph vertices over time with a neighbourhood kernel to model the connectivity and interaction changes between vertices. We assess the performance of our method on time-series regression problems where graphs evolve over time. We demonstrate the benefits of e-GGPs over static graph Gaussian Process approaches.

David Blanco-Mulero
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot C0 in Virtual World ]

We present a novel variational Bayesian approach for time series forecasting following from a state-space representation, named VIKING (Variational BayesIan Variance TracKING). The method is illustrated with the procedure used to win a recent competition on post-COVID electricity load forecasting.

Joseph de Vilmarest
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B6 in Virtual World ]

We present a flexible, scalable, and interpretable framework for automated forecasting of multivariate time-series, building off of the Bayesian Vector Autoregression (BVAR) literature in macroeconometrics. Our algorithm allows for full posterior estimates of hundreds of interaction parameters, with minimal hand-tuning or hyperparameter specification required. The model can be easily extended to account for non-stationary breaks such as the COVID-19 pandemic. In experiments our model outperforms comparably-flexible time-series models at forecasting inflation.

Rishab Guha
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B5 in Virtual World ]

We develop a new framework for training hidden Markov models that balances generative and discriminative goals. Our approach requires likelihood-based or Bayesian learning to meet task-specific prediction quality constraints, preventing model misspecification from leading to poor subsequent predictions. When users specify an appropriate loss function to constrain predictions, our approach can enhance semi-supervised learning when labeled sequences are rare and boost accuracy when data has unbalanced label frequencies. Via automatic differentiation we backpropagate gradients through dynamic programming computation of the marginal likelihood, making training feasible without auxiliary bounds or approximations. Our approach is effective for human activity modeling and healthcare intervention forecasting, delivering accuracies competitive with well-tuned neural networks for fully labeled data, and substantially better for partially labeled data. Simultaneously, our learned generative model illuminates the dynamical states driving predictions.

Gabriel Hope
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B4 in Virtual World ]

Partial observations of continuous time-series dynamics at arbitrary time stamps exist in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only promising at an intuitive level but also has practical benefits, including the ability to generate continuous trajectories and to perform inference on previously unseen time stamps. Despite exciting progress in this area, the existing models still face challenges in terms of their representational power and the quality of their variational approximations. We tackle these challenges with continuous latent process flows (CLPF), a principled architecture decoding continuous latent processes into continuous observable processes using a time-dependent normalizing flow driven by a stochastic differential equation. To optimize our model using maximum likelihood, we propose a novel piecewise construction of a variational posterior process and derive the corresponding variational lower bound using trajectory re-weighting. Our model shows favourable performance on synthetic data simulated from stochastic processes.

Ruizhi Deng
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B3 in Virtual World ]

Realistic synthetic time series data of sufficient length enables practical applications in time series modeling tasks, such as forecasting, but generating such data remains a challenge. In this paper we present PSA-GAN, a generative adversarial network (GAN) that generates long time series samples of high quality using progressive growing of GANs and self-attention. We show that PSA-GAN can be used to reduce the error in two downstream forecasting tasks over baselines that only use real data. We also introduce a Frechet-Inception Distance-like score, Context-FID, assessing the quality of synthetic time series samples. In our downstream tasks, we find that this score is able to predict the best-performing models and could therefore be a useful tool to develop time series GAN models for downstream use.

Paul Jeha, Pedro Mercado
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B2 in Virtual World ]

We use a Neural Ordinary Differential Equation to model and predict the seasonal to interannual variability of the El Niño Southern Oscillation (ENSO). We train our neural network model using partial observations involving only sea surface temperature data. Our approach is computationally inexpensive, reproduces the main seasonal features of ENSO, and exhibits robust prediction skill.

Ludovico Giorgini
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B1 in Virtual World ]

Physical reasoning requires forward prediction: the ability to forecast what will happen next given some initial world state. We study the performance of state-of-the-art forward-prediction models in the complex physical-reasoning tasks of the PHYRE benchmark (Bakhtin et al., 2019). We do so by incorporating models that operate on object or pixel-based representations of the world into simple physical-reasoning agents. We find that forward-prediction models can improve physical-reasoning performance, particularly on complex tasks that involve many objects. However, we also find that these improvements are contingent on the test tasks being small variations of train tasks, and that generalization to completely new task templates is challenging. Surprisingly, we observe that forward predictors with better pixel accuracy do not necessarily lead to better physical-reasoning performance. Nevertheless, our best models set a new state-of-the-art on the PHYRE benchmark. Our code and models will be released online.

Rohit Girdhar
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot B0 in Virtual World ]

Explanation methods applied to sequential models for multivariate time series prediction are receiving more attention in machine learning literature. While current methods perform well at providing instance-wise explanations, they struggle to efficiently and accurately make attributions over long periods of time and with complex feature interactions. We propose WinIT, a framework for evaluating feature importance in time series prediction settings by quantifying the shift in predictive distribution over multi-instance predictions in a windowed setting. Comprehensive empirical evidence shows our method improves on the previous state-of-the-art, FIT, by capturing temporal dependencies in feature importance. We also demonstrate how the solution improves the appropriate attribution of features within time steps, which existing interpretability methods often fail to do. We compare with baselines on simulated and real-world clinical data. WinIT achieves 2.04x better performance than FIT and other feature importance methods on real-world data.

Clayton Rooke
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A6 in Virtual World ]

Simulation-based inference (SBI) has emerged as a family of methods for performing inference on complex simulation models with intractable likelihood functions. A common bottleneck in SBI is the construction of low-dimensional summary statistics of the data. In this respect, time-series data, often being high-dimensional, multivariate, and complex in structure, present a particular challenge. To address this we introduce deep signature statistics, a principled and automated method for combining summary statistic selection for time-series data with neural SBI methods. Our approach leverages deep signature transforms, trained concurrently with a neural density estimator, to produce informative statistics for multivariate sequential data that encode important geometric properties of the underlying path. We obtain competitive results across benchmark models.

Joel Dyer
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A5 in Virtual World ]

With the increasing adoption of renewable energy generation and different types of electric devices, electric load forecasting, especially short-term load forecasting (STLF), is attracting more and more attention. Accurate short-term load forecasting is of significant importance for the safety and efficiency of power grids. Deep learning based models have shown impressive success on several applications including short-term load forecasting. However, for several real-world scenarios, it may be very difficult or even impossible to collect enough training data to learn a reliable machine learning model. In this paper, we propose an instance-based transfer learning algorithm to improve learning performance for short-term load forecasting. The proposed algorithm is evaluated on several real-world data sets and has shown significant improvements over the baselines.

Tracy Cui
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A4 in Virtual World ]

Consider a heterogeneous population of points evolving with time. While the population evolves, both in size and nature, we can observe it periodically, through snapshots taken at different timestamps. Each of these snapshots is formed by sampling points from the population at that time, and then creating features to recover point clouds. While these snapshots describe the population's evolution in aggregate, they do not directly provide insights on individual trajectories. This scenario is encountered in several applications, notably single-cell genomics experiments, tracking of particles or crowd monitoring. In this paper, we propose to model that dynamic as resulting from the celebrated Jordan-Kinderlehrer-Otto (JKO) proximal scheme. The JKO scheme posits that the configuration taken by a population at time t is one that trades off a decrease w.r.t. an energy (the model we seek to learn) against a penalty given by an optimal transport distance w.r.t. the previous configuration. To that end, we propose JKOnet, a neural architecture that combines an energy model on measures with (small) optimal displacements solved with input convex neural networks (ICNN). We demonstrate the applicability of our model to explain and predict population dynamics.

Charlotte Bunne
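
For readers unfamiliar with the JKO scheme mentioned in the abstract above, one step is usually written as the following proximal update. The notation is an assumption introduced here for illustration: $\rho_t$ is the population measure at step $t$, $E$ the energy functional, $\tau > 0$ a step size, and $W_2$ the 2-Wasserstein distance. The abstract itself does not fix any symbols.

```latex
\rho_{t+1} \in \operatorname*{arg\,min}_{\rho \in \mathcal{P}_2(\mathbb{R}^d)}
  \; E(\rho) + \frac{1}{2\tau}\, W_2^2(\rho, \rho_t)
```

The first term rewards moving to a lower-energy configuration. The second penalizes moving far, in the optimal transport sense, from the previous configuration.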
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A3 in Virtual World ]

We propose the Piecewise Locally Stationary Oscillation (PLSO) state-space model for decomposing nonstationary time series with slowly time-varying spectra into several oscillatory, piecewise-stationary processes. PLSO combines piecewise stationarity in classical signal processing with stationary Gaussian process kernels, effectively addressing the drawbacks of these ideas, such as inefficient inference and discontinuous/distorted estimates across stationary interval boundaries.

Andrew Song
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A2 in Virtual World ]

We describe variational inference approaches in Gaussian process state space models in terms of local linearisations of the approximate posterior function. Most previous approaches have either assumed independence between the posterior dynamics and latent states (the mean-field (MF) approximation), or optimised free parameters for both, leading to limited scalability. We use our framework to prove that (i) there is a theoretical imperative to use non-MF approaches, to avoid excessive bias in the process noise hyperparameter estimate, and (ii) we can parameterise only the posterior dynamics without any loss of performance. Our approach suggests further approximations, based on the existing rich literature on filtering and smoothing for nonlinear systems, and unifies approaches for discrete and continuous time models.

Talay Cheema
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A1 in Virtual World ]

Understanding physical phenomena oftentimes means understanding the underlying dynamical system that governs observational measurements. While accurate prediction can be achieved with black box systems, they often lack interpretability and are less amenable to further expert investigation. Alternatively, the dynamics can be analysed via symbolic regression. In this paper, we extend the approach of Udrescu et al. (2020), called AI Feynman, to the dynamic setting to perform symbolic regression on ODE systems based on observations from the resulting trajectories. We compare this extension empirically to state-of-the-art approaches for symbolic regression on several dynamical systems of increasing complexity for which the ground truth equations are available. Although the proposed approach performs best on this benchmark, we observed difficulties of all the compared symbolic regression approaches on more complex systems, such as Cart-Pole.

Juliane Weilbach
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot A0 in Virtual World ]

Many applications such as recommender systems (RecSys) are built upon streams of events, each associated with a type in a large-cardinality set and a timestamp in the continuous domain. To date, most applied work is focused on the prediction of the type of the next event, i.e., which exact item a user may visit when they arrive at the RecSys. Instead, we aim to predict when and how often an event of a certain type will be visited by the given user, without the implicit assumption that they will arrive and consume exactly one item at a time. This perspective leads to unique applications in user recommendation (UserRec), where the RecSys is tasked to preemptively match users on behalf of the item producers for marketing purposes. We propose Recurrent Intensity Models (RIMs) that incorporate user visitation intensities in the RecSys, based on recent progress in temporal processes. To our knowledge, our work is the first to approach UserRec completely based on hidden temporal representations without heuristics from explicit feature engineering.

Yifei Ma
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D5 in Virtual World ]

The spread of COVID-19 has coincided with the rise of Graph Neural Networks (GNNs), leading to several studies proposing their use to better forecast the evolution of the pandemic. Many such models also include Long Short Term Memory (LSTM) networks, a common tool for time series forecasting. In this work, we further investigate the integration of these two methods by implementing GNNs within the gates of an LSTM and exploiting spatial information. In addition, we introduce a skip connection which proves critical to jointly capture the spatial and temporal patterns in the data. We validate our daily COVID-19 new cases forecast model on data of 37 European nations for the last 472 days and show superior performance compared to state-of-the-art graph time series models based on mean absolute scaled error (MASE). This area of research has important applications to policy-making and we analyze its potential for pandemic resource control.

Nate J Sesti
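
The abstract above evaluates forecasts with the mean absolute scaled error (MASE). As a reminder of how that metric is usually computed, here is a minimal pure-Python sketch using the standard definition: forecast MAE divided by the in-sample MAE of a (seasonal) naive forecast. The function name `mase`, its arguments, and the default seasonal period `m=1` are illustrative assumptions, not taken from the paper.

```python
def mase(y_train, y_true, y_pred, m=1):
    """Mean absolute scaled error with seasonal period m.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    if m < 1 or len(y_train) <= m:
        raise ValueError("need more than m training points and m >= 1")
    # In-sample MAE of the naive forecast y[t] = y[t - m].
    naive_errors = [abs(y_train[t] - y_train[t - m]) for t in range(m, len(y_train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    # Out-of-sample MAE of the forecast being evaluated.
    forecast_mae = sum(abs(a - f) for a, f in zip(y_true, y_pred)) / len(y_true)
    return forecast_mae / scale
```

A value below 1 means the forecast beats the in-sample naive baseline on average. Because the error is scaled per series, scores can be compared across series of different magnitudes, such as case counts from countries of very different sizes.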
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D1 in Virtual World ]

In electronic trading markets, often only the price or volume time series that result from the interaction of multiple market participants are directly observable. In order to test trading strategies before deploying them to real-time trading, multi-agent market environments are often used, calibrated so that the time series that result from the interaction of simulated agents resemble historical data. To ensure adequate testing, one must test trading strategies in a variety of market scenarios, including both scenarios that represent ordinary market days and stressed markets (most recently observed at the beginning of the COVID pandemic). In this paper, we address the problem of multi-agent simulator parameter calibration to allow the simulator to capture characteristics of different market regimes. We propose a novel two-step method to train a discriminator that is able to distinguish between “real” and “fake” price and volume time series as a part of a GAN with self-attention, and then utilize it within an optimization framework to tune parameters of a simulator model with known agent archetypes to represent a market scenario. We conclude with experimental results that demonstrate the effectiveness of our method.

Victor Storchan, Svitlana Vyetrenko
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D2 in Virtual World ]

We study a recent class of models which uses graph neural networks (GNNs) to improve forecasting in multivariate time series. The core assumption behind these models is that there is a latent graph between the time series (nodes) that governs the evolution of the multivariate time series. By parameterizing a graph in a differentiable way, the models aim to improve forecasting quality. We compare four recent models of this class on the forecasting task. Further, we perform ablations to study their behavior under changing conditions, e.g., when disabling the graph-learning modules and providing the ground-truth relations instead. Based on our findings, we propose novel ways of combining the existing architectures.

Daniel Zügner, Francois-Xavier Aubet
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D3 in Virtual World ]

Time series data are often corrupted by outliers or other kinds of anomalies. Identifying the anomalous points can be a goal on its own (anomaly detection), or a means to improving performance of other time series tasks (e.g. forecasting). Recent deep-learning-based approaches to anomaly detection and forecasting commonly assume that the proportion of anomalies in the training data is small enough to ignore, and treat the unlabeled data as coming from the nominal data distribution. We present a simple yet effective technique for augmenting existing time series models so that they explicitly account for anomalies in the training data. By augmenting the training data with a latent anomaly indicator variable whose distribution is inferred while training the underlying model using Monte Carlo EM, our method simultaneously infers anomalous points while improving model performance on nominal data. We demonstrate the effectiveness of the approach by combining it with a simple feed-forward forecasting model. We investigate how anomalies in the train set affect the training of forecasting models, which are commonly used for time series anomaly detection, and show that our method improves the training of the model.

Francois-Xavier Aubet
Sat 11:45 a.m. - 12:45 p.m.
[ Visit Poster at Spot D4 in Virtual World ]

We propose ST-DETR, a Spatio-Temporal Transformer-based architecture for object detection from a sequence of temporal frames. We treat the temporal frames as sequences in both space and time and employ the full attention mechanisms to take advantage of the features correlations over both dimensions. This treatment enables us to deal with frames sequence as temporal object features traces over every location in the space. We explore two possible approaches; the early spatial features aggregation over the temporal dimension, and the late temporal aggregation of object query spatial features. Moreover, we propose a novel Temporal Positional Embedding technique to encode the time sequence information. To evaluate our approach, we choose the Moving Object Detection (MOD) task, since it is a perfect candidate to showcase the importance of the temporal dimension. Results show a significant 5% mAP improvement on the KITTI MOD dataset over the 1-step spatial baseline.

Eslam Mohamed Abd El Rahman
Sat 2:30 p.m. - 2:45 p.m.
Contributed Talk: PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series (Contributed Talk)   
Paul Jeha
Sat 2:45 p.m. - 3:25 p.m.
David Duvenaud (Invited Talk)   
David Duvenaud
Sat 3:25 p.m. - 3:30 p.m.
David Duvenaud: Live Q&A (Live Q&A)
Sat 3:30 p.m. - 3:45 p.m.
Afternoon Coffee Break (Break)
Sat 3:45 p.m. - 4:00 p.m.
Contributed Talk: Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data (Contributed Talk)   
Shixiang Zhu
Sat 4:00 p.m. - 4:45 p.m.
  

Inspired by the demands of real-time subseasonal climate forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms -- DORM, DORM+, and AdaHedgeD -- arise from a novel reduction of delayed online learning to optimistic online learning that reveals how optimistic hints can mitigate the regret penalty caused by delay. We pair this delay-as-optimism perspective with a new analysis of optimistic learning that exposes its robustness to hinting errors and a new meta-algorithm for learning effective hinting strategies in the presence of delay. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models.

Lester Mackey
Sat 4:45 p.m. - 5:00 p.m.
Contributed Talk: Ecological Inference using Constrained Kalman filters for the COVID-19 Pandemic (Contributed Talk)   
Brian Lim
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A6 in Virtual World ]

Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and offers support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and prevent large-scale outbreaks. This paper presents a spatio-temporal Bayesian framework for early detection of COVID-19 hotspots (at the county level) in the United States. We assume both the observed number of cases and hotspots depend on a class of latent random variables, which encode the underlying spatio-temporal dynamics of the transmission of COVID-19. Such latent variables follow a zero-mean Gaussian process, whose covariance is specified by a non-stationary kernel function. The most salient feature of our kernel function is that deep neural networks are introduced to enhance the model's representative power while enjoying great interpretability. Our model demonstrates superior hotspot-detection performance compared to other baseline methods.

Shixiang Zhu
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A5 in Virtual World ]

At Walmart, our core mission is to help people save money so that they can live better. We accomplish this by applying downward pressure on our prices in order to increase traffic and sales in our stores. Prior work has developed an automated process for optimal price recommendation \cite{Prs}, including a Bayesian Structured Time Series demand forecasting component. In this paper, we seek to extend the previous approach by incorporating robust optimization and an improved demand forecasting scheme with time-series clustering. The improved system is called Robust Price Recommendation System, or PRS+.

Linsey Pang
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A4 in Virtual World ]

Time series clustering is an important task in its own right, and often a subroutine in other higher-level algorithms. However, clustering subsequences of a time series is known to be a particularly hard problem, and it has been shown that naive clustering of subsequences yields meaningless results under common assumptions. In this work, we introduce Time2Cluster, a novel representation and accompanying algorithm that meaningfully clusters time series subsequences. Our key insight is to avoid depending solely on relative distance information between subsequences, and instead to exploit information about the neighborhood subsequences. Our algorithm uses neighborhood information to mitigate the negative effects of small variations, such as phase shift, between the subsequences of time series data.

Shima Imani
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A3 in Virtual World ]

Among the most extreme objects in the Universe, active galactic nuclei (AGN) are luminous centers of galaxies where a black hole feeds on surrounding matter. The variability patterns of the light emitted by an AGN contain information about the physical properties of the underlying black hole. Upcoming telescopes will observe over 100 million AGN in multiple broadband wavelengths, yielding a large sample of multivariate time series with long gaps and irregular sampling. We present a method that reconstructs the AGN time series and simultaneously infers the posterior probability density function (PDF) over the physical quantities of the black hole, including its mass and luminosity. We apply this method to a simulated dataset of 11,000 AGN and report precision and accuracy of 0.4 dex and 0.3 dex, respectively, in the inferred black hole mass. This work is the first to address probabilistic time series reconstruction and parameter inference for AGN in an end-to-end fashion.

Ji Won Park
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A2 in Virtual World ]

Modeling the dynamic relations between recording channels and their long-term dependencies is critical in multivariate time series. Recent approaches leverage graph neural networks to capture the direct first-order relationships between channels. While this is useful for capturing co-occurrence patterns, these approaches do not reveal indirect higher-order relationships governed by latent processes. For example, electricity consumption at consumer ends can follow similar temporal patterns, but the simple correlation hides the fact that the patterns are driven by several unrecorded factors such as working activities over the day, the humidity, and the sunlight intensity, to name a few. To this end, we propose a dual message-passing recurrent neural system that disentangles the observed recording processes from the unobserved governing processes. The messages are passed in both bottom-up and top-down manners: the bottom-up signals are aggregated to capture governing patterns, while the top-down messages augment the dynamics of low-level processes. Each process maintains its own memory of historical data, allowing process-specific long-term patterns to form. The governing process memories are jointly accessible to each other, and they collectively capture the governing dynamics of the entire system. Through extensive experiments on real-world time-series forecasting datasets, we demonstrate the robustness and efficiency of our approach across different scenarios.

Duc Nguyen
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A1 in Virtual World ]

Developments in neural forecasting have shown significant improvements in the accuracy of large-scale systems, yet predicting extremely long horizons remains a challenging task. Two common problems are the volatility of the predictions and the computational complexity; we address them by incorporating smoothness regularization and mixed data sampling techniques into a well-performing multi-layer perceptron based architecture (NBEATS). We validate our proposed method, DMIDAS, on high-frequency healthcare and electricity price data with long forecasting horizons (~1000 timestamps), where we improve the prediction accuracy by 5% over state-of-the-art models while reducing the number of parameters of NBEATS by nearly 70%.

Cristian Challu
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot A0 in Virtual World ]

We present a method for "ecological inference", that is, learning individual-level associations from aggregate data, for time series. This problem has recently been highlighted by the COVID-19 pandemic, where demographic time series data is difficult to obtain while aggregate time series data is easily obtainable. It is not unreasonable to expect at least a small amount of reported data for the component time series, so our approach uses that assumption to construct a transition matrix element-wise; this matrix is then used in a constrained Kalman filter applied to other aggregate time series. We consider COVID-19 case numbers for states and for regions within those states to help us estimate time series data in other states, and our results show a significant degree of agreement with the actual case numbers in those regions.

Brian Lim
Sat 5:00 p.m. - 6:00 p.m.
[ Visit Poster at Spot B0 in Virtual World ]

Irregularly, asynchronously and sparsely sampled multivariate time series (IASS-MTS) data occur naturally in practical domains. They are characterized by sparse, non-uniform time intervals between successive observations and different sampling rates among series. These properties pose substantial challenges to contemporary machine learning models for learning complicated intra-series and inter-series relations within and across IASS-MTS. To address these challenges, we present a time-aware Dual-Attention and Memory-Augmented Networks architecture (DAMA-Net). The proposed model aims to leverage the time irregularity, multiple sampling rates, and global temporal pattern information inherent in time series so as to learn more effective representations and improve prediction performance. We evaluate our model on two real-world datasets for IASS-MTS classification tasks. The results show that our model outperforms state-of-the-art methods in terms of classification performance. Moreover, we conduct an ablation study to demonstrate the contribution made by different mechanisms and modules in our model.

Zhen Wang
Sat 6:00 p.m. - 6:05 p.m.
Awards and Closing Remarks (Closing)