Time series is one of the fastest growing and richest types of data. In a variety of domains including dynamical systems, healthcare, climate science and economics, there have been increasing amounts of complex dynamic data due to a shift away from parsimonious, infrequent measurements to nearly continuous real-time monitoring and recording. This burgeoning amount of new data calls for novel theoretical and algorithmic tools and insights.
The goals of our workshop are to: (1) highlight the fundamental challenges that underpin learning from time series data (e.g. covariate shift, causal inference, uncertainty quantification), (2) discuss recent developments in theory and algorithms for tackling these problems, and (3) explore new frontiers in time series analysis and their connections with emerging fields such as causal discovery and machine learning for science. In light of the recent COVID-19 outbreak, we also plan to have a special emphasis on non-stationary dynamics, causal inference, and their applications to public health at our workshop.
Time series modeling has a long tradition of inviting novel approaches from many disciplines including statistics, dynamical systems, and the physical sciences. This has led to broad impact and a diverse range of applications, making it an ideal topic for the rapid dissemination of new ideas at ICML. We hope that the diversity and expertise of our speakers and attendees will help uncover new approaches and break new ground in these challenging and important settings. Our previous workshops at ICML have been highly popular, and we envision this workshop will continue to appeal to the ICML audience and stimulate many interdisciplinary discussions.
Gather.Town:
Morning Poster: [ protected link dropped ]
Afternoon Poster: [ protected link dropped ]
Sat 8:45 a.m. - 9:00 a.m.
|
Opening Remarks
(
Introduction
)
SlidesLive Video » |
🔗 |
Sat 9:00 a.m. - 9:45 a.m.
|
Mihaela van der Schaar: Time-series in healthcare: challenges and solutions
(
Invited Talk
)
SlidesLive Video » |
Mihaela van der Schaar 🔗 |
Sat 9:45 a.m. - 10:30 a.m.
|
Mike West: Multiscale Bayesian Modelling: Ideas and Examples from Consumer Sales
(
Invited Talk
)
SlidesLive Video » Bayesian multiscale models exploit variants of the “decouple/recouple” concept to enable advances in forecasting and monitoring of increasingly large-scale time series. Recent and current applications include financial and commercial forecasting, as well as dynamic network studies. I overview some recent developments via examples from applications in large-scale consumer demand and sales forecasting with intersecting marketing-related goals. Two coupled applied settings involve (a) models for forecasting daily sales of each of many items in every supermarket of a large national chain, and (b) models for understanding and forecasting customer/household-specific purchasing behavior to inform decisions about personalized pricing and promotions on a continuing basis. The multiscale concept is applied in each setting to define new classes of hierarchical Bayesian state-space models customized to the application. In each area, micro-level, individual time series are represented via customized model forms that also involve aggregate-level factors, the latter being modelled and forecast separately. The implied conditional decoupling of many time series enables computational scalability, while the effects of shared multiscale factors define recoupling to appropriately reflect cross-series dependencies. The ideas are of course relevant to other applied settings involving large-scale, hierarchically structured time series. |
🔗 |
Sat 10:30 a.m. - 10:45 a.m.
|
Morning Coffee Break
|
🔗 |
Sat 10:45 a.m. - 11:00 a.m.
|
Contributed Talk: JKOnet: Proximal Optimal Transport Modeling of Population Dynamics
(
Contributed Talk
)
SlidesLive Video » |
Charlotte Bunne 🔗 |
Sat 11:00 a.m. - 11:40 a.m.
|
Dominik Janzing: Quantifying causal influence in time series and beyond
(
Invited Talk
)
SlidesLive Video » Quantification of causal influence is a non-trivial conceptual problem. Well-known concepts like Granger causality and transfer entropy are arguably correct for detecting the presence of causal influence (subject to assumptions like causal sufficiency and positive probability density), but following [2] I argue that taking them as a measure of the strength of causal influence is conceptually flawed. To discuss this, I consider the more general question of quantifying the strength of an edge (or a set of edges) in a causal DAG. I describe a few postulates that we [1] would expect from a measure of causal influence and describe the information-theoretic causal strength that we proposed in [1]. References: [1] D. Janzing, D. Balduzzi, M. Grosse-Wentrup, B. Schölkopf: Quantifying causal influences. Annals of Statistics, 2013. [2] N. Ay and D. Polani: Information flow in causal networks, 2008. |
Dominik Janzing 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Changepoint Detection using Self Supervised Variational AutoEncoders
(
Poster
)
Changepoint detection methods aim to find locations where a time series shows abrupt changes in properties, such as level and trend, that persist over time. Traditional parametric approaches assume specific generative models for each segment of the time series, but the complexities of real-world time series data are often hard to capture with such models. To address these issues, in this paper we propose VAE-CP, which uses a variational autoencoder with self-supervised loss functions to learn informative latent representations of time series segments. We then apply traditional hypothesis-test-based and Bayesian changepoint methods in this latent space of normally distributed latent variables, combining the strength of self-supervised representation learning with parametric changepoint modeling. The proposed approach outperforms traditional and previous deep-learning-based changepoint detection methods on synthetic and real datasets containing trend changes. |
Sourav Chatterjee 🔗 |
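For readers new to the changepoint setting above, here is a minimal, generic illustration of hypothesis-test-style changepoint detection (a classical mean-shift scan statistic, not the VAE-CP method itself; the function and variable names below are our own):

```python
import numpy as np

def cusum_changepoint(x):
    """Return the index that best splits x into two segments with
    different means, using a classical CUSUM-style scan statistic."""
    n = len(x)
    sd = np.std(x)
    best_k, best_stat = None, -np.inf
    for k in range(2, n - 1):
        # standardized difference of segment means, penalizing edge splits
        stat = abs(x[:k].mean() - x[k:].mean()) / (sd * np.sqrt(1.0 / k + 1.0 / (n - k)))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k

rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
print(cusum_changepoint(series))  # close to the true changepoint at index 100
```

VAE-CP, per the abstract, applies this kind of parametric test not to the raw series but to learned latent representations of its segments.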
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: First Hitting Time Guarantees for Nonlinear Time Series Models
(
Poster
)
We derive tight probabilistic bounds on the first hitting time of general classes of contractive nonlinear time series models that can be linked to mean reverting processes. As an application to finance, we translate our results to a pairs trading strategy with probabilistic guarantees on its returns. |
Julien Huang 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Revisiting Dynamic Regret of Strongly Adaptive Methods
(
Poster
)
link »
We consider the framework of non-stationary Online Convex Optimization, where a learner seeks to control its dynamic regret against an arbitrary sequence of comparators. When the loss functions are strongly convex or exp-concave, we demonstrate that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret in terms of the path variation $V_T$ of the comparator sequence. Specifically, we show that SA algorithms enjoy $\tilde O(\sqrt{TV_T} \vee \log T)$ and $\tilde O(\sqrt{dTV_T} \vee d\log T)$ dynamic regret for strongly convex and exp-concave losses respectively, without a priori knowledge of $V_T$, thus answering an open question of Zhang et al. (2018). The versatility of this principled approach is further demonstrated by novel results in the settings of learning against bounded linear predictors and online regression with Gaussian kernels.
|
Dheeraj Baby 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Flexible Temporal Point Processes Modeling with Nonlinear Hawkes Processes with Gaussian Processes Excitations and Inhibitions
(
Poster
)
link »
We propose an extended Hawkes process model where the self-effects are of both excitatory and inhibitory type and follow a Gaussian process. Whereas previous work either relies on a less flexible parameterization of the model or requires a large amount of data, our formulation allows for both a flexible model and learning when data are scarce. Efficient approximate Bayesian inference is achieved via data augmentation, and we describe a mean-field variational inference approach to learn the model parameters. To demonstrate the flexibility of the model, we apply our methodology to data from two different domains and compare it to previously reported results. |
Noa Malem-Shinitski 🔗 |
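As background for the abstract above: in a nonlinear Hawkes process, the conditional intensity is commonly written (in generic notation, not necessarily the authors') as

```latex
\lambda(t) = \sigma\!\left( \mu + \sum_{t_i < t} f(t - t_i) \right),
```

where $\mu$ is a baseline rate, $f$ is the self-effect function (here given a Gaussian process prior), and $\sigma$ is a nonnegative link function; allowing $f$ to take negative values is what admits inhibition alongside excitation.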
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Probabilistic Time Series Forecasting with Implicit Quantile Networks
(
Poster
)
Here, we propose a general method for probabilistic time series forecasting. We combine an autoregressive recurrent neural network to model temporal dynamics with Implicit Quantile Networks to learn a large class of distributions over a time series target. Compared to other probabilistic neural forecasting models on real and simulated data, our approach is favorable in terms of point-wise prediction accuracy as well as estimation of the underlying temporal distribution. |
Adele Gouttes 🔗 |
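As background for implicit quantile approaches like the one above, the quantile (pinball) loss they train against can be illustrated directly; this is the generic loss, not the authors' model, and the names below are ours:

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss at level tau in (0, 1): under-predictions
    are weighted by tau, over-predictions by 1 - tau."""
    err = y_true - y_pred
    return float(np.mean(np.maximum(tau * err, (tau - 1.0) * err)))

# Minimizing the pinball loss over constant predictors recovers the
# empirical tau-quantile of the sample.
samples = np.arange(1.0, 11.0)            # 1, 2, ..., 10
grid = np.linspace(0.0, 11.0, 1101)
best = grid[np.argmin([pinball_loss(samples, g, 0.9) for g in grid])]
print(best)  # a minimizer of the 0.9-pinball loss lies in [9, 10]
```

An implicit quantile network amortizes this idea: it takes a random quantile level tau as input and is trained with the corresponding pinball loss, so one network represents the whole predictive distribution.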
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Online Learning with Optimism and Delay
(
Poster
)
link »
Inspired by the demands of real-time time-series forecasting, we develop and analyze optimistic online learning algorithms under delayed feedback. We present a novel "delay as optimism" analysis that reduces online learning under delay to optimistic online learning. This reduction enables optimal regret bounds for delayed online learning and exposes how side-information or optimistic "hints" can be used to combat the effects of delay. We use these theoretical tools to develop the first optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delay. These algorithms --- DORM, DORM+, and AdaHedgeD --- are robust and practical choices for real-world time-series forecasting. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models. |
Genevieve Flaspohler 🔗 |
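A toy sketch of the delayed-feedback setting studied above, with the optimistic "hint" taken to be the most recently revealed gradient. This is an illustrative simplification of the paper's DORM/DORM+/AdaHedgeD algorithms, not their implementation, and all names are ours:

```python
def delayed_ogd(thetas, delay, lr, use_hint):
    """Online gradient descent on losses f_t(x) = 0.5 * (x - theta_t)^2
    when the gradient of round t is only revealed at round t + delay.
    Optionally plays an optimistic step using the most recently
    revealed gradient as a hint for the still-missing feedback."""
    x = 0.0
    revealed = []           # gradients received so far
    played = []             # (point played, target) per round
    total_loss = 0.0
    for t, theta in enumerate(thetas):
        hint = revealed[-1] if (use_hint and revealed) else 0.0
        x_play = x - lr * hint          # optimistic prediction
        total_loss += 0.5 * (x_play - theta) ** 2
        played.append((x_play, theta))
        if t >= delay:                  # delayed feedback arrives now
            xp, thp = played[t - delay]
            g = xp - thp
            revealed.append(g)
            x -= lr * g                 # base (non-optimistic) update
    return total_loss

# Toy comparison: with a constant target, replaying the last seen
# gradient as a hint reduces cumulative loss under delay.
with_hint = delayed_ogd([1.0] * 200, delay=5, lr=0.05, use_hint=True)
without_hint = delayed_ogd([1.0] * 200, delay=5, lr=0.05, use_hint=False)
print(with_hint < without_hint)
```

The "delay as optimism" reduction in the paper makes this intuition precise: delayed online learning can be analyzed as optimistic online learning with a particular (possibly poor) hint.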
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Evolving-Graph Gaussian Processes
(
Poster
)
Graph Gaussian Processes (GGPs) provide a data-efficient solution on graph-structured domains. Existing approaches have focused on static structures, whereas much real graph data represents a dynamic structure, limiting the applications of GGPs. To overcome this, we propose evolving-Graph Gaussian Processes (e-GGPs). The proposed method is capable of learning the transition function of graph vertices over time with a neighbourhood kernel to model the connectivity and interaction changes between vertices. We assess the performance of our method on time-series regression problems where graphs evolve over time. We demonstrate the benefits of e-GGPs over static graph Gaussian Process approaches. |
David Blanco-Mulero 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: VIKING: Variational Bayesian Variance Tracking Winning a Post-Covid Day-Ahead Electricity Load Forecasting Competition
(
Poster
)
We present a novel variational Bayesian approach for time series forecasting based on a state-space representation, named VIKING (Variational BayesIan Variance TracKING). The method is illustrated with the procedure used to win a recent competition on post-COVID electricity load forecasting. |
Joseph de Vilmarest 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Towards Robust, Scalable and Interpretable Time Series Forecasting using Bayesian Vector Auto-Regression
(
Poster
)
We present a flexible, scalable, and interpretable framework for automated forecasting of multivariate time-series, building off of the Bayesian Vector Autoregression (BVAR) literature in macroeconometrics. Our algorithm allows for full posterior estimates of hundreds of interaction parameters, with minimal hand-tuning or hyperparameter specification required. The model can be easily extended to account for non-stationary breaks such as the COVID-19 pandemic. In experiments our model outperforms comparably-flexible time-series models at forecasting inflation. |
Rishab Guha 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Prediction-Constrained Hidden Markov Models for Semi-Supervised Classification
(
Poster
)
We develop a new framework for training hidden Markov models that balances generative and discriminative goals. Our approach requires likelihood-based or Bayesian learning to meet task-specific prediction quality constraints, preventing model misspecification from leading to poor subsequent predictions. When users specify an appropriate loss function to constrain predictions, our approach can enhance semi-supervised learning when labeled sequences are rare and boost accuracy when data has unbalanced label frequencies. Via automatic differentiation we backpropagate gradients through dynamic programming computation of the marginal likelihood, making training feasible without auxiliary bounds or approximations. Our approach is effective for human activity modeling and healthcare intervention forecasting, delivering accuracies competitive with well-tuned neural networks for fully labeled data, and substantially better for partially labeled data. Simultaneously, our learned generative model illuminates the dynamical states driving predictions. |
Gabriel Hope 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Continuous Latent Process Flows
(
Poster
)
Partial observations of continuous time-series dynamics at arbitrary time stamps exist in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only promising at an intuitive level but also has practical benefits, including the ability to generate continuous trajectories and to perform inference on previously unseen time stamps. Despite exciting progress in this area, the existing models still face challenges in terms of their representational power and the quality of their variational approximations. We tackle these challenges with continuous latent process flows (CLPF), a principled architecture decoding continuous latent processes into continuous observable processes using a time-dependent normalizing flow driven by a stochastic differential equation. To optimize our model using maximum likelihood, we propose a novel piecewise construction of a variational posterior process and derive the corresponding variational lower bound using trajectory re-weighting. Our model shows favourable performance on synthetic data simulated from stochastic processes. |
Ruizhi Deng 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series
(
Poster
)
Realistic synthetic time series data of sufficient length enables practical applications in time series modeling tasks, such as forecasting, but producing it remains a challenge. In this paper we present PSA-GAN, a generative adversarial network (GAN) that generates long time series samples of high quality using progressive growing of GANs and self-attention. We show that PSA-GAN can be used to reduce the error in two downstream forecasting tasks over baselines that only use real data. We also introduce a Frechet Inception Distance-like score, Context-FID, for assessing the quality of synthetic time series samples. In our downstream tasks, we find that this score is able to predict the best-performing models and could therefore be a useful tool to develop time series GAN models for downstream use. |
Paul Jeha · Pedro Mercado 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Modeling the El Niño Southern Oscillation with Neural Differential Equations
(
Poster
)
We use a Neural Ordinary Differential Equation to model and predict the seasonal-to-interannual variability of the El Niño Southern Oscillation (ENSO). We train our neural network model using partial observations involving only sea surface temperature data. Our approach is computationally inexpensive, reproduces the main seasonal features of ENSO, and exhibits robust prediction skill. |
Ludovico Giorgini 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Forward Prediction for Physical Reasoning
(
Poster
)
Physical reasoning requires forward prediction: the ability to forecast what will happen next given some initial world state. We study the performance of state-of-the-art forward-prediction models in the complex physical-reasoning tasks of the PHYRE benchmark (Bakhtin et al., 2019). We do so by incorporating models that operate on object or pixel-based representations of the world into simple physical-reasoning agents. We find that forward-prediction models can improve physical-reasoning performance, particularly on complex tasks that involve many objects. However, we also find that these improvements are contingent on the test tasks being small variations of train tasks, and that generalization to completely new task templates is challenging. Surprisingly, we observe that forward predictors with better pixel accuracy do not necessarily lead to better physical-reasoning performance. Nevertheless, our best models set a new state-of-the-art on the PHYRE benchmark. Our code and models will be released online. |
Rohit Girdhar 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Temporal Dependencies in Feature Importance for Time Series Predictions
(
Poster
)
Explanation methods applied to sequential models for multivariate time series prediction are receiving more attention in machine learning literature. While current methods perform well at providing instance-wise explanations, they struggle to efficiently and accurately make attributions over long periods of time and with complex feature interactions. We propose WinIT, a framework for evaluating feature importance in time series prediction settings by quantifying the shift in predictive distribution over multi-instance predictions in a windowed setting. Comprehensive empirical evidence shows our method improves on the previous state-of-the-art, FIT, by capturing temporal dependencies in feature importance. We also demonstrate how the solution improves the appropriate attribution of features within time steps, which existing interpretability methods often fail to do. We compare with baselines on simulated and real-world clinical data. WinIT achieves 2.04x better performance than FIT and other feature importance methods on real-world data. |
Clayton Rooke 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Deep Signature Statistics for Likelihood-free Time-series Models
(
Poster
)
Simulation-based inference (SBI) has emerged as a family of methods for performing inference on complex simulation models with intractable likelihood functions. A common bottleneck in SBI is the construction of low-dimensional summary statistics of the data. In this respect, time-series data, often being high-dimensional, multivariate, and complex in structure, present a particular challenge. To address this, we introduce deep signature statistics, a principled and automated method for combining summary statistic selection for time-series data with neural SBI methods. Our approach leverages deep signature transforms, trained concurrently with a neural density estimator, to produce informative statistics for multivariate sequential data that encode important geometric properties of the underlying path. We obtain competitive results across benchmark models. |
Joel Dyer 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Electric Load Forecasting with Boosting based Sample Transfer
(
Poster
)
With the increasing adoption of renewable energy generation and different types of electric devices, electric load forecasting, especially short-term load forecasting (STLF), is attracting more and more attention. Accurate short-term load forecasting is of significant importance for the safety and efficiency of power grids. Deep-learning-based models have shown impressive success on several applications, including short-term load forecasting. However, in several real-world scenarios it may be very difficult or even impossible to collect enough training data to learn a reliable machine learning model. In this paper, we propose an instance-transfer-based transfer learning algorithm to improve learning performance for short-term load forecasting. The proposed algorithm is evaluated on several real-world data sets and shows significant improvements over the baselines. |
Tracy Cui 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: JKOnet: Proximal Optimal Transport Modeling of Population Dynamics
(
Poster
)
Consider a heterogeneous population of points evolving with time. While the population evolves, both in size and nature, we can observe it periodically, through snapshots taken at different timestamps. Each of these snapshots is formed by sampling points from the population at that time and then creating features to recover point clouds. While these snapshots describe the population's evolution on aggregate, they do not directly provide insights into individual trajectories. This scenario is encountered in several applications, notably single-cell genomics experiments, tracking of particles, or crowd monitoring. In this paper, we propose to model these dynamics as resulting from the celebrated Jordan-Kinderlehrer-Otto (JKO) proximal scheme. The JKO scheme posits that the configuration taken by a population at time t is one that trades off a decrease with respect to an energy (the model we seek to learn) against a penalty given by the optimal transport distance to the previous configuration. To that end, we propose JKOnet, a neural architecture that combines an energy model on measures with (small) optimal displacements solved with input convex neural networks (ICNNs). We demonstrate the applicability of our model to explain and predict population dynamics. |
Charlotte Bunne 🔗 |
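For context, the JKO proximal scheme referenced above defines the next population configuration as (in standard notation, with $\tau$ a step size)

```latex
\rho_{k+1} \in \operatorname*{arg\,min}_{\rho} \; E(\rho) + \frac{1}{2\tau} W_2^2(\rho, \rho_k),
```

where $E$ is the energy functional (the object JKOnet parameterizes with a neural network) and $W_2$ is the 2-Wasserstein optimal transport distance to the previous configuration $\rho_k$.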
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: An efficient Gaussian process framework for analysis of oscillations in nonstationary time series
(
Poster
)
We propose the Piecewise Locally Stationary Oscillation (PLSO) state-space model for decomposing nonstationary time series with slowly time-varying spectra into several oscillatory, piecewise-stationary processes. PLSO combines piecewise stationarity from classical signal processing with stationary Gaussian process kernels, effectively addressing the drawbacks of these ideas, such as inefficient inference and discontinuous/distorted estimates across stationary interval boundaries. |
Andrew Song 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Understanding Local Linearisation in Variational Gaussian Process State Space Models
(
Poster
)
We describe variational inference approaches in Gaussian process state space models in terms of local linearisations of the approximate posterior function. Most previous approaches have either assumed independence between the posterior dynamics and latent states (the mean-field (MF) approximation), or optimised free parameters for both, leading to limited scalability. We use our framework to prove that (i) there is a theoretical imperative to use non-MF approaches, to avoid excessive bias in the process noise hyperparameter estimate, and (ii) we can parameterise only the posterior dynamics without any loss of performance. Our approach suggests further approximations, based on the existing rich literature on filtering and smoothing for nonlinear systems, and unifies approaches for discrete and continuous time models. |
Talay Cheema 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Inferring the Structure of Ordinary Differential Equations
(
Poster
)
Understanding physical phenomena oftentimes means understanding the underlying dynamical system that governs observational measurements. While accurate prediction can be achieved with black box systems, they often lack interpretability and are less amenable for further expert investigation. Alternatively, the dynamics can be analysed via symbolic regression. In this paper, we extend the approach by (Udrescu et al., 2020) called AI Feynman to the dynamic setting to perform symbolic regression on ODE systems based on observations from the resulting trajectories. We compare this extension to state-of-the-art approaches for symbolic regression empirically on several dynamical systems for which the ground truth equations of increasing complexity are available. Although the proposed approach performs best on this benchmark, we observed difficulties of all the compared symbolic regression approaches on more complex systems, such as Cart-Pole. |
Juliane Weilbach 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Recurrent Intensity Modeling for User Recommendation and Online Matching
(
Poster
)
Many applications such as recommender systems (RecSys) are built upon streams of events, each associated with a type in a large-cardinality set and a timestamp in the continuous domain. To date, most applied work is focused on the prediction of the type of the next event, i.e., which exact item a user may visit when they arrive at the RecSys. Instead, we aim to predict when and how often an event of a certain type will be visited by the given user, without the implicit assumption that they will arrive and consume exactly one item at a time. This perspective leads to unique applications in user recommendation (UserRec), where the RecSys is tasked to preemptively match users on behalf of the item producers for marketing purposes. We propose Recurrent Intensity Models (RIMs) that incorporate user visitation intensities in the RecSys, based on recent progress in temporal processes. To our knowledge, our work is the first to approach UserRec completely based on hidden temporal representations without heuristics from explicit feature engineering. |
Yifei Ma 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Integrating LSTMs and GNNs for COVID-19 Forecasting
(
Poster
)
The spread of COVID-19 has coincided with the rise of Graph Neural Networks (GNNs), leading to several studies proposing their use to better forecast the evolution of the pandemic. Many such models also include Long Short Term Memory (LSTM) networks, a common tool for time series forecasting. In this work, we further investigate the integration of these two methods by implementing GNNs within the gates of an LSTM and exploiting spatial information. In addition, we introduce a skip connection which proves critical to jointly capture the spatial and temporal patterns in the data. We validate our daily COVID-19 new cases forecast model on data of 37 European nations for the last 472 days and show superior performance compared to state-of-the-art graph time series models based on mean absolute scaled error (MASE). This area of research has important applications to policy-making and we analyze its potential for pandemic resource control. |
Nathan J Sesti 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Learning who is in the market from time series: market participant discovery through adversarial calibration of multi-agent simulators.
(
Poster
)
In electronic trading markets, often only the price or volume time series that result from the interaction of multiple market participants are directly observable. To test trading strategies before deploying them to real-time trading, one often uses multi-agent market environments calibrated so that the time series resulting from the interaction of simulated agents resemble historical ones. To ensure adequate testing, one must test trading strategies in a variety of market scenarios, including both scenarios that represent ordinary market days and stressed markets (most recently observed at the beginning of the COVID pandemic). In this paper, we address the problem of calibrating multi-agent simulator parameters so that the simulator captures the characteristics of different market regimes. We propose a novel two-step method: we first train a discriminator able to distinguish between "real" and "fake" price and volume time series as part of a GAN with self-attention, and then utilize it within an optimization framework to tune the parameters of a simulator model with known agent archetypes to represent a given market scenario. We conclude with experimental results that demonstrate the effectiveness of our method. |
Victor Storchan · Svitlana Vyetrenko 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: A Study of Joint Graph Inference and Forecasting
(
Poster
)
We study a recent class of models which uses graph neural networks (GNNs) to improve forecasting in multivariate time series. The core assumption behind these models is that there is a latent graph between the time series (nodes) that governs the evolution of the multivariate time series. By parameterizing a graph in a differentiable way, the models aim to improve forecasting quality. We compare four recent models of this class on the forecasting task. Further, we perform ablations to study their behavior under changing conditions, e.g., when disabling the graph-learning modules and providing the ground-truth relations instead. Based on our findings, we propose novel ways of combining the existing architectures. |
Daniel Zügner · Francois-Xavier Aubet 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: Monte Carlo EM for Deep Time Series Anomaly Detection
(
Poster
)
Time series data are often corrupted by outliers or other kinds of anomalies. Identifying the anomalous points can be a goal on its own (anomaly detection), or a means to improving performance of other time series tasks (e.g., forecasting). Recent deep-learning-based approaches to anomaly detection and forecasting commonly assume that the proportion of anomalies in the training data is small enough to ignore, and treat the unlabeled data as coming from the nominal data distribution. We present a simple yet effective technique for augmenting existing time series models so that they explicitly account for anomalies in the training data. By augmenting the training data with a latent anomaly indicator variable whose distribution is inferred while training the underlying model using Monte Carlo EM, our method simultaneously infers anomalous points while improving model performance on nominal data. We demonstrate the effectiveness of the approach by combining it with a simple feed-forward forecasting model. We investigate how anomalies in the train set affect the training of forecasting models, which are commonly used for time series anomaly detection, and show that our method improves the training of the model. |
Francois-Xavier Aubet 🔗 |
Sat 11:45 a.m. - 12:45 p.m.
|
Morning Poster Session: ST-DETR: Spatio-Temporal Object Traces Attention Detection Transformer
(
Poster
)
We propose ST-DETR, a Spatio-Temporal Transformer-based architecture for object detection from a sequence of temporal frames. We treat the temporal frames as sequences in both space and time and employ full attention mechanisms to take advantage of the feature correlations over both dimensions. This treatment enables us to deal with the frame sequence as temporal object feature traces over every location in the space. We explore two possible approaches: the early spatial feature aggregation over the temporal dimension, and the late temporal aggregation of object query spatial features. Moreover, we propose a novel Temporal Positional Embedding technique to encode the time sequence information. To evaluate our approach, we choose the Moving Object Detection (MOD) task, since it is a perfect candidate to showcase the importance of the temporal dimension. Results show a significant 5% mAP improvement on the KITTI MOD dataset over the 1-step spatial baseline. |
Eslam Mohamed Abd El Rahman 🔗 |
Sat 2:30 p.m. - 2:45 p.m.
|
Contributed Talk: PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series
(
Contributed Talk
)
SlidesLive Video » |
Paul Jeha 🔗 |
Sat 2:45 p.m. - 3:25 p.m.
|
David Duvenaud
(
Invited Talk
)
SlidesLive Video » |
David Duvenaud 🔗 |
Sat 3:25 p.m. - 3:30 p.m.
|
David Duvenaud: Live Q&A
(
Live Q&A
)
|
🔗 |
Sat 3:30 p.m. - 3:45 p.m.
|
Afternoon Coffee Break
|
🔗 |
Sat 3:45 p.m. - 4:00 p.m.
|
Contributed Talk: Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data
(
Contributed Talk
)
SlidesLive Video » |
Shixiang Zhu 🔗 |
Sat 4:00 p.m. - 4:45 p.m.
|
Lester Mackey: Online Learning with Optimism and Delay
(
Invited Talk
)
SlidesLive Video » Inspired by the demands of real-time subseasonal climate forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms -- DORM, DORM+, and AdaHedgeD -- arise from a novel reduction of delayed online learning to optimistic online learning that reveals how optimistic hints can mitigate the regret penalty caused by delay. We pair this delay-as-optimism perspective with a new analysis of optimistic learning that exposes its robustness to hinting errors and a new meta-algorithm for learning effective hinting strategies in the presence of delay. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models. |
Lester Mackey 🔗 |
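The delay-as-optimism reduction in the talk above can be illustrated with a toy hedge learner: with feedback delayed by D rounds, the player acts on the observed cumulative loss plus an optimistic hint standing in for the D + 1 outstanding losses. Replaying the most recent observed loss as the hint, as below, is a simple assumed choice for illustration; the DORM/DORM+/AdaHedgeD algorithms and their tuning-free guarantees are more sophisticated.

```python
import numpy as np

def delayed_optimistic_hedge(losses, delay, eta):
    """Hedge over experts with delayed feedback. At round t only losses
    from rounds < t - delay are observed; the hint fills the gap by
    replaying the most recent observed loss for the outstanding rounds."""
    T, K = losses.shape
    plays = np.zeros((T, K))
    for t in range(T):
        observed = losses[: max(t - delay, 0)].sum(axis=0)
        if t - delay - 1 >= 0:
            # optimistic hint: pretend the last observed loss repeats
            # for the delay + 1 rounds still awaiting feedback
            hint = (delay + 1) * losses[t - delay - 1]
        else:
            hint = np.zeros(K)
        w = np.exp(-eta * (observed + hint))
        plays[t] = w / w.sum()
    return plays

# two experts: expert 0 always incurs loss 0, expert 1 always loss 1
losses = np.tile(np.array([0.0, 1.0]), (10, 1))
plays = delayed_optimistic_hedge(losses, delay=2, eta=1.0)
```

Despite the 2-round delay, the play distribution concentrates on the better expert once enough feedback arrives.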
Sat 4:45 p.m. - 5:00 p.m.
|
Contributed Talk: Ecological Inference using Constrained Kalman filters for the COVID-19 Pandemic
(
Contributed Talk
)
SlidesLive Video » |
Brian Lim 🔗 |
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data
(
Poster
)
Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and to offer support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and prevent large-scale outbreaks. This paper presents a spatio-temporal Bayesian framework for early detection of COVID-19 hotspots (at the county level) in the United States. We assume both the observed number of cases and the hotspots depend on a class of latent random variables, which encode the underlying spatio-temporal dynamics of COVID-19 transmission. Such latent variables follow a zero-mean Gaussian process, whose covariance is specified by a non-stationary kernel function. The most salient feature of our kernel function is that deep neural networks are introduced to enhance the model's representational power while keeping it interpretable. Our model demonstrates superior hotspot-detection performance compared to baseline methods. |
Shixiang Zhu 🔗 |
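A non-stationary kernel built from a neural network, as the abstract describes, can be sketched by applying a stationary RBF kernel in an NN-learned feature space (a "deep kernel"). The tiny random-weight MLP below is an illustrative assumption; the paper's actual kernel parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def nn_features(x, W1, b1, W2):
    """Small MLP mapping raw (location, time) inputs to a feature space
    in which a stationary kernel is applied."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2

def deep_kernel(X, W1, b1, W2, length_scale=1.0):
    """Non-stationary covariance in input space: an RBF kernel computed
    on the NN-transformed inputs (still a valid PSD kernel)."""
    F = nn_features(X, W1, b1, W2)
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

X = rng.normal(size=(5, 3))          # 5 points; features = (lat, lon, t)
W1 = rng.normal(size=(3, 8)); b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 4))
K = deep_kernel(X, W1, b1, W2)
```

Because the warping happens inside a standard RBF, the resulting Gram matrix is symmetric and positive semi-definite by construction, so it can serve directly as a GP covariance.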
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: Robust Price Optimization in Retail
(
Poster
)
At Walmart, our core mission is to help people save money so that they can live better. We accomplish this by applying downward pressure on our prices in order to increase traffic and sales in our stores. Prior work has developed an automated process for optimal price recommendation \cite{Prs}, including a Bayesian Structural Time Series demand forecasting component. In this paper, we extend the previous approach by incorporating robust optimization and an improved demand forecasting scheme with time-series clustering. The improved system is called the Robust Price Recommendation System, or PRS+. |
Linsey Pang 🔗 |
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: Time2Cluster: Clustering Time Series Using Neighbor Information
(
Poster
)
Time series clustering is an important task in its own right, and often a subroutine in other higher-level algorithms. However, clustering subsequences of a time series is known to be a particularly hard problem, and it has been shown that naive clustering of subsequences yields meaningless results under common assumptions. In this work, we introduce Time2Cluster, a novel representation and accompanying algorithm that meaningfully clusters time series subsequences. Our key insight is to avoid depending solely on relative distance information between subsequences, and instead to exploit information about the neighborhood subsequences. Our algorithm uses neighborhood information to mitigate the negative effects of small variations, such as phase shift, between the subsequences of time series data. |
Shima Imani 🔗 |
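The neighborhood idea above can be illustrated with a shared-nearest-neighbor similarity: two subsequences are similar when their k-nearest-neighbor sets overlap, which is more robust to small phase shifts than raw pairwise distance. This sketch is an assumed simplification, not the Time2Cluster representation itself.

```python
import numpy as np

def subsequences(ts, w):
    """All sliding windows of length w, z-normalized to unit norm."""
    subs = np.array([ts[i:i + w] for i in range(len(ts) - w + 1)])
    subs = subs - subs.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(subs, axis=1, keepdims=True)
    return subs / np.where(norms == 0, 1.0, norms)

def shared_neighbor_similarity(subs, k):
    """Similarity of two subsequences = fraction of shared members in
    their k-nearest-neighbor sets (self excluded from neighbors)."""
    d = np.linalg.norm(subs[:, None] - subs[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    knn = np.argsort(d, axis=1)[:, :k]
    n = len(subs)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            sim[i, j] = len(set(knn[i]) & set(knn[j])) / k
    return sim

ts = np.sin(np.linspace(0, 8 * np.pi, 120))   # toy periodic series
subs = subsequences(ts, w=20)
sim = shared_neighbor_similarity(subs, k=5)
```

The resulting similarity matrix can be handed to any standard clustering algorithm (spectral, hierarchical) in place of raw subsequence distances.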
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: Inferring Black Hole Properties from Astronomical Multivariate Time Series with Bayesian Attentive Neural Processes
(
Poster
)
Among the most extreme objects in the Universe, active galactic nuclei (AGN) are luminous centers of galaxies where a black hole feeds on surrounding matter. The variability patterns of the light emitted by an AGN contain information about the physical properties of the underlying black hole. Upcoming telescopes will observe over 100 million AGN in multiple broadband wavelengths, yielding a large sample of multivariate time series with long gaps and irregular sampling. We present a method that reconstructs the AGN time series and simultaneously infers the posterior probability density distribution (PDF) over the physical quantities of the black hole, including its mass and luminosity. We apply this method to a simulated dataset of 11,000 AGN and report precision and accuracy of 0.4 dex and 0.3 dex in the inferred black hole mass. This work is the first to address probabilistic time series reconstruction and parameter inference for AGN in an end-to-end fashion. |
Ji Won Park 🔗 |
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: High-Order Representation Learning for Multivariate Time Series Forecasting
(
Poster
)
Modeling the dynamic relations between recording channels and their long-term dependencies is critical in multivariate time series. Recent approaches leverage graph neural networks to capture the direct first-order relationships between channels. While this is useful for capturing co-occurrence patterns, it does not reveal the indirect higher-order relationships governed by latent processes. For example, electricity consumption at consumer ends can follow similar temporal patterns, but the simple correlation hides the fact that these patterns are driven by several unrecorded factors such as working activities over the day, humidity, and sunlight intensity, to name a few. To this end, we propose a dual message-passing recurrent neural system that disentangles the observed recording processes from the unobserved governing processes. Messages are passed in both bottom-up and top-down manners: the bottom-up signals are aggregated to capture governing patterns, while the top-down messages augment the dynamics of low-level processes. Each process maintains its own memory of historical data, allowing process-specific long-term patterns to form. The governing process memories are jointly accessible to each other, and they collectively capture the governing dynamics of the entire system. Through extensive experiments on real-world time-series forecasting datasets, we demonstrate the robustness and efficiency of our approach across different scenarios. |
Duc Nguyen 🔗 |
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: DMIDAS: Deep Mixed Data Sampling Regression for Long Multi-Horizon Time Series Forecasting
(
Poster
)
Developments in neural forecasting have shown significant improvements in the accuracy of large-scale systems, yet predicting extremely long horizons remains a challenging task. Two common problems are the volatility of the predictions and the computational complexity; we address them by incorporating smoothness regularization and mixed data sampling techniques into a well-performing multi-layer perceptron based architecture (NBEATS). We validate our proposed model, DMIDAS, on high-frequency healthcare and electricity price data with long forecasting horizons (~1000 timestamps), where we improve prediction accuracy by 5% over state-of-the-art models while reducing the number of parameters of NBEATS by nearly 70%. |
Cristian Challu 🔗 |
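Smoothness regularization of the kind mentioned above can be illustrated by penalizing squared second differences of the predicted trajectory, which discourages high-frequency volatility over very long horizons. The penalty form below is a minimal sketch assuming a discrete second-difference regularizer; the paper's exact regularizer may differ.

```python
import numpy as np

def smoothness_penalty(forecast, lam):
    """Sum of squared second differences of the forecast path; (near)
    zero for a straight line, large for a jagged trajectory."""
    return lam * (np.diff(forecast, n=2) ** 2).sum()

def regularized_loss(forecast, target, lam=0.1):
    """MSE fit term plus the smoothness penalty on the prediction."""
    mse = ((forecast - target) ** 2).mean()
    return mse + smoothness_penalty(forecast, lam)

horizon = np.linspace(0.0, 1.0, 1000)   # ~1000-step horizon, as in the paper
smooth_pen = smoothness_penalty(horizon, lam=0.1)
rng = np.random.default_rng(0)
noisy_pen = smoothness_penalty(horizon + rng.normal(0, 0.05, 1000), lam=0.1)
```

During training, `regularized_loss` would replace the plain MSE objective, trading a small amount of fit for visibly calmer long-horizon forecasts.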
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: Ecological Inference using Constrained Kalman filters for the COVID-19 Pandemic
(
Poster
)
We present a method for "ecological inference": learning individual-level associations from aggregate time series data. This problem has recently been highlighted by the COVID-19 pandemic, where demographic time series data are difficult to obtain while aggregate time series data are easily obtainable. It is not unreasonable to expect at least a small amount of reported data for the component time series, so our approach uses that assumption to construct a transition matrix element-wise, which is then used in a constrained Kalman filter applied to other aggregate time series. We consider COVID-19 case numbers across states and regions within states to estimate time series data in other states, and our results closely match the actual case numbers in those regions. |
Brian Lim 🔗 |
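A constrained Kalman filter step of the kind described above can be sketched as a standard update against an aggregate observation followed by a projection enforcing a domain constraint (here, nonnegative case counts). The summation observation model and clipping projection are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def constrained_kf_update(x_pred, P_pred, y_agg, H, R):
    """One Kalman update against an aggregate observation
    y_agg = H x + noise, then a projection enforcing x >= 0."""
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain
    x = x_pred + K @ (y_agg - H @ x_pred)         # state update
    P = (np.eye(len(x_pred)) - K @ H) @ P_pred    # covariance update
    x = np.clip(x, 0, None)                       # case counts cannot be < 0
    return x, P

# three regions whose sum is observed as a single aggregate count
x_pred = np.array([10.0, 5.0, -1.0])   # prior estimate (one infeasible entry)
P_pred = np.eye(3)
H = np.ones((1, 3))                    # aggregate = sum of the components
R = np.array([[0.5]])
x, P = constrained_kf_update(x_pred, P_pred, np.array([20.0]), H, R)
```

The update spreads the aggregate innovation across components, pulling the state sum toward the observed total while the projection keeps every component feasible.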
Sat 5:00 p.m. - 6:00 p.m.
|
Afternoon Poster Session: DAMA-Net: A Novel Predictive Model for Irregularly, Asynchronously and Sparsely Sampled Multivariate Time Series
(
Poster
)
Irregularly, asynchronously and sparsely sampled multivariate time series (IASS-MTS) data occur naturally in practical domains. They are characterized by sparse, non-uniform time intervals between successive observations and different sampling rates across series. These properties pose substantial challenges to contemporary machine learning models for learning the complicated intra-series and inter-series relations within and across IASS-MTS. To address these challenges, we present a time-aware Dual-Attention and Memory-Augmented Networks architecture (DAMA-Net). The proposed model leverages the time irregularity, multiple sampling rates, and global temporal pattern information inherent in the time series so as to learn more effective representations and improve prediction performance. We evaluate our model on two real-world datasets for IASS-MTS classification tasks. The results show that our model outperforms state-of-the-art methods in classification performance. Moreover, we conduct an ablation study to demonstrate the contributions made by the different mechanisms and modules in our model. |
Zhen Wang 🔗 |
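Time-aware attention over irregular samples, of the general kind the abstract invokes, can be sketched with weights that decay with the gap between each observation's timestamp and the query time, so stale observations of a sparsely sampled channel contribute less. The exponential decay and single-channel setup are illustrative assumptions, not DAMA-Net's actual mechanism.

```python
import numpy as np

def time_aware_attention(values, times, query_time, tau=1.0):
    """Pool irregularly timed observations with attention weights that
    decay exponentially in the time gap to the query instant."""
    gaps = np.abs(query_time - times)
    w = np.exp(-gaps / tau)
    w = w / w.sum()          # normalize to a proper attention distribution
    return w @ values

obs = np.array([1.0, 2.0, 3.0])     # one channel's sparse observations
obs_t = np.array([0.0, 1.0, 2.0])   # irregular observation times
pooled = time_aware_attention(obs, obs_t, query_time=2.0)
```

Unlike a plain mean, the pooled value leans toward the most recent observation, which is the behavior one wants when sampling rates differ across channels.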
Sat 6:00 p.m. - 6:05 p.m.
|
Awards and Closing Remarks
(
Closing
)
SlidesLive Video » |
🔗 |
Author Information
Yian Ma (UCSD)
Ehi Nosakhare (Microsoft)
Yuyang Wang (Amazon)
Scott Yang (D. E. Shaw & Co.)
Rose Yu (UC San Diego)