Paper ID: 959
Title: The Arrow of Time in Multivariate Time Series

Review #1
=====
Summary of the paper (Summarize the main claims/contributions of the paper.):
The authors present theoretical results showing that if a VAR process is time-reversible, then the distribution of the residuals is Gaussian. This indicates that, under non-Gaussian residuals, the direction of time of a VAR process can be inferred from the data. This also enables the identification of the direction of time in other processes, such as VARMA, which can be expressed as a particular case of a VAR process. To infer the direction of time, one simply looks for independence between the residual and the current observation. For this, the HSIC test is employed. This technique is used to determine the direction of time in synthetic time series sampled from the assumed model and in some datasets that involve financial time series and video snippets, although the latter experiments were performed in a different paper and are only referred to in the present one.

Clarity - Justification:
The description of the theoretical results is clearly stated. However, I find it confusing that the authors move directly from the result "if a VAR process is time-reversible, then the residuals are normally distributed, and if the residuals are normally distributed, the process is time-reversible" to testing for independence between the observation and the residual in order to determine the direction of time. This step must be better motivated in the paper.

Significance - Justification:
The experiments with real-world data are somewhat limited. This is surprising given the wide availability of real-world time series that are commonly described using multivariate linear models. Also, the authors only compare with a single baseline, LiNGAM. In the introduction they mention an approach based on Gaussianity measures, but they do not compare with that method.
This limits the significance of the experimental results. It is not clear how useful the proposed approach is for practical purposes. The authors mention that time series data with known ground truth is useful for testing new causal methods. They should perhaps elaborate on this in the paper to give it more significance.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
The paper presents new and relevant theoretical results about the reversibility of linear time series. These results should be better connected with the proposed methodology for identifying the direction of time. The experimental section could also be improved by including more real-world results and additional baselines.

Small typo: the first author name in the reference "Hernández-Lobato, D., Morales-Mombiela, P., and Suárez, A. Gaussianity measures for detecting the direction of causal time series." is wrong and should instead be Hernández-Lobato, J. M.

=====
Review #2
=====
Summary of the paper (Summarize the main claims/contributions of the paper.):
The paper shows that a VARMA model is time-reversible only if the innovations are Gaussian. The authors propose an algorithm for detecting the time direction.

Clarity - Justification:
The paper is clearly written. The last line of the caption of Figure 2 says "second row (d) and (f)". Should it be (e) instead of (f)? On page 5, should "Proof of Theorem 2.2" be "Proof of Theorem 4.1"?

Significance - Justification:
It is a very interesting paper that investigates the reversibility of VARMA series. The results are novel and significant.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
The paper considers the equivalence between a K-dimensional VARMA(p,q) process and a K(p+q)-dimensional VAR(1) process. Given the VARMA process, the parameters of the equivalent VAR(1) process have certain characteristics.
However, it is not clear why the authors set the parameters using the form shown in Section 5.1, using lambda, R and Q. Would it be more interesting to see the performance using VARMA processes? The results show the correctly classified and wrongly classified cases. It would also be interesting to see the distribution of the two non-decision cases. The paper does not mention what method was used to estimate the VAR parameters in the forward and backward directions in the experiments. It is unclear whether the parameter estimation method affects the empirical performance.

=====
Review #3
=====
Summary of the paper (Summarize the main claims/contributions of the paper.):
This paper proposes to learn the temporal direction of time series data under the assumption of a VARMA model with non-Gaussian noise terms. The paper seems like a fairly incremental extension of the results of Peters et al., 2009.

Clarity - Justification:
The paper was fairly easy to follow.

Significance - Justification:
I think this is an important problem, but most of the ideas seem to be already present in Peters et al., 2009. I think it would be good to emphasize more clearly why the extension presented in this paper is difficult.

Detailed comments. (Explain the basis for your ratings while providing constructive feedback.):
The authors discuss the similar approach of Shimizu et al. and point out that it lacks theoretical justification due to the possibility of confounding (p. 2). I found it slightly puzzling, then, that in the very next paragraph the authors assume away confounding themselves by assuming that innovations are independent of the past. It seems to me that the theoretical justification for their approach is then precisely the same as for the work of Shimizu et al. It would have been interesting to see an empirical investigation of model violations and how they affect the decision: does the algorithm remain undecided, as the authors claim on p. 5, or can it be misled about the direction?
=====