We appreciate all the reviewers' comments.

Reviewer 1
The value function and constraints used were selected by our medical co-author according to current medical practice.

Reviewer 2
2.a. In the revised manuscript, we will omit the abstract measure-theoretic descriptions and place more emphasis on the Bayesian approach.

2.b. In the offline stage, the forecaster knows the stopping time of the recorded series, which we denote as the reference time, t=0 [lines 540-550], whereas in the real-time stage the forecaster only knows the admission time; thus, we used "t" to denote the admission period [lines 585-595]. Alignment is applied as follows. All training time series are aligned by their stopping times, which are taken as the reference time, and density estimation is applied with respect to that reference. For a currently monitored patient, an estimate of the stopping time (reference time) is computed (eq. (7)), and this estimate is refined as additional measurements are observed (a minimal sketch of this procedure is given at the end of our response to Reviewer 2).

2.c. Novelty: 1) We handle the timeliness-accuracy trade-off through a novel decision-theoretic framework: we compute a dynamic posterior belief process that guides decision making given the costs and risks of the different decisions, whereas previous works on GP models, such as [Ghassemi et al., AAAI'15], have focused only on predicting the series values without decision-theoretic considerations. 2) Unlike AR and Markov models (e.g., [Stanculescu'14]), our model accommodates a more general correlation structure that can vary over time, which allows tracking a patient's (nonstationary) physiological trajectory that need not be representable by a Markovian or stationary AR model. 3) Our model continuously refines the alignment to the unknown reference time via eq. (7). This leads to the posterior belief process converging quickly to an accurate trajectory, which prompts a timely decision. Continuously refined alignment was not considered in any of the previous time-series models.

2.d. The suitability of the Gaussian model was validated by a Kolmogorov-Smirnov goodness-of-fit test (see Fig. 7 in the supplementary material). We do not assume that time instances are independent: the covariance matrix estimation in eq. (6) operates on a window of temporal measurements and thus captures temporal correlations. The constant latent class assumption is adopted because allowing a patient to switch classes at some time steps would convert the problem into a "change-point detection" problem, and training a model (e.g., a Shiryaev-Roberts statistic) to detect class change-points requires labeled change-points in the training set, which are not clinically recorded. The constant class assumption remains effective because we condition the belief process on a finite window of temporal data, so a class change can still be detected once data from the new class populate the most recent temporal window.

2.e. The benchmarks sequentially classify the ICU patients as additional measurements are observed over time. The feature values are normalized by the maximum value of each physiological stream. Feature selection was applied to the benchmarks. The benchmarks only used the current values of the physiological streams; thus, the window size for feature extraction is 1. The hyper-parameters were empirically optimized based on eq. (9) for all benchmarks.
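For concreteness, the following minimal sketch (Python/numpy, single physiological stream, i.e., D = 1) illustrates the offline alignment and the online reference-time estimation described in response 2.b; the windowed covariance it fits also illustrates how temporal correlations within the window are captured, as argued in response 2.d. The maximum-likelihood scan over candidate reference times is an illustrative stand-in for eq. (7), not its exact form, and all names (align_and_fit, estimate_reference_time, W) are ours, not the paper's.

    import numpy as np

    W = 4  # window size: number of most recent samples used per series

    def align_and_fit(training_series):
        # Align each training series at its stopping time (reference time t = 0)
        # by keeping its last W samples, then fit a joint Gaussian over the window.
        windows = np.stack([s[-W:] for s in training_series if len(s) >= W])
        mu = windows.mean(axis=0)
        # Regularized covariance over the window; captures within-window correlations.
        cov = np.cov(windows, rowvar=False) + 1e-6 * np.eye(W)
        return mu, cov

    def gaussian_loglik(x, m, c):
        # Log-density of an observed (sub-)window under the fitted Gaussian.
        d = x - m
        _, logdet = np.linalg.slogdet(c)
        return -0.5 * (d @ np.linalg.solve(c, d) + logdet + len(x) * np.log(2 * np.pi))

    def estimate_reference_time(observed, mu, cov):
        # Score the hypotheses "the reference time is k steps ahead of the latest
        # observation" for k = 0, ..., W-1 (limited by the window length in this
        # sketch) and return the best-scoring k. In the paper, this estimate is
        # refined via eq. (7) each time a new sample arrives.
        scores = []
        for k in range(W):
            n_obs = W - k  # only the first W - k window positions are observed so far
            if len(observed) < n_obs:
                scores.append(-np.inf)
                continue
            scores.append(gaussian_loglik(np.asarray(observed[-n_obs:]),
                                          mu[:n_obs], cov[:n_obs, :n_obs]))
        return int(np.argmax(scores))

As each new measurement of the monitored patient arrives, it is appended to observed and estimate_reference_time is re-run, mirroring the continuous refinement of the alignment described in responses 2.b and 2.c.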
Reviewer 3
3.a. In the revised manuscript, we will fix all the typos (including the typo in the Q function pointed out by the reviewer), and we will re-structure the paper. For details of the benchmarks, the fairness of the comparisons, and the solution of eq. (9), kindly refer to response (2.e.) to Reviewer 2.

3.b. It is not true that the performance of the algorithm is only just above chance. The rate of ICU-admitted patients in the dataset is only around 10%; thus, a random guess that flags patients at the base rate would discover only about 10% of clinically decompensated patients, with a PPV of about 10% (a worked comparison is given at the end of this response). Moreover, in order to assess the net clinical utility of the algorithm, the medical co-authors of this paper asked us to rule out all correct predictions that are prompted less than 4 hours before the actual clinicians' ICU admission decision. If these just-in-time correct predictions were also counted, the performance would be boosted to 85% TPR with 63% PPV instead of the reported 54% TPR with 50% PPV. In either case, the performance is significantly better than random guessing and better than the early warning systems currently deployed in hospitals, such as MEWS and the Rothman index (our approach leads to 22.3% and 34.7% gains with respect to MEWS and the Rothman index, respectively). This will be clearly stated in the revised manuscript.

3.c. A discussion of the algorithm's complexity will be added in Section 4. In the offline stage, where complexity is less crucial, the computational complexity is O(D^3*W^3 + N^2), where N is the number of patients, D is the number of features, and W is the window size. In the real-time stage, the computational complexity is O(N*D^2*W^2).
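For concreteness, the chance-level comparison in response 3.b can be written out as follows, assuming a random classifier that flags each patient independently of the true label with some probability p:

\[
\mathrm{PPV}_{\mathrm{random}} = \Pr(\mathrm{ICU} \mid \mathrm{flagged}) = \Pr(\mathrm{ICU}) \approx 0.10,
\qquad
\mathrm{TPR}_{\mathrm{random}} = p,
\]

so a PPV of 0.50 (or 0.63 when just-in-time predictions are counted) is roughly a five- to six-fold improvement over chance, regardless of the choice of p.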