We kindly thank the reviewers for their careful comments.

Reviewer 1

- Typos: We are very grateful to the reviewer for pointing these out. All typos will be fixed in the final version.

- Table 1: Previous results give an (e+\epsilon)-approximation, not an (e + small constant + \epsilon)-approximation; we thank the reviewer for noticing this correction. The (e+\epsilon) result follows from applying the results of Feldman, Naor, Schwartz 2011 to the contention resolution techniques of Chekuri, Vondrák, Zenklusen 2014 (the original STOC paper is from 2011); Feldman et al. do not formalize this as a theorem, but state it in their introduction. p-matroid + l-knapsack: this result is stated in Table 1 of Chekuri, Vondrák, Zenklusen 2014. p-system approximation ratio of (p+1)(3p+3)/p by Gupta et al.: this follows by substituting the \alpha = 1/2 approximation for unconstrained maximization into their Theorem 3.3. p-matroid approximation ratio of p+1+1/(p-1)+\epsilon by Lee et al.: this is their Theorem 4.

- How the p-matroid + l-knapsack result simplifies to the p-system and p-matroid results at the bottom right of the table: Observe that each call to IGDT in FANTOM returns the same solution when l = 0, because there is no knapsack constraint for the density to affect. Hence, as in Gupta et al. for the case l = 0, we obtain a (p+1)(2p+1)/p approximation guarantee with O(nrp) running time (see the short derivation after this list). We will add a proposition in the final version. Thanks for the suggestion.

- Intuition about the performance of the baselines: Greedy picks elements based on their marginal gains while ignoring their costs. Therefore, if elements with high marginal gains are expensive, it soon runs out of budget and cannot proceed further. Density greedy, on the other hand, considers only the ratio of marginal gain to knapsack cost, so cheap elements with low marginal gains may be selected; combined with a matroid constraint that limits the solution size, this can yield arbitrarily bad solutions. FANTOM tolerates both cases and provides a desirable solution, as can be seen in the experiments. As an example, consider 5 elements E = {e_1, ..., e_5} such that f(e_1) = 1, c_1 = B; f(e_2) = f(e_3) = 1/\lambda, c_2 = c_3 = B/(2\lambda); and f(e_4) = f(e_5) = 1-\epsilon, c_4 = c_5 = B/2. With budget B and a matroid constraint limiting the solution size to 2, we get f(greedy solution) = 1 and f(density greedy solution) = 2/\lambda (so for large \lambda, the performance of density greedy degrades arbitrarily). FANTOM, however, obtains a good solution of value 2-2\epsilon by picking e_4 and e_5 (see the sketch after this list).

- More detailed explanations in the proofs would be nice: Thanks for the suggestion; we will add more explanation in the final version.

- I don't think a specific C is defined anywhere? C is explicitly defined in the statement of Theorem 5.1: it is any feasible set.

- What about the case where FANTOM terminates not because adding the next item would violate a constraint, but simply because it cannot find an item of sufficient density? This also falls under case two of the analysis, in which an item is not added either because it lacks sufficient density or because of the p-system constraint. There we define the set C_{<\rho}, which consists of the items with insufficient density.

- Notation suggestion: Thanks for the suggestion; we will replace \rho to improve clarity.
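For concreteness, a sketch of the planned proposition (this assumes the general guarantee of Theorem 5.1 has the form (1+\epsilon)(p+1)(2p+2l+1)/p; the exact statement is in the paper): setting l = 0 gives

(p+1)(2p+2l+1)/p |_{l=0} = (p+1)(2p+1)/p,

and the (1+\epsilon) factor disappears because, without knapsacks, the enumeration over density thresholds \rho is unnecessary (every call to IGDT returns the same solution), which also removes the 1/\epsilon dependence from the running time and leaves the stated O(nrp). For instance, p = 1 (a single matroid) gives a 6-approximation.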
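Below is a minimal runnable sketch of the baseline comparison above (hypothetical illustration code, not from the paper; B = 1, \lambda = 100, \epsilon = 0.01 are illustrative values, and f is modular on this toy instance):

    # Toy instance: budget B, a cardinality-2 matroid constraint, and a
    # modular objective f(S) = sum of per-element values.
    B, lam, eps = 1.0, 100.0, 0.01
    value = {1: 1.0, 2: 1/lam, 3: 1/lam, 4: 1 - eps, 5: 1 - eps}
    cost  = {1: B, 2: B/(2*lam), 3: B/(2*lam), 4: B/2, 5: B/2}

    def run(score):
        # Greedily add the best feasible element, ranking candidates by
        # `score` (marginal gain for greedy, gain/cost for density greedy).
        S, spent = [], 0.0
        while len(S) < 2:  # matroid (cardinality) constraint
            cand = [e for e in value if e not in S and spent + cost[e] <= B]
            if not cand:
                break
            e = max(cand, key=score)
            S.append(e)
            spent += cost[e]
        return S, sum(value[e] for e in S)

    print(run(lambda e: value[e]))            # greedy picks e_1: f = 1
    print(run(lambda e: value[e] / cost[e]))  # density greedy picks e_2, e_3: f = 2/lam
    # FANTOM would instead recover {e_4, e_5} with f = 2 - 2*eps.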
Reviewer 2

- The algorithms and the analysis are a combination of previous work by Gupta et al. '10 and Badanidiyuru, Vondrák '14; the paper also obtains a slight tightening of the analysis by Gupta et al. '10, which leads to an improvement in the approximation factor: As pointed out by Reviewer 3, the performance of FANTOM is comparable to that of state-of-the-art algorithms that are specialized for a particular set of constraints. To the best of our knowledge, there is no fast and practical algorithm with provable approximation guarantees for maximizing general submodular functions under p-system + l-knapsack constraints.

- All the scenarios considered can be modeled with exclusively knapsack constraints, so a natural choice for comparison would be Chekuri, Jayram, Vondrák ITCS '15; the largest instance here (n < 40000) is not too big for their O(n^2) runtime: In fact, we cannot apply Chekuri, Jayram, Vondrák ITCS '15. Their algorithm achieves the O(n^2) runtime only for the fractional solution maximizing the multilinear extension F(x). Converting this to an integral solution f(S) requires an expensive enumeration step (as explained in the paper) over sets of size poly(1/\epsilon), i.e., O(n^{poly(1/\epsilon)}) candidate sets, which results in a total runtime of n^{poly(1/\epsilon)}.

- Lines 474, 477: descritize -> discretize; the years are missing in the citations of Hartline et al. and Lindgren et al.: Thanks for pointing these out; we will fix them in the final version.

Reviewer 3

Thanks for your encouraging comments.