CORD-19:6e2db98aa242e3e84116afdfad5e250943c55ac5 JSONTXT 12 Projects

Annnotations TAB TSV DIC JSON TextAE

Id Subject Object Predicate Lexical cue
T1 0-69 Epistemic_statement denotes Early detection of dynamic harmful cascades in large-scale networks ଝ
T2 80-217 Epistemic_statement denotes Quickly detecting harmful cascades in networks can allow us to analyze the causes and prevent further spreading of destructive influence.
T3 371-567 Epistemic_statement denotes However, the harmful cascades are usually dynamic (e.g., the cascade initiators and diffusion trajectories can change over the time), which can severely destroy the robustness of selected sensors.
T4 673-830 Epistemic_statement denotes Motivated by the observation, in this paper we investigate the scalable sensor selection problem for early detection of dynamic harmful cascades in networks.
T5 1304-1436 Epistemic_statement denotes Based on that, we propose an efficient upper bound based greedy (UBG) algorithm with the theoretical performance guarantee reserved.
T6 1881-2038 Epistemic_statement denotes A contagious disease like Severe Acute Respiratory Syndrome (SARS) can spread quickly through a population contact network and lead to an epidemic [12, 34] .
T7 2039-2162 Epistemic_statement denotes A computer virus on a few servers can fast spread to other servers or computers in a computer connection network [15, 17] .
T8 2163-2281 Epistemic_statement denotes In a similar vein, a rumor started by a few individuals can spread quickly through the online social network [2, 26] .
T9 2282-2473 Epistemic_statement denotes It is crucial to detect the harmful cascades as soon as they happen or shortly thereafter, since it allows us to study the causes and prevent further spreading of harmful influence [32, 38] .
T10 2474-2698 Epistemic_statement denotes In practice, it is commonly infeasible or unaffordable to monitor all nodes at all times, and therefore a common way to detect cascades is to select important nodes where we can place sensors for monitoring [7, 19, 31, 39] .
T11 2699-2979 Epistemic_statement denotes However, existing methods usually viewed the cascade data as static and deterministic, ignoring an important fact that the harmful cascades are usually dynamic and time-variant, in the sense that the cascade initiators and diffusion trajectories can change randomly over the time.
T12 2980-3117 Epistemic_statement denotes Actually the dynamic brings new challenges to network monitoring, since it can severely destroy the robustness of selected sensors [22] .
T13 3118-3288 Epistemic_statement denotes Meanwhile the existing sensor selection algorithms can only work for small networks and not be scalable well to large networks of the https://doi.org/10.1016/j.jocs.2017.
T14 3360-3533 Epistemic_statement denotes Motivated by these observations, in this paper we investigate the scalable solutions to sensor selection problem for early detection of dynamic harmful cascades in networks.
T15 4109-4444 Epistemic_statement denotes Under this unified network framework, we specifically propose a new dynamic cascade model to describe harmful cascade diffusions, and define a detection time minimization (DTM) problem S * = argmin |S|=k,S⊆V D(S), where k is a given parameter determined by budget or monitor capacity, S is sensor nodes set, and D(S) is detection time.
T16 4858-5086 Epistemic_statement denotes To further address the scalability issue, we propose an efficient upper bound based greedy (UBG) algorithm and its two accelerations Quickest-Path-UBG and Local-Reduction-UBG to cater for different types of large-scale networks.
T17 5307-5505 Epistemic_statement denotes Detecting the whole network by monitoring finite sensor nodes has been widely applied to detect water contaminations in a water distribution network [19] and virus outbreaks in a human society [7] .
T18 5632-5760 Epistemic_statement denotes However the effects along this idea are commonly unsatisfied, since they ignore the multi-step complexity of network diffusions.
T19 6184-6315 Epistemic_statement denotes In their works the data are a set of deterministic scenarios and the random property of diffusion are not taken into consideration.
T20 6385-6575 Epistemic_statement denotes In addition to the above two types of methods, another related work [1] set sensors along fixed paths in the network so as to gather sufficient information to locate possible contaminations.
T21 7174-7351 Epistemic_statement denotes By contrast, our work is geared to investigate the scalable sensor selection problem for early detection of dynamic harmful cascades in networks from a model-driven prospective.
T22 7352-7659 Epistemic_statement denotes The main difference from previous works is that we take dynamic properties of cascades into consideration: (1) the cascade initiator is dynamic since we do not know who will initiate another harmful diffusion next time, and (2) the diffusion process is dynamic since the propagation trajectory is uncertain.
T23 7660-7789 Epistemic_statement denotes Under these two dynamic sources, we aim to accurately and fast select k nodes as sensors for early detection of harmful cascades.
T24 7790-7958 Epistemic_statement denotes Besides, as the size of network becomes more and more large-scale, we also need to ensure the effectiveness of the proposed sensor selection methods for large networks.
T25 8513-8631 Epistemic_statement denotes • We show the NP-hardness of DTM problem as it can be shown to contain the Set Cover problem as a simple special case.
T26 8844-8913 Epistemic_statement denotes • We theoretically establish new upper bounds for the remaining time.
T27 8914-9128 Epistemic_statement denotes Based on these bounds, we further propose a new Upper Bound based Greedy (UBG in short) algorithm which can significantly reduce the number of estimations in greedy-based algorithms, especially at the initial step.
T28 9355-9485 Epistemic_statement denotes In addition, our methods not only aim at early detection of harmful cascades, but also shed light on a host of other applications.
T29 9486-9639 Epistemic_statement denotes For example, the goal of Emerging Topic Detection is to identify emergent topics in a social network, assuming full access to the stream of all postings.
T30 9895-10150 Epistemic_statement denotes Another example, an emerging trend in algorithmic stock trading is the use of automatic search through the Web and social networks for pieces of information that can be used in trading decisions before they appear in the more popular news sites [23, 27] .
T31 10256-10358 Epistemic_statement denotes The theoretical framework and method established in this paper can also be used in these applications.
T32 10739-10860 Epistemic_statement denotes To prune the unnecessary estimation calls, we analysis the upper bounds for the new proposed UBG algorithm in Section 5 .
T33 11596-11805 Epistemic_statement denotes By considering operational models for the dynamic diffusion of a harmful item through a network G, represented by a directed graph, we will speak of each individual node as being either infected or uninfected.
T34 11806-12027 Epistemic_statement denotes Motivated by the properties of contaminants like rumor and virus, we will focus from now on the progressive case in which nodes can switch from being uninfected to being infected, but do not switch in the other direction.
T35 12166-12450 Epistemic_statement denotes Thus, the process can be reviewed with respect to some particular uninfected node v: as time unfolds, more and more of v's neighbors become infected; at some point, this may cause v to become infected, and v's decision may in turn trigger further infections by nodes connected with v.
T36 12451-12658 Epistemic_statement denotes Motivated by above observation, we put forward a dynamic susceptible-infected (DSI) model for the harmful cascade spreading, which can be seen as a variant of the common susceptible-infected (SI) model [3] .
T37 12905-13000 Epistemic_statement denotes can be seen as a priori knowledge of nodes being infected initially, where u ∈ V (u) = 1 holds.
T38 13294-13417 Epistemic_statement denotes Then, at step t + 1, each node u ∈ J t may infect its out-neighbors v ∈ V \J t with an independent probability of ip(u, v).
T39 14330-14549 Epistemic_statement denotes Note that the detection time (u, S) can be viewed as a special type of stopping time in stochastic process theory [11] .ᮀ Let the variable X denote the random initiator with distribution , i.e., for every u ∈ V, we have
T40 14550-14656 Epistemic_statement denotes The (expected) detection time from a random initiator to one of the selected sensors in S can be define as
T41 14657-14716 Epistemic_statement denotes where E is an expectation operator under the cascade model.
T42 14717-14820 Epistemic_statement denotes By the conditional expectation, we can calculate the expected detection time D(S) in the following way:
T43 14821-14832 Epistemic_statement denotes Indeed, Eq.
T44 14833-14861 Epistemic_statement denotes (5) can be reached like this
T45 14862-14921 Epistemic_statement denotes where (X) denotes the -algebra generated by variable X. Eq.
T46 15082-15245 Epistemic_statement denotes Our goal is to find k nodes as sensors in a network in order to minimize the time until an infection -starting from a random initiator in the network -is detected.
T47 15246-15430 Epistemic_statement denotes Formally, we formulate the problem as the following discrete optimization problem: we want to find a subset S * ⊆ V such that |S * | = k and D(S * ) = min{D(S)||S| = k, S ⊆ V}, i. e .,
T48 15431-15501 Epistemic_statement denotes where k is a given parameter determined by budget or monitor capacity.
T49 15502-15578 Epistemic_statement denotes We call this as detection time minimization problem (DTM problem for short).
T50 15707-15851 Epistemic_statement denotes (6), the first issue is how to calculate the objective function D(S) given a sensor set S. This question isn't as easy as its description in Eq.
T51 16034-16227 Epistemic_statement denotes Thus, it is not initially obvious that the process is even well-defined, in the sense that it yields the same distribution over outcomes regardless of how we schedule the attempted activations.
T52 16228-16365 Epistemic_statement denotes Actually the computation of D(S) is #P-hard, by showing a reduction from the positive partitioned 2-DNF assignment counting problem [9] .
T53 16523-16725 Epistemic_statement denotes ᮀ Since it is intractable to exactly compute D(S) on a typically sized graph, a natural idea is to employ Monte-Carlo methods to estimate D(S), which can be implemented in two different ways as follows:
T54 16726-16912 Epistemic_statement denotes The expected detection time D(S) is obtained by directly simulating the random process of diffusion triggered by a random node, say u, chosen according to the distribution defined as Eq.
T55 17778-17948 Epistemic_statement denotes We can flip all coins a priori to produce a weighted graph G = (V, E, c), where an edge (u, v) is labeled with the time cost c(u, v) -a sample of random variable C(u, v).
T56 18898-19076 Epistemic_statement denotes For estimating a specific D(S), the simulation method is faster, because it only needs to examine a small portion of edges while the snapshot method has to examine all the edges.
T57 19515-19660 Epistemic_statement denotes (6), as it can be shown to contain the Set Cover problem as a simple special case, then we propose a simple greedy algorithm for the DTM problem.
T58 19829-19918 Epistemic_statement denotes ., u n }, we wish to know whether there exist k of the subsets whose union is equal to U.
T59 19919-19953 Epistemic_statement denotes We can assume that k < n < m here.
T60 19954-20031 Epistemic_statement denotes We show that this can be viewed as a special case of the optimal problem (6).
T61 20345-20458 Epistemic_statement denotes Define the probability distribution = { (u), u ∈ V} on the knowledge of nodes being initially infected as follows
T62 20459-20576 Epistemic_statement denotes The Set Cover problem is equivalent to deciding if there is a set S of k nodes in this bipartite graph with D(S) ≤ 1.
T63 20693-21115 Epistemic_statement denotes Monitoring k nodes to detect diffusion initiated from U corresponding to sets in a Set Cover solution results in covering all n nodes corresponding to the ground set U, and if any set S of k nodes has D(S) ≤ 1, then the Set Cover problem must be solvable.ᮀ Since the optimization problem is NP-hard and the network is prohibitively large, we cannot compute the optimum value to verify the actual quality of approximations.
T64 21196-21290 Epistemic_statement denotes To make better use of the greedy algorithm, we consider an equivalent optimization problem Eq.
T65 21502-21531 Epistemic_statement denotes Theoretically, a non-negative
T66 21532-21553 Epistemic_statement denotes for all S ⊆ T. Proof.
T67 22021-22136 Epistemic_statement denotes (14) is proven and we thus get that R(S) is submodular.ᮀ By above properties in Theorem 4, the problem given in Eq.
T68 22137-22408 Epistemic_statement denotes (9) can be approximated by the greedy algorithm in Algorithm 1 with the set function f : = R. For any submodular and monotone function f with f(∅) =0, the problem of finding a set S of size k that maximizes f(S) can be approximated by the greedy algorithm in Algorithm 1.
T69 22580-22755 Epistemic_statement denotes It is shown that the algorithm guarantees an approximation ratio of f(S)/f(S * ) ≥ 1 −1/e, where S is the output of the greedy algorithm and S * is the optimal solution [30] .
T70 22756-22972 Epistemic_statement denotes In Greedy(k, R), a thorny problem is that there is no efficient way to compute R(S) given a placement S, and we turn to run snapshots for 10, 000 trials to obtain an accurate estimate of R(S), mentioned in Section 3.
T71 23291-23552 Epistemic_statement denotes Hence, in order to improve the efficiency of Greedy(k, R), one can either reduce the number of calls for evaluating R(S), or develop advanced heuristic algorithms which can conduct fast and approximate estimations for R(S) at the expense of accuracy guarantees.
T72 23715-23908 Epistemic_statement denotes The principle behind is that the marginal gain of a node in the current iteration cannot be more than that in previous iterations, and thus the number of estimation calls can be greatly pruned.
T73 24159-24399 Epistemic_statement denotes In particular, in the first round to establish the initial upper bounds, CELF needs to estimate R({v}) using MC simulations for each node v, leading to N times of MC calls, which is time-consuming, especially when the network is very large.
T74 24400-24600 Epistemic_statement denotes The limitation leads to a rather fundamental question that, can we derive an upper bound of R({v}) which can be used to further prune unnecessary detection time estimations (MC calls) in Greedy(k, R)?
T75 24728-24870 Epistemic_statement denotes Based on the bound, we propose a new greedy algorithm Upper Bound based Greedy (UBG for short), which outperforms the original CELF algorithm.
T76 24871-24981 Epistemic_statement denotes Essentially different from the bounds that derived for the influence spread under the IC model [40] , we here
T77 24982-25036 Epistemic_statement denotes In this part, we aim to derive an upper bound of R(v).
T78 25288-25364 Epistemic_statement denotes For v ∈ V , the remaining time R(v) under the DSI model can be calculated as
T79 25365-25371 Epistemic_statement denotes Proof.
T80 25505-25646 Epistemic_statement denotes ᮀ Proposition 2 reveals that we can treat the global remaining time R(v) as a summation of all T max propagation steps of local probabilities
T81 25647-25738 Epistemic_statement denotes Based on Proposition 2, a following question is, what is the relationship between two sets,
T82 25739-25753 Epistemic_statement denotes Proposition 3.
T83 25805-25885 Epistemic_statement denotes For k ≥ 1, by the definition of conditional expectation and DSI model, we obtain
T84 25886-25954 Epistemic_statement denotes where the first '≤' is due to 1 {v/ ∈J k−1 } ≤ 1, and the second '≤'
T85 25955-26142 Epistemic_statement denotes x i in the above derivation.ᮀ Proposition 3 clearly identifies the ordering relationship between two adjacent elements in the series P u (v ∈ J 0 ), P u (v ∈ J 1 ), P u (v ∈ J 2 ), · · ·.
T86 26484-26542 Epistemic_statement denotes Now we can rewrite Proposition 3 by using the matrix form,
T87 26543-26627 Epistemic_statement denotes By iteration, we further get that u t ≤ u 0 · (E + IP) t , where E is a unit matrix.
T88 26775-26875 Epistemic_statement denotes With the above preparations, we can present the results on upper bound of remaining time as follows,
T89 26876-26945 Epistemic_statement denotes where E is a unit matrix and [A] (u,v) means the element at position
T90 26946-27029 Epistemic_statement denotes where is a prior distribution on the likelihood of nodes being the infected source.
T91 27030-27101 Epistemic_statement denotes We first use a toy example to explain how to calculate the upper bound.
T92 27286-27370 Epistemic_statement denotes To overcome the difficulty, we adopt the following procedure to calculate Tmax−1 t=0
T93 27371-27493 Epistemic_statement denotes If t ≥1 ≤ T max − 1, it implies that we do not need to calculate (E + IP) t any more when t ≥ t ≥1 , and therefore we have
T94 27494-27546 Epistemic_statement denotes where 1 is a N × N matrix with all elements being 1.
T95 27822-27908 Epistemic_statement denotes Based on the upper bound, we propose a new UBG algorithm for early outbreak detection.
T96 28062-28246 Epistemic_statement denotes However, CELF demands N (the number of nodes in the network) remaining time estimations to establish the initial bounds of marginal increments, which is time expensive on large graphs.
T97 28440-28573 Epistemic_statement denotes This way, the nodes are ranked by their upper bound scores which can be used to prune unnecessary calculations in the CELF algorithm.
T98 28610-28748 Epistemic_statement denotes Even with the UBG algorithm proposed in Section 5, its running time is still unbearable and may not be suitable for large social networks.
T99 28749-28924 Epistemic_statement denotes Although UBG can greatly reduce the number of remaining time estimation calls, each estimation call is very timeconsuming, as it needs produce enough samples and average them.
T100 28925-29091 Epistemic_statement denotes Hence a possible alternative to further accelerate UBG is to employ heuristics to approximate the remaining time R(S) or the marginal return R(S ∪ {u}) − R(S) in UBG.
T101 29234-29474 Epistemic_statement denotes The experimental results will show that (i) both of them are efficient in terms of running time, and (ii) the former one applies to sparse networks better, while the latter one is more effective in dense networks in terms of detection time.
T102 29584-29653 Epistemic_statement denotes (10), the key point of estimating R(S) is how to estimate E[ (u, S)].
T103 29654-29966 Epistemic_statement denotes Note that E[ (u, S)] measures the expected time delay of a diffusion initiated from node u propagating to sensor set S. An intuitive idea is that the most likely propagation path should be the quickest path from node u to the set S. Hence, a question is how to measure the quickest path from node u to the set S?
T104 30037-30177 Epistemic_statement denotes According to the geometric distribution, if a random variable X is distributed geometrically with a parameter p, it follows that E[X] = 1/p.
T105 30178-30442 Epistemic_statement denotes Since the time that a node u spends in infecting its uninfected neighborhood v is distributed geometrically with the parameter ip(u, v), the value 1/ip(u, v) should be the expected time cost that an infection propagates from node u to node v along the edge (u, v).
T106 30696-30841 Epistemic_statement denotes (27) Intuitively, d(u, S) denotes the expected time that an infection propagates from u to S along the quickest path in the graph G = (V, E, m) .
T107 30977-31009 Epistemic_statement denotes Theoretically we have Theorem 6.
T108 31010-31171 Epistemic_statement denotes When the shortest path from u to S in graph G = (V, E, m) is unique, the quantity d(u, S) ∧ T max can be used to approximate the detection time E[ (u, S)], i.e.,
T109 31172-31178 Epistemic_statement denotes Proof.
T110 31437-31476 Epistemic_statement denotes (28), we have the following derivation,
T111 31477-31520 Epistemic_statement denotes Hence we can approximate the remaining time
T112 31521-31611 Epistemic_statement denotes rather than the heavy Monte-Carlo estimation MC(S ∪ {u}) in the 09th row of UBG algorithm.
T113 31706-31945 Epistemic_statement denotes Note that the reason we call this acceleration as Quickest-Path-UBG rather than Shortest-Path-UBG lies in that, our Quickest-Path-UBG introduces a new measurement to reflect the infection time cost rather than the distance in common sense.
T114 32091-32275 Epistemic_statement denotes To facilitate the description, we will approximate the marginal reduction D(S) − D(S ∪ {u}) rather than the marginal return R(S ∪ {u}) − R(S), due to the fact that R(S) = T max − D(S).
T115 32418-32580 Epistemic_statement denotes These methods usually looked over the multi-step complexity of network diffusions, and assumed that the infection can propagate ahead only one hop before the end.
T116 32581-32689 Epistemic_statement denotes In other words, if a node initiates an infection, the diffusion spread is at most its first-order neighbors.
T117 32690-32735 Epistemic_statement denotes We call this as one-hop assumption hereafter.
T118 32973-33116 Epistemic_statement denotes Under this observation, we will propose a tractable method upon the one-hop assumption to approximate the marginal reduction D(S) − D(S ∪ {u}).
T119 33117-33554 Epistemic_statement denotes We first introduce a useful result in probability theory: if the random variable X i is distributed geometrically with parameter p i for i = 1, · · ·, n and they are independent, then ∧ i=1,· · ·,n X i is distributed geometrically with parameter 1 − According to the characteristic of the DSI model, the time cost that an infected node w spends in infecting its uninfected neighbor v is distributed geometrically with parameter ip(w, v).
T120 33680-33774 Epistemic_statement denotes Under the onehop assumption, the expected detection time of w can thus be calculated like this
T121 33775-33896 Epistemic_statement denotes If the initially infected node w has no neighbors in sensor set S, the expected detection time of w is defined as T max .
T122 35738-35822 Epistemic_statement denotes The Twitter and Epinions data sets can both be obtained from Stanford Datasource 2 .
T123 35823-35954 Epistemic_statement denotes Epinions is a general consumer review site where visitors can read reviews about a variety of items to help them decide a purchase.
T124 35955-36072 Epistemic_statement denotes The synthetic Small-world data set is the type of graphs in which each node can be reached by a small number of hops.
T125 36933-37066 Epistemic_statement denotes A heuristic algorithm which requires any pair of sensors to be at least d hops away, where d is as large as it can choose k monitors.
T126 37864-38159 Epistemic_statement denotes We mainly report results on a uniform infection probability of 0.1 assigned to each directed link in the network, i.e., ip(u, v) = 0.1 for any directed edge (u, v) ∈ E. One can refer to the work [14, 33, 36] for learning real values of the parameters {ip(u, v) : (u, v) ∈ E} from available data.
T127 38414-38663 Epistemic_statement denotes Here we select 10 nodes with the highest in-degrees in each network as the sensor set S. We can find that Propagation Simulation and Snapshot Simulation release almost the same estimation results, which confirms their equivalence in estimating D(S).
T128 38912-39098 Epistemic_statement denotes We can see that the cumulative time cost of Propagation Simulation increases linearly, while that of Snapshot Simulation has a big jump at the first sensor set and then increases slowly.
T129 39099-39310 Epistemic_statement denotes The reason behind is that Snapshot Simulation needs to establish numerous snapshots to estimate the first D(S 1 ), and these established snapshots can be reused in the posterior estimations of {D(S i )} 10 i=2 .
T130 39311-39400 Epistemic_statement denotes Table 4 shows the gap between the real value of remaining time R(S) and its upper bounds.
T131 39678-39847 Epistemic_statement denotes From the results, we can observe that the call number in UBG and its two accelerations are significantly reduced compared to that in CELF, especially at the first round.
T132 39848-39985 Epistemic_statement denotes One may notice that in Table 5 , CELF occasionally defeats our methods, but the total call number of our methods are much less than CELF.
T133 40265-40362 Epistemic_statement denotes From the observation, we can conclude that our UBG is more efficient than CELF on large networks.
T134 41459-41543 Epistemic_statement denotes One may argue that such a low improvement of UBG can be neglected in large networks.
T135 42294-42381 Epistemic_statement denotes We assign a uniform infection probability ip to each directed link under the DSI Model.
T136 42407-42502 Epistemic_statement denotes 7 , we can conclude that with the parameter ip growing larger, the CELF is more time-consuming.
T137 43266-43394 Epistemic_statement denotes We take infimum operations on both sides of the first step conclusion, i.e., inf T(u, S 1 ∪ S 2 ) = inf T(u, S 1 ) ∪ T(u, S 2 ).
T138 43734-44069 Epistemic_statement denotes 3) Similarly, in order to prove this property, we only need to , {u}) , i.e., for any t ≥ 0, the probability that t is located in T(u, {v}) equals the probability that t is located in 4) If S ⊂ T, in order to prove this property, we only need to prove T(u, S) ⊂ T(u, T ) and notice the fact that the infimum of a larger set is smaller.
T139 44203-44285 Epistemic_statement denotes So, T(u, S) ⊂ T(u, T ), by which we can easily conclude the property, is verified.
T140 44489-44536 Epistemic_statement denotes Therefore, we can conclude 0 ≤ (u, S) ≤ T max .
T141 44984-45135 Epistemic_statement denotes (A2): G = (V, E) is a directed graph, where each edge e ∈ E is associated with a weight w e which is geometrically distributed with a parameter p, i.e.
T142 45224-45320 Epistemic_statement denotes What is the probability of the event that the shortest path from s to t has length T at least k?
T143 45498-45668 Epistemic_statement denotes Also, we add a source node s and an edge (s, x i ), associated with weight w (s,x i ) which is geometrically distributed with parameter 1/2, for any i ∈ {i : (i, j) ∈ E}.
T144 45960-46215 Epistemic_statement denotes Now we can build a probability-preserving bijection between the valuations of X i , Y j and the subgraphs of the constructed random graph: for a valuation , the corresponding subgraph is the one where each edge adjacent to x i has length 1 iff (X i ) = 1.
T145 46638-47001 Epistemic_statement denotes Note that any edge of length greater than 1 is irrelevant as the structure of the graph, which ensures it can never be part of a path of length 3 from s to t, For any path from s to t must jump three times: from s to some x i , from x i to some y j , and from y j to t. Thus, from what we have stated, (A1) can be reducible to (A2), which implies (A2) is #P-hard.
T146 47002-47057 Epistemic_statement denotes We aim to solve the problem (denoted by (A)) of calcu-
T147 47058-47210 Epistemic_statement denotes In order to prove the random variables T(u, S) and c(u, S) are identically distributed, we need to prove P[T (u, S) = t] = P[c(u, S) = t] for any t > 0.
T148 47211-47274 Epistemic_statement denotes On the one hand, we can prove P[T (u, S) = t] ≤ P[c(u, S) = t].
T149 47275-47385 Epistemic_statement denotes Given t > 0, we find that for any simulation with T(u, S) = t, a snapshot with c(u, S) = t can be constructed.
T150 48125-48193 Epistemic_statement denotes On the other hand, we can also get P[c(u, S) = t] ≤ P[T (u, S) = t].
T151 48194-48278 Epistemic_statement denotes For any snapshot with c(u, S) = t, a simulation with T(u, S) = t can be constructed.
T152 48855-48944 Epistemic_statement denotes We now can construct a simulation of the diffusion model (I t ) t≥0 = (J t+1 \J t ) t≥0 .
T153 49403-49554 Epistemic_statement denotes To address this problem, we proposed an upper bound based greedy (UBG) algorithm and its two accelerative algorithms to cater for lager scale networks.
T154 49792-49840 Epistemic_statement denotes There are several interesting future directions.
T155 49841-50012 Epistemic_statement denotes First, the detection time D(S) is # P-hard to calculate exactly, it is still a question how to design an efficient algorithm to estimate D(S) with a theoretical guarantee.
T156 50013-50204 Epistemic_statement denotes Second, the infection probability ip and prior distribution are predefined in this paper, how to learn these parameters for the DSI model from available cascade data still remains unexplored.