Signal transduction model
We propose to model the dynamic relationships between proteins in a PPI network using a signal transduction network model. Specifically, the signal transduction behavior of the network is modeled using the Erlang distribution, a special case of the Gamma distribution. The Erlang distribution function is:
$$F(c) = 1 - e^{-x/b} \sum_{k=0}^{c-1} \frac{(x/b)^{k}}{k!} \tag{1}$$
where c > 0 is the shape parameter (a positive integer for the Erlang distribution), b > 0 is the scale parameter, and x ≥ 0 is the independent variable, usually time. The Erlang distribution has several characteristics that make it appropriate for describing protein-protein interaction networks, including its positive range and its reproductive property [18]. We use the Erlang distribution with x/b = 1 and set c to the number of edges between the source protein node and the target protein node. Setting x/b to unity assesses the perturbation at the target protein when the perturbation reaches 1/e of its initial value at the nearest neighbor of the source protein node.
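As an illustration of Equation 1 at x/b = 1, the following minimal Python sketch evaluates F(c) for small path lengths. The function name erlang_cdf and the parameter name x_over_b are our own labels for this example and do not come from the original implementation.

```python
import math

def erlang_cdf(c, x_over_b=1.0):
    """Evaluate F(c) from Equation 1 at a given ratio x/b.

    c is the shape parameter (here, the number of edges between source
    and target), so it must be a positive integer.
    """
    if c < 1 or int(c) != c:
        raise ValueError("c must be a positive integer")
    # Partial sum of (x/b)^k / k! for k = 0 .. c-1
    partial = sum(x_over_b ** k / math.factorial(k) for k in range(int(c)))
    return 1.0 - math.exp(-x_over_b) * partial

# F(c) shrinks quickly as the number of edges grows
for c in range(1, 5):
    print(c, erlang_cdf(c))
```

With x/b = 1, F(c) falls from about 0.632 at c = 1 to about 0.019 at c = 4, so the transduced signal attenuates rapidly with distance from the source node.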
Erlang distribution models have been used in pharmacodynamics to model signal transduction and transfer delays in a variety of systems, including the production of drug-induced mRNA and protein dynamics [19] and calcium ion-mediated signaling in neutrophils [20]. Our use of the Erlang distribution was motivated by several key physicochemical considerations. In formulating this framework, we noted that sequential cascades of protein-protein interactions are frequently observed in biological signal transduction processes.
In queuing theory, the distribution of time to complete a sequence of tasks in a system with Poisson input is described by the Erlang distribution. Because biological signal transduction can be modeled as a sequence of protein-protein interactions, we sought to apply these queuing results to PPI network modeling. The Erlang distribution also arises naturally in pharmacodynamics, where it has been used to describe the dynamics of signal transduction in systems involving a series of protein compartments. For example, in response to a unit impulse at time t = 0, the signal transduced by the compartmental model in Figure 1 follows an Erlang distribution. The Erlang distribution is a special case of the Gamma distribution, and the latter has been shown to describe population abundances fluctuating around equilibrium [21]; this finding is relevant because perturbations to PPI networks will likewise alter the levels of bound and unbound protein complexes. Thus, we identified the Erlang distribution as a parsimonious model for describing the dynamics of interactions in PPI networks.
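The link between the compartmental chain and Equation 1 can be sketched as follows, assuming c identical first-order compartments in series, each with time constant b. The Laplace-domain symbols H(s) and h(t) are our notation for this sketch and are not taken from the cited pharmacodynamic models.

```latex
H(s) = \left(\frac{1}{1 + b s}\right)^{c}
\quad\Longrightarrow\quad
h(t) = \frac{1}{b\,(c-1)!}\left(\frac{t}{b}\right)^{c-1} e^{-t/b}, \qquad t \ge 0,

\int_{0}^{x} h(t)\,dt = 1 - e^{-x/b} \sum_{k=0}^{c-1} \frac{(x/b)^{k}}{k!} = F(c).
```

Under these assumptions the impulse response h(t) is the Erlang density, and its cumulative integral up to x recovers the distribution function in Equation 1.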
Figure 1  The pharmacodynamic signal transduction model. The pharmacodynamic signal transduction model whose impulse response is an Erlang distribution. Here b is the time constant for signal transfer and c is the number of compartments.

The Erlang distribution needs to be further modified to reflect network topology. The perturbation induced by the source protein node should be proportional to its degree and should follow the shortest path to the target protein node. During transduction to the target protein node, the perturbation should dissipate at each intermediate node across its incident edges. The signal transduced from node v to node w (v ≠ w) is thus:
$$S(v \to w) = \frac{d(v)}{\prod_{i \in P(v,w)} d(i)}\, F(c) \tag{2}$$
where d(i) is the degree of node i, P(v, w) is the set of all nodes visited on the shortest path from node v to node w, excluding the source node v and the target node w, and F(c) is the signal transduction behavior function. When v = w and distance(v, w) = 0, we define S(v → w) = d(v). The numerator of the first term on the right-hand side of Equation 2 is the degree of the source node v, and the denominator represents the dissipation at each node visited on the shortest path from source node v to target node w. Our choice of the shortest path is motivated by the finding that the majority of flux prefers the path of least resistance in many physicochemical and biological systems.

There can be more than one shortest path between a node pair in a network. When several shortest paths tie, STM chooses the least resistant path, that is, the one with the lowest resistance ∏i ∈ P(v, w) d(i) in Equation 2. There can also be more than one least resistant path among the tying shortest paths. Choosing any one of them makes no difference to the measured signal, because the quantity computed by Equation 2 depends only on the resistance and not on any other topological property of the intermediate nodes on the path.

Thus, the first term on the right-hand side of Equation 2 represents the topological effect of source node v on target node w, and the second term represents the biological effect of source node v on target node w from the signal transduction viewpoint. Therefore, the source nodes that score the highest values on target node w are the most influential nodes on node w, both biologically and topologically.
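The computation in Equation 2, including the tie-breaking rule among shortest paths, can be sketched in Python as below. This is an illustrative sketch rather than the STM implementation: the function names, the adjacency-dictionary representation, the toy graph, and the choice to return 0 for unreachable targets are all our assumptions.

```python
import math
from collections import deque

def erlang_cdf(c, x_over_b=1.0):
    """F(c) from Equation 1, evaluated at x/b (1 by default)."""
    partial = sum(x_over_b ** k / math.factorial(k) for k in range(c))
    return 1.0 - math.exp(-x_over_b) * partial

def signal(adj, v, w):
    """S(v -> w) from Equation 2 on an undirected graph {node: set(neighbors)}."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    if v == w:
        return float(degree[v])  # defined as d(v) when distance is 0
    # Breadth-first search from v. For each node reached, keep its hop
    # distance and the lowest resistance (product of intermediate-node
    # degrees) over all shortest paths from v.
    dist = {v: 0}
    resistance = {v: 1}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        # u counts as an intermediate node for paths passing through it,
        # unless it is the source itself.
        through_u = resistance[u] * (degree[u] if u != v else 1)
        for n in adj[u]:
            if n not in dist:
                dist[n] = dist[u] + 1
                resistance[n] = through_u
                queue.append(n)
            elif dist[n] == dist[u] + 1:
                # Another shortest path to n: keep the least resistant one.
                resistance[n] = min(resistance[n], through_u)
    if w not in dist:
        return 0.0  # assumption: no signal reaches a disconnected target
    return degree[v] / resistance[w] * erlang_cdf(dist[w])

# Hypothetical toy graph: two shortest paths s-x-t and s-y-t, with d(x) = 2, d(y) = 3
toy = {"s": {"x", "y"}, "x": {"s", "t"}, "y": {"s", "t", "z"},
       "t": {"x", "y"}, "z": {"y"}}
print(signal(toy, "s", "t"))  # uses the path through x, the lower-degree node
```

Because degrees are positive, the minimum resistance to a node can be accumulated layer by layer during the breadth-first search, which avoids enumerating every shortest path explicitly.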
Figure 2 demonstrates the signal transduction behavior of a small example network according to Equation 2. For ease of understanding, only the signals from nodes A, F, G, and H are presented, although signals are propagated from every node in the network. Each box in Figure 2 contains the signals assessed by Equation 2 from nodes A, F, G, and H on the corresponding target node; for example, 5.0, 0.5057, 0.0396, and 0.0054 are the signals assessed from nodes A, F, G, and H, respectively, on node E. These values illustrate the overall effect of combining the network topology with the signal transduction model for source nodes A, F, G, and H on node E. Consequently, node A, which scores the highest value, is the most influential node on node E, both biologically and topologically.
Figure 2  A simple network example. Each box contains the numerical values obtained from Equation 2 from nodes A, F, G, and H to other target nodes. Results for other nodes are not shown.