PMC:4307189 / 15294-21521
Annotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/4307189","sourcedb":"PMC","sourceid":"4307189","source_url":"https://www.ncbi.nlm.nih.gov/pmc/4307189","text":"Bayesian statistical modeling of GGR network\n\nNetwork model\nFor statistical modeling of networks, exponential families of distributions offer robust and flexible parametric models [24]. These probabilistic models can be used to evaluate the probability that an edge is present in the network. They can also be used to quantify topological properties of networks by summarizing them in a parametric form and associating sufficient statistics with those parameters [19,24]. In this study, we use a special class of exponential family distributions known as ERGM (Exponential Random Graph Models), also known as the p1-model, which was introduced by Holland and Leinhardt [24].\nA gene-gene relationship network with g genes can be regarded as a random variable X taking values from a set G containing all 2g(g−1) possible relationship networks [24,25]. Let u be a generic point of G which can alternatively be denoted as the realization of X by X = u. Let the binary outcome uij = 1 if genei interacts with genej, or uij = 0 otherwise. Then u is a binary data matrix [19]. Let Pr(u) be the probability function on G given by (1) Pr(u)=Pr(X=u)=1κθexp∑pθpzpu\nwhere zp(u) is the network statistic of type p, θp is the parameter associated with zp(u) and κ(θ) is the normalizing constant that ensures Pr(u) is a proper probability distribution (sums to 1 over all u in G) [26]. The parameter θ is a vector of model parameters associated with network statistics and needs to be estimated. See [24] for further details.\nA major limitation of the p1-model is the difficulty of calculating the normalizing constant, κ(θ), since it is a sum over the entire graph space. 
Estimating the maximum likelihood of this model becomes intractable as there are 2g(g−1) possible directed graphs (or 2g(g−1)2 undirected graphs), each having g nodes (genes). A technique called maximum pseudolikelihood estimation has been developed to address this problem [27]. This technique employs MCMC methods such as Gibbs or Metropolis-Hastings sampling algorithms [28].\nThe construction of the p1-model for a directed network is described in an Appendix Additional file 1: Appendix I. For the gene-gene relationship network with undirected edges, the description of the p1-model can be simplified by using only two Bernoulli variables Yij0 and Yij1 instead of four as follows: Yijk=1ifuij=k,0otherwise\nThe simplified p1-model can then be defined using the following two equations to predict the probability of an edge being present between genei and genej: (2) logPrYij1=1=λij+θ+αi+αj\n(3) log Pr Y ij 0 = 1 = λ ij\nfor i\u003cj. Note that λij is chosen to ensure Pr(Yij0=1)+Pr(Yij1=1)=1. In this formulation, the expansiveness and attractiveness parameters were reduced to a single parameter, α, which represents the propensity of a gene to be connected in an undirected network. Hence, the p1-model seeks to find the probabilities of edge formation in a network considering its structural features explicitly.\n\nBayesian modeling\nWe used a fully Bayesian approach for modeling our gene-gene relationship network. Parameter estimation is a crucial step in statistical modeling, for which a classical approach is maximum likelihood estimation (MLE). However, unlike MLE, Bayesian techniques involve calculation of posterior probabilities of model parameters by training the model with given data. We assume that the data follows the generative model , and assign a prior probability Pθ|ℳ to the parameter vector θ under the model . Then Bayes’ rule for calculating posterior probability is as follows: (4) Prθ|ℳ,D=PrD|θ,ℳ×Prθ|ℳZ\nwhere PrD|θ,ℳ is the likelihood function. 
Now, the marginal likelihood can be expressed as (5) Z=PrD|ℳ=∫PrD|ℳ,θ×Pθ|ℳdθ,\nComputing the exact solution for the marginal likelihood is often intractable since it is prone to the curse of dimensionality. Fortunately, Markov Chain Monte Carlo (MCMC) methods such as Gibbs sampling and Metropolis-Hastings methods do not require to be explicitly computed. In general, MCMC methods are stochastic simulation techniques which generate samples from the joint distribution Pℳ,θ|D for calculating the posterior probabilities of parameters. Here we used Gibbs sampling methods, which sample iteratively, one parameter at a time, from the full conditional distribution given the current and previous values of all other parameters. To implement Gibbs sampling, we employed WinBUGS [29], which is a high-level software package providing an easy interface for implementing complex Bayesian models. In WinBUGS, users are free from background lower-level programming details, and only have to express the model precisely.\nWe hypothesized that gene-pairs involved in drug resistance are likely to be found with high probabilities in the resistant network but low probabilities in the parental network. Therefore, we built two networks, one from resistant datasets and the other from parental datasets. In this Bayesian approach, the model likelihood is defined in Equations (2) and (3), where Yk is the data matrix calculated from the observed data u. Here we have two Yk data matrices, namely a gene-gene relationship network YkR derived from resistant samples and YkP derived from parental samples.\nOur approach is a hierarchical Bayesian model in that model parameters are in turn dependent on hyperparameters. We assign the density parameter θ in Equation (2) a normal prior distribution with mean 0 and standard deviation σθ. (6) θ∼N0,σθ2\nNote, in WinBUGS the parameter τ, called the precision, replaces the standard deviation parameter σ of the normal distribution, where, τ=σ−2. 
For the hyperparameter τθ we specify a gamma prior distribution as follows, since it is a conjugate prior for the normal distribution: (7) τθ∼Gammaa0,b0\nWe set a0 = 0.001 and b0 = 0.001 to make the prior for θnoninformative, making its standard deviation wide to express large uncertainty [19]. For attractiveness/ expansiveness parameters αi and αj, we followed the approach used by Adams et al. [30]. (8) αiRαiP∼N00,Σ\n(9) Σ − 1 ∼ Wishart 1 0 0 1 , 2\nHere, αiR and αiP represent the expansiveness/attractiveness parameters for the network model of resistant and parental conditions, respectively.","divisions":[{"label":"title","span":{"begin":0,"end":44}},{"label":"sec","span":{"begin":46,"end":2980}},{"label":"title","span":{"begin":46,"end":59}},{"label":"p","span":{"begin":60,"end":674}},{"label":"p","span":{"begin":675,"end":1153}},{"label":"label","span":{"begin":1122,"end":1125}},{"label":"p","span":{"begin":1154,"end":1510}},{"label":"p","span":{"begin":1511,"end":2036}},{"label":"p","span":{"begin":2037,"end":2368}},{"label":"p","span":{"begin":2369,"end":2551}},{"label":"label","span":{"begin":2524,"end":2527}},{"label":"p","span":{"begin":2552,"end":2589}},{"label":"label","span":{"begin":2552,"end":2555}},{"label":"p","span":{"begin":2590,"end":2980}},{"label":"title","span":{"begin":2982,"end":2999}},{"label":"p","span":{"begin":3000,"end":3597}},{"label":"label","span":{"begin":3571,"end":3574}},{"label":"p","span":{"begin":3598,"end":3718}},{"label":"label","span":{"begin":3690,"end":3693}},{"label":"p","span":{"begin":3719,"end":4653}},{"label":"p","span":{"begin":4654,"end":5231}},{"label":"p","span":{"begin":5232,"end":5474}},{"label":"label","span":{"begin":5462,"end":5465}},{"label":"p","span":{"begin":5475,"end":5769}},{"label":"label","span":{"begin":5752,"end":5755}},{"label":"p","span":{"begin":5770,"end":6036}},{"label":"label","span":{"begin":6020,"end":6023}},{"label":"p","span":{"begin":6037,"end":6081}},{"label":"label","span":{"begin":6037,
"end":6040}}],"tracks":[{"project":"2_test","denotations":[{"id":"25599599-20100321-14868122","span":{"begin":464,"end":466},"obj":"20100321"},{"id":"25599599-20100321-14868123","span":{"begin":1065,"end":1067},"obj":"20100321"},{"id":"25599599-20100321-14868124","span":{"begin":5907,"end":5909},"obj":"20100321"}],"attributes":[{"subj":"25599599-20100321-14868122","pred":"source","obj":"2_test"},{"subj":"25599599-20100321-14868123","pred":"source","obj":"2_test"},{"subj":"25599599-20100321-14868124","pred":"source","obj":"2_test"}]}],"config":{"attribute types":[{"pred":"source","value type":"selection","values":[{"id":"2_test","color":"#ecec93","default":true}]}]}}