PMC:1687185 / 22464-28753
Annnotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/1687185","sourcedb":"PMC","sourceid":"1687185","source_url":"https://www.ncbi.nlm.nih.gov/pmc/1687185","text":"4 Novel Framework\nOur framework consists of the following three phases:\n• Global phase in which the promising solutions in the entire search space are obtained.\n• Refinement phase where a local method is applied to the solutions obtained in the previous phase in order to refine the profiles.\n• Exit phase where the exit points are computed and the Tier-1 and Tier-2 solutions are explored systematically.\nIn the global phase, a branch and bound search is performed on the entire dataset. All of the profiles that do not meet a certain threshold (in terms of a given scoring function) are eliminated in this phase. The promising patterns obtained are transformed into profiles and local improvements are made to these profiles in the refinement phase. The consensus pattern is obtained from each nucleotide that corresponds to the largest value in each column of the PSSM. The 3l variables chosen are the nucleotides that correspond to those that are not present in the consensus pattern. Because of the probability constraints discussed in the previous section, the largest weight can be represented in terms of the other three variables.\nTo solve (4), current algorithms begin at random initial alignment positions and attempt to converge to an alignment of l - mers in all of the sequences that maximize the objective function. In other words, the l - mer whose log(A)i is the highest (with a given PSSM) is noted in every sequence as part of the current alignment. During the maximization of A(Q) function, the probability weight matrix and hence the corresponding alignments of l - mers are updated. This occurs iteratively until the PSSM converges to the local optimal solution. The consensus pattern is obtained from the nucleotide with the largest weight in each position (column) of the PSSM. This converged PSSM and the set of alignments correspond to a local optimal solution. The exit phase where the neighborhood of the original solution is explored in a systematic manner is shown below:\nInput: Local Maximum (A).\nOutput: Best Local Maximum in the neighborhood region.\nAlgorithm:\nStep 1: Construct the PSSM for the alignments corresponding to the local maximum (A) using Eqs.(1) and (2).\nStep 2: Calculate the eigenvectors of the Hessian matrix for this PSSM.\nStep 3: Find exit points (e1i) on the practical stability boundary along each eigenvector direction.\nStep 4: For each of the exit points, the corresponding Tier-1 local maxima (a1i) are obtained by applying the EM algorithm after the ascent step.\nStep 5: Repeat this process for promising Tier-1 solutions to obtain Tier-2 (a2j) local maxima.\nStep 6: Return the solution that gives the maximum information content score of {A, a1i, a2j}.\nFig. 2 illustrates the exit point method. To escape out of this local optimal solution, our approach requires the computation of a Hessian matrix (i.e. the matrix of second derivatives) of dimension (3l)2 and the 3l eigenvectors of the Hessian. The main reasons for choosing the eigenvectors of the Hessian as search directions are:\nFigure 2 Diagram illustrates the exit point method of escaping from the original solution (A) to the neighborhood local optimal solutions (a1i) through the corresponding exit points (e1i). The dotted lines indicate the local convergence of the EM algorithm. • Computing the eigenvectors of the Hessian is related to finding the directions with extreme values of the second derivatives, i.e., directions of extreme normal-to-isosurface change.\n• The eigenvectors of the Hessian will form the basis vectors for the search directions. Any other search direction can be obtained by a linear combination of these directions.\n• This will make our algorithm deterministic since the eigenvector directions are always unique.\nThe value of the objective function is evaluated along these eigenvector directions with some small step size increments. Since the starting position is a local optimal solution, one will see a steady decline in the function value during the initial steps; we call this the descent stage. Since the Hessian is obtained only once during the entire procedure, it is more efficient compared to Newton's method where an approximate Hessian is obtained for every iteration. After a certain number of evaluations, there may be an increase in the value indicating that the current point is out of the stability region of the local maximum. Once the exit point has been reached, few more evaluations are made in the direction of the same eigenvector to ensure that one has left the original stability region. This procedure is clearly shown in Fig 3. Applying the local method directly from the exit point may give the original local maximum. The ascent stage is used to ensure that the new guess is in a different convergence zone. Hence, given the best local maximum obtained using any current local methods, this framework allows us to systematically escape out of the local maximum to explore surrounding local maxima. The complete algorithm is shown below :\nFigure 3 A summary of escaping out of the local optimum to the neighborhood local optimum. Observe the corresponding trend of A(Q) at each step. Input: The DNA sequences, length of the motif (1), Maximum Number of Mutations (d)\nOutput: Motif (s)\nAlgorithm:\nStep 1: Given the sequences, apply Random Projection algorithm to obtain different set of alignments.\nStep 2: Choose the promising buckets and apply EM algorithm to refine these alignments.\nStep 3: Apply the exit point method to obtain nearby promising local optimal solutions.\nStep 4: Report the consensus pattern that corresponds to the best alignments and their corresponding PSSM.\nThe new framework can be treated as a hybrid approach between global and local methods. It differs from traditional local methods by computing multiple local solutions in the neighborhood region in a systematic manner. It differs from global methods by working completely in the profile space and searching a subspace efficiently in a deterministic manner. For a given non-convex function, there is a massive number of convergence regions that are very close to each other and are separated from one another in the form of different basins of attraction. These basins are effectively modeled by the concept of stability regions.","divisions":[{"label":"title","span":{"begin":0,"end":17}},{"label":"p","span":{"begin":18,"end":71}},{"label":"p","span":{"begin":72,"end":160}},{"label":"p","span":{"begin":161,"end":292}},{"label":"p","span":{"begin":293,"end":405}},{"label":"p","span":{"begin":406,"end":1139}},{"label":"p","span":{"begin":1140,"end":2001}},{"label":"p","span":{"begin":2002,"end":2027}},{"label":"p","span":{"begin":2028,"end":2082}},{"label":"p","span":{"begin":2083,"end":2093}},{"label":"p","span":{"begin":2094,"end":2201}},{"label":"p","span":{"begin":2202,"end":2273}},{"label":"p","span":{"begin":2274,"end":2374}},{"label":"p","span":{"begin":2375,"end":2520}},{"label":"p","span":{"begin":2521,"end":2616}},{"label":"p","span":{"begin":2617,"end":2711}},{"label":"p","span":{"begin":2712,"end":3044}},{"label":"figure","span":{"begin":3045,"end":3303}},{"label":"label","span":{"begin":3045,"end":3053}},{"label":"caption","span":{"begin":3055,"end":3303}},{"label":"p","span":{"begin":3055,"end":3303}},{"label":"p","span":{"begin":3304,"end":3488}},{"label":"p","span":{"begin":3489,"end":3665}},{"label":"p","span":{"begin":3666,"end":3762}},{"label":"p","span":{"begin":3763,"end":5017}},{"label":"figure","span":{"begin":5018,"end":5163}},{"label":"label","span":{"begin":5018,"end":5026}},{"label":"caption","span":{"begin":5028,"end":5163}},{"label":"p","span":{"begin":5028,"end":5163}},{"label":"p","span":{"begin":5164,"end":5246}},{"label":"p","span":{"begin":5247,"end":5264}},{"label":"p","span":{"begin":5265,"end":5275}},{"label":"p","span":{"begin":5276,"end":5377}},{"label":"p","span":{"begin":5378,"end":5465}},{"label":"p","span":{"begin":5466,"end":5553}},{"label":"p","span":{"begin":5554,"end":5660}}],"tracks":[]}