Functions for visualization
After obtaining PPI data, either downloaded from our website or privately generated, users can look up all possible interactions for a given protein using the networkView function. Given a gene name or the Swiss-Prot accession number of a protein, the function displays the proteins that can interact with it (Figure 1A). The input protein is represented by a larger blue node, and the interacting proteins by smaller green nodes. Clicking a node opens a link to the UniProt database, where more information can be retrieved. Protein interactions are also listed in a table (Figure 1B) that names the databases supporting each interaction. PubMed IDs of the corresponding publications and/or STRING scores are also displayed, giving users direct links to verify the sources. When a protein has more than 100 interactions, the viewer displays a random selection of 100 of them (Figure 1A), while the table lists all interactions and their details (Figure 1B).
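The lookup and display cap described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the networkView implementation: the function names, the edge-list representation, and the placeholder partner names (PARTNER_0, PARTNER_1, ...) are hypothetical.

```python
import random

MAX_DISPLAYED = 100  # display cap described in the text


def interaction_partners(edges, protein):
    """Return every protein that shares an edge with `protein` (the table view)."""
    partners = set()
    for a, b in edges:
        if a == protein and b != protein:
            partners.add(b)
        elif b == protein and a != protein:
            partners.add(a)
    return sorted(partners)


def partners_to_display(partners, limit=MAX_DISPLAYED, seed=None):
    """Pick at most `limit` partners at random for the graph view."""
    if len(partners) <= limit:
        return list(partners)
    rng = random.Random(seed)
    return sorted(rng.sample(partners, limit))


# Toy edge list with placeholder names; this is not real PPI data.
edges = [("TP53", f"PARTNER_{i}") for i in range(150)]

table_rows = interaction_partners(edges, "TP53")   # all 150 partners
graph_nodes = partners_to_display(table_rows)      # random subset of 100
```

The table keeps the complete partner list, so nothing is lost when the graph view is subsampled.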
Figure 1  Screenshots of the networkView function outputs. (A) Visualization of the protein TP53 and interacting proteins. (B) Evidence supporting the specific interactions among these proteins.
The networkView function can also be used to visualize the PPI network among a given list of proteins, together with the evidence for each interaction. A sample output is shown in Figure 2A, where the selected proteins are TP53, TP53BP2, MAGI1, and PTEN. The first three proteins are designated as main nodes and PTEN is designated as a leaf node. Because an interaction between two proteins is often mediated by scaffolding proteins rather than being direct, the viewer also displays proteins that can interact with at least two of the main proteins, shown as the green leaf nodes in Figure 2A. The parameter mainNode of this function specifies whether a selected protein is treated as a main node or a leaf node. In this example, because PTEN is manually designated as a leaf node, unlike the other three input proteins, the only interactions presented for PTEN are those with the main nodes. Users are also free to choose the visualization style (color and node size) with which the proteins are displayed in the network. Thus, views are generated according to user preference. A protein often has multiple gene names, some of which may not be included in the input PPI data file. To avoid entering invalid protein names, the unique Swiss-Prot accession number may be used as input instead. Swiss-Prot accession numbers can be found in the UniProt database.
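The selection rule for main nodes, input leaf nodes, and additional proteins that bridge at least two main nodes can be illustrated as follows. This is a minimal sketch, assuming an undirected edge list; the function names and the placeholder protein names are hypothetical and do not reflect the package's internal code.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets, ignoring self-interactions."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def network_view_nodes(edges, main_nodes, leaf_inputs=()):
    """Return (bridging proteins, edges to draw) for a list of input proteins."""
    adj = build_adjacency(edges)
    main = set(main_nodes)
    leaves = set(leaf_inputs) - main
    # Proteins outside the input that interact with at least two main nodes.
    bridging = {
        p for p, nbrs in adj.items()
        if p not in main and p not in leaves and len(nbrs & main) >= 2
    }
    visible = main | leaves | bridging
    # Draw an edge only if both ends are visible and at least one end is a main node,
    # so a leaf node shows only its interactions with main nodes.
    shown = {
        tuple(sorted((a, b)))
        for a, b in edges
        if a != b and a in visible and b in visible and (a in main or b in main)
    }
    return sorted(bridging), sorted(shown)


# Toy data with placeholder names; not real PPI data.
edges = [
    ("MAIN_A", "MAIN_B"), ("MAIN_A", "X1"), ("MAIN_B", "X1"),
    ("MAIN_C", "X2"), ("LEAF_IN", "MAIN_C"), ("LEAF_IN", "X2"),
]
bridging, shown = network_view_nodes(
    edges, ["MAIN_A", "MAIN_B", "MAIN_C"], leaf_inputs=["LEAF_IN"]
)
```

Here X1 is displayed because it touches two main nodes, whereas X2 touches only one and is omitted together with its edge to the input leaf.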
Figure 2  Screenshots of the cisPath function outputs and network graph editor. (A) PPI network visualization of the proteins TP53, TP53BP2, MAGI1, and PTEN. (B) Shortest interaction paths between proteins TP53 and STRAP. (C) Network graph editor.
In some cases, users may want to identify interaction paths with more than two interacting steps between a pair of given proteins in a PPI network. The function cisPath serves this purpose: it identifies and outputs the shortest PPI paths between a pair of given proteins connected through multiple interaction steps. Users can obtain the shortest path(s) either by directly requesting the path(s) of minimal cost using the default "cost" values of edges, or by manually assigning "costs" to specific edges in the PPI network by editing the input file. The "cost" of an edge between two interacting proteins is a numerical value greater than or equal to 1 that quantifies the extent to which an interaction is unfavorable. The default "cost" of each edge generated from the PINA and iRefIndex databases is 1, and the "cost" of an edge generated from the STRING database is $\max(1, \log_{100}(1000 - \mathrm{STRING\_SCORE}))$, where STRING_SCORE is the confidence score given by the STRING database. Under this formula, a STRING score of 900 corresponds to a cost of 1, and lower-confidence interactions receive larger costs. An example of this function is shown in Figure 2B. Evidence, in the form of STRING scores or PubMed IDs of the relevant publications, is shown for all interaction paths. As with the networkView function, other proteins that can interact with at least two of the proteins on the shortest PPI path are also displayed, giving a fuller range of possibilities even though the paths through them may be suboptimal. All of the shortest paths are listed in a table under the network view and can be shown graphically when selected (Figure 2B). To identify the paths with the fewest steps regardless of the associated "costs", the parameter byStep may be set to TRUE. In this case, all edge "costs" are assigned a value of 1 and the PPI paths with the minimum number of steps between the pair of given proteins are produced.
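The cost definition and the search for all minimal-cost paths can be illustrated with the following Python sketch. It uses Dijkstra's algorithm with predecessor lists, which is one standard way to enumerate all shortest paths; the source does not state which algorithm cisPath uses. The function names, the by_step argument (mirroring the byStep parameter), the fallback for scores of 1000 or more, and the toy edges are assumptions made for illustration.

```python
import heapq
import math

EPS = 1e-12  # tolerance when comparing floating-point path costs


def string_edge_cost(string_score):
    """Default cost of a STRING edge: max(1, log_100(1000 - STRING_SCORE))."""
    remaining = 1000 - string_score
    if remaining <= 0:
        return 1.0  # assumption: fall back to the minimum cost where the log is undefined
    return max(1.0, math.log(remaining, 100))


def all_shortest_paths(edges, source, target, by_step=False):
    """edges: iterable of (a, b, cost). Returns (total cost, list of all shortest paths)."""
    adj = {}
    for a, b, cost in edges:
        w = 1.0 if by_step else cost  # by_step ignores costs and counts steps
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    dist = {source: 0.0}
    preds = {source: []}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v] - EPS:
                dist[v] = nd
                preds[v] = [u]
                heapq.heappush(heap, (nd, v))
            elif abs(nd - dist[v]) <= EPS and u not in preds[v]:
                preds[v].append(u)  # another equally short route into v
    if target not in dist:
        return math.inf, []

    def walk(node):
        # Follow predecessor lists back to the source to expand every path.
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in walk(p)]

    return dist[target], sorted(walk(target))


# Toy network with placeholder names; the A-C edge carries a STRING-style cost.
edges = [
    ("A", "B", 1.0), ("B", "D", 1.0),
    ("A", "C", string_edge_cost(400)), ("C", "D", 1.0),
]
cost, paths = all_shortest_paths(edges, "A", "D")
steps, step_paths = all_shortest_paths(edges, "A", "D", by_step=True)
```

With the default costs only the route through B is minimal, while with by_step=True both two-step routes are returned, which mirrors the distinction drawn in the text.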
Research groups that focus on specific proteins may need to screen the shortest interaction paths from a single fixed protein to all other proteins in the input database. In this case, only the source protein name should be provided to the cisPath function. All proteins in the input database are scanned, and all of the shortest PPI paths from the fixed protein to each reachable protein are output. Upon finding a new protein of interest, users can query its shortest interaction paths to the fixed protein in a browser without launching R. Although this mode requires more CPU time to compute and more space to store the results, the results can easily be placed on a cloud drive or web server for quick access over the Internet. Sample results for the fixed source proteins TP53 and PTEN can be found on our website.
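The idea of computing results once for a fixed source protein and querying them later can be sketched as below. For brevity this example uses a breadth-first search with unit edge costs (equivalent to the byStep setting) and stores predecessor lists as JSON; the storage format, function names, and placeholder protein names are assumptions and are not the format written by cisPath.

```python
import json
from collections import deque


def precompute_from_source(edges, source):
    """Breadth-first search from `source`, keeping every shortest-path predecessor."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    steps = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj.get(u, ())):
            if v not in steps:
                steps[v] = steps[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif steps[v] == steps[u] + 1:
                preds[v].append(u)  # another equally short route into v
    return {"source": source, "steps": steps, "preds": preds}


def paths_to(result, target):
    """Rebuild all shortest paths to `target` from a stored result, without a new search."""
    preds = result["preds"]
    if target not in preds:
        return []  # target is not reachable from the source

    def walk(node):
        if not preds[node]:  # only the source has no predecessors
            return [[node]]
        return [path + [node] for p in preds[node] for path in walk(p)]

    return sorted(walk(target))


# Toy network with placeholder names; not real PPI data.
edges = [("SRC", "P1"), ("SRC", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]

stored = json.dumps(precompute_from_source(edges, "SRC"))  # e.g. saved to a file
reloaded = json.loads(stored)                              # e.g. loaded by a viewer
paths_p4 = paths_to(reloaded, "P4")
```

Storing predecessor lists rather than every expanded path keeps the saved result compact, and a later query only needs to walk the stored lists.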
The functions networkView and cisPath described above allow users to set the color and size of the nodes in the network view before running. An additional editor allows easy modification of network graphs after running; Figure 2C shows a screenshot of this tool. The editor is accessed via an "Edit graph" button on the output webpage. It allows users to change the output graph as well as to draw new network graphs, directed or undirected, using different edge and arrow styles. The editor is compatible with a range of browsers. Because most commonly used browsers support HTML5 Web Storage, users can store the network graph view and reopen it later in the same browser. The editor can also convert the graph view into a string of text. Because this text can be converted back into an editable graph view, output graphs can easily be shared via email or online messenger. The editor can be used independently and is included in the source package. It is also available on our website for online access or for download and offline use.