4 Results

In this section, we report results from analyzing the two trajectories of the small synthetic protein BBA5 and the five trajectories of another small protein, GSGS; we focus primarily on BBA5. Previous sections described in detail the structures of BBA5 and GSGS and their folding trajectories. This information is summarized in Table 4 and Table 5.

Table 4 A summary of the BBA5 folding trajectories.

| Item | Description |
|---|---|
| Protein | PDB Identifier: BBA5; Primary sequence: 23 residues; Designed protein; Native fold: N-terminal (residues 1–10) β-hairpin, C-terminal (residues 11–23) α-helix |
| Trajectory | Two trajectories: T23 and T24; T23: 192 conformations; T24: 150 conformations |
| Contact map | Based on contacts between α-carbons. Two α-carbons are in contact if their Euclidean distance is ≤ 8.5 Å |
| Bit-patterns | A total of 352 unique maximally connected bit-patterns were identified from all conformations; the average number of bit-patterns per conformation is 6; bit-patterns are further classified into 10 approximately equivalent types |
| Interacting bit-patterns | Two bit-patterns interact if at least one pair of α-carbons, one from each bit-pattern, is within a Euclidean distance of ≤ 10 Å |
| Frequent SOAPs | A SOAP is frequent if it appears in ≥ 5 conformations; a total of 444 frequent SOAPs were identified in trajectory T23, and 258 in T24 |
| Consensus partial folding pathway | We identified a consensus partial folding pathway across the two trajectories. It is composed of 71 pairs of similar conformations, one from each trajectory |

Table 5 A summary of the GSGS folding trajectories.

| Item | Description |
|---|---|
| Protein | Name: GSGS or Beta3s; Primary sequence: 20 residues; Designed protein; Native fold: three-stranded anti-parallel β-sheet with turns at 6–7 and 14–15 |
| Trajectory | Five trajectories: T1, T2, T3, T4 and T5; T1: 25,664 conformations; T2: 30,075 conformations; T3: 19,649 conformations; T4: 25,263 conformations; T5: 25,664 conformations |
| Contact map | Based on contacts between α-carbons. Two α-carbons are in contact if their Euclidean distance is ≤ 8.5 Å |
| Bit-patterns | A total of 50,572 unique maximally connected bit-patterns were identified from all conformations; the average number of bit-patterns per conformation is 4; bit-patterns are further classified into 12 approximately equivalent types |
| Interacting bit-patterns | Two bit-patterns interact if at least one pair of α-carbons, one from each bit-pattern, is within a Euclidean distance of ≤ 10 Å |
| Frequent SOAPs | A SOAP is frequent if it appears in ≥ 10 conformations |

4.1 Detecting and Ordering Folding Events

We summarize both folding trajectories of BBA5 as sequences of SOAPs, as illustrated in Figure 9. Coincidentally, both summarized trajectories consist of 64 conformations. Based on these summarized trajectories, we can quickly identify the conformations in which the first α-helix-like or β-turn-like local motifs were formed. For trajectory T23, the first α-helix-like motif was identified in frame 26, and the first β-turn-like local motif was formed in frame 63. For the other trajectory, T24, the corresponding frames were 29 and 38. This is consistent with experimental results showing that α-helices generally fold more rapidly than β-turns. However, because we consider only frequent SOAPs, we may miss the actual first formation of such local motifs. Addressing this would require considering rarely occurring SOAPs as well, which we plan to investigate in the future.
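The lookup described above, and the caveat about frequent SOAPs, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this work: each frame is assumed to be already reduced to a set of motif labels, and the label strings, the helper names `frequent_labels` and `first_frame_with`, and the toy data are all hypothetical. Frames are indexed from 0 here.

```python
from collections import Counter


def frequent_labels(frames, min_support):
    """Return the labels that occur in at least `min_support` frames."""
    counts = Counter(label for labels in frames for label in set(labels))
    return {label for label, n in counts.items() if n >= min_support}


def first_frame_with(frames, label, min_support=1):
    """Return the index of the first frame containing `label`.

    Labels that do not reach `min_support` are treated as invisible,
    mimicking a summary built from frequent patterns only.
    Returns None when the label is absent or filtered out.
    """
    if label not in frequent_labels(frames, min_support):
        return None
    for index, labels in enumerate(frames):
        if label in labels:
            return index
    return None


# Hypothetical toy trajectory: one set of motif labels per frame.
toy_frames = [
    set(),
    {"alpha-like"},
    {"alpha-like"},
    {"alpha-like", "beta-turn-like"},
    {"alpha-like"},
]

helix_first = first_frame_with(toy_frames, "alpha-like", min_support=3)
turn_filtered = first_frame_with(toy_frames, "beta-turn-like", min_support=3)
turn_unfiltered = first_frame_with(toy_frames, "beta-turn-like", min_support=1)
print(helix_first, turn_filtered, turn_unfiltered)
```

In the toy data the turn-like label occurs in a single frame, so it is found only when the support threshold is lowered to 1. This is the situation in which a summary restricted to frequent patterns would miss the first formation of a motif.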
For the two events related to β-turn formation, namely the formation of the two extended strands and the formation of the turn, we found that in both trajectories the extended strands formed before the turn. We also identify two conformations in each trajectory that show native-like structure, by locating the conformations associated with the generalized SOAP (β.1 α.2). Figure 10 presents the 3D structures of these native-like conformations along with the native conformation of BBA5. One can see that our SOAP-based comparison does well in identifying similar 3D conformations.

Figure 10 The native-like conformations identified in the two BBA5 trajectories. According to the SOAP-based summarization of the two BBA5 folding trajectories, two native-like conformations are identified in each trajectory.

4.2 Consensus Partial Folding Pathway Across Trajectories

Based on the generalized trajectory summarization of BBA5, we identify a consensus partial folding pathway of length 71; that is, 71 pairs of conformations, one from each trajectory, are considered similar to each other. Figure 3 displays four such pairs along this consensus folding pathway. For instance, the two conformations shown in Figure 3(c), corresponding to the 182nd frame of the T23 trajectory and the 116th frame of the T24 trajectory of BBA5, respectively, are considered structurally similar, since both exhibit an α-helix in the left half of the backbone and a β-turn in the right half. Figure 11 illustrates five pairs of conformations along the consensus folding pathway of the 1st and 3rd trajectories of GSGS. Figure 12 illustrates five conformation pairs along the consensus pathway of the 1st and 5th trajectories of GSGS. We are currently in the process of identifying consensus pathways across more than two trajectories of GSGS. Note that by using bit-patterns, we naturally realize a rotation-invariant comparison.
To illustrate this, let us again examine the BBA5 conformation pair discussed above. Although the β-turn is oriented differently in the two conformations, our approach still identifies them as structurally similar.

Figure 11 Selected conformation-pairs along the consensus partial folding pathway across the 1st and 3rd trajectories of the GSGS peptide. The figure illustrates five pairs of conformations, one from each trajectory, along the consensus partial folding pathway identified in the 1st and 3rd trajectories.

Figure 12 Selected conformation-pairs along the consensus partial folding pathway across the 1st and 5th trajectories of the GSGS peptide. The figure illustrates five pairs of conformations, one from each trajectory, along the consensus partial folding pathway identified in the 1st and 5th trajectories.

Currently, we rely on visual tools to justify these consensus pathways. We attempted to use several measures that have previously been used to quantify the similarity between 3D protein conformations, namely RMSD, contact order, and native contacts, but to no avail. Two conformations are said to be a best match if they have the lowest RMSD or the smallest difference in contact order or native contacts. When we identified the pathway based on the best match under any of these measures, we often ended up with a very short consensus pathway (as short as 10 frames). Moreover, different measures yielded very different consensus pathways. Finally, we notice that the best-matched conformations under any of these measures can often exhibit very different structural characteristics. We are investigating alternative methods for quantitative validation of our results.
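To make the best-match criterion concrete, the following sketch selects a best match by the smallest difference in native-contact count. It is an illustrative assumption of how such a criterion might be coded, not the procedure used in this work: the function names, the toy coordinates, and the choice to count every residue pair (including sequence neighbors) are hypothetical. Only the 8.5 Å α-carbon cutoff is taken from the tables above.

```python
import math

CUTOFF = 8.5  # contact threshold in Å between α-carbons, as in Tables 4 and 5


def contact_set(coords, cutoff=CUTOFF):
    """Return the residue pairs (i, j), i < j, whose α-carbons are in contact."""
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def native_contact_count(coords, native_contacts):
    """Count the contacts of a conformation that also occur in the native fold."""
    return len(contact_set(coords) & native_contacts)


def best_match(query, candidates, native_contacts):
    """Index of the candidate whose native-contact count is closest to the query's."""
    target = native_contact_count(query, native_contacts)
    return min(
        range(len(candidates)),
        key=lambda k: abs(native_contact_count(candidates[k], native_contacts) - target),
    )


# Hypothetical 4-residue toy system (coordinates in Å).
native = [(0, 0, 0), (5, 0, 0), (5, 5, 0), (0, 5, 0)]        # compact: all pairs in contact
query = [(0, 0, 0), (5, 0, 0), (5, 5, 0), (5, 20, 0)]        # first three residues packed
candidate_b = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (10, 20, 0)]  # partly extended
candidate_a = [(0, 0, 0), (0, 15, 0), (5, 15, 0), (5, 20, 0)]  # last three residues packed

native_contacts = contact_set(native)
candidates = [candidate_b, candidate_a]

chosen = best_match(query, candidates, native_contacts)
shared_with_chosen = len(contact_set(query) & contact_set(candidates[chosen]))
shared_with_other = len(contact_set(query) & contact_set(candidates[1 - chosen]))
print(chosen, shared_with_chosen, shared_with_other)
```

In this toy case the selected candidate has the same number of native contacts as the query but shares fewer actual contacts with it than the rejected candidate does. A scalar count can therefore pair conformations whose contact patterns differ, which is one plausible reading of why best matches under such measures may look structurally dissimilar.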