Experimental results

We ran experiments to test the speed and the sensitivity of the software.

Speed testing

The time complexity of the algorithm is O(n^2). To test the speed in practice, we used randomly generated DNA and protein sequences. We ran our software on a PC with a 3.4 GHz Pentium 4 CPU and 1 GB of memory; the results are shown in Table 3. Even for long DNA and protein sequences, the software returns its result within seconds. For example, for a sequence of length 10000 it takes about 10.8 seconds for DNA and 26.9 seconds for protein. The measured times are consistent with quadratic growth: for protein sequences, doubling the length from 2000 to 4000 raises the running time from 1.1 s to 4.3 s, roughly a factor of four.

In some real applications, sequences can be much longer than 10000. In that case, one can cut the long sequence into several shorter pieces and find the repeated regions in each piece. If a region spans two pieces, the sequence can be re-cut around that segment to recover the whole region.

Table 3 Results for the speed test of LocRepeat

Length     2000    4000    6000    8000    10000
DNA        0.5s    2.2s    4.8s    7.0s    10.8s
PROTEIN    1.1s    4.3s    10.0s   17.7s   26.9s

Sensitivity testing using real data

We applied LocRepeat to the DNA sequence of the gene PRNP (GenBank: M13667), which contains tandem repeats. The length of the sequence is 2420. We found the local optimal pseudo-periodic region [215,327], which contains 5 pseudo-periodic units (Table 4). The pseudo-periodic region misses the first several sites of the tandem repeats, but the region and its partition into units show the tandem repeats correctly.

We also applied LocRepeat to the protein sequence LGR6 (Swiss-Prot: Q9HBX8). The length of the sequence is 828. Using PAM120 as the similarity score matrix, we found the local units listed in Table 5.

Table 4 Local optimal pseudo-periodic region for PRNP

Unit   Position   Length   Sequence
1      215–238    24       ggtggtggctgggggcagcctcat
2      239–262    24       ggtggtggctgggggcagcctcat
3      263–286    24       ggtggtggctgggggcagccccat
4      287–310    24       ggtggtggctggggacagcctcat
5      311–327    17       ggtggtggctggggtca

Table 5 Local optimal pseudo-periodic region for LGR6

Unit   Position   Length   Sequence
1      30–53      24       LSMNNLTELQPGLFHHLRFLEELR
2      54–77      24       LSGNHLSHIPGQAFSGLYSLKILM
3      78–101     24       LQNNQLGGIPAEALWELPSLQSLD
4      102–124    23       LNYNKLQEFPVAIRTLGRLQELG
5      125–148    24       FHNNNIKAIPEKAFMGNPLLQTIH
6      149–172    24       FYDNPIQFVGRSAFQYLPKLHTLS
7      173–195    23       LNGAMDIQEFPDLKGTTSLEILT
8      196–219    24       LTRAGIRLLPSGMCQQLPRLRVLE
9      220–241    22       LSHNQIEELPSLHRCQKLEEIG
10     242–265    24       LQHNRIWEIGADTFSQLSSLQALD
11     266–289    24       LSWNAIRSIHPEAFSTLHSLVKLD
12     290–311    22       LTDNQLTTLPLAGLGGLMHLKL

In conclusion, the algorithm presented in this paper makes it possible to find regions of pseudo-periodic repeats in a long sequence.
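
As an illustration of the piece-wise strategy suggested in the speed-testing discussion for sequences much longer than 10000, the following Python sketch shows one way the cutting and re-cutting could be organised. It is not part of LocRepeat. The callback `find_regions` is a hypothetical stand-in for a call to the region finder, whose programming interface is not described here, and `piece_len`, `find_regions_piecewise` and `toy_finder` are names introduced only for this example. Regions are reported as 1-based inclusive positions, as in Tables 4 and 5.

```python
import re
from typing import Callable, List, Tuple

Region = Tuple[int, int]  # (start, end), 1-based and inclusive


def find_regions_piecewise(seq: str,
                           find_regions: Callable[[str], List[Region]],
                           piece_len: int = 10000) -> List[Region]:
    """Run a region finder on fixed-length pieces, re-cutting around touched cuts.

    `find_regions` is an assumed interface: it takes a piece and returns
    regions in coordinates relative to that piece.
    """
    n = len(seq)
    kept = set()
    touched_cuts = set()  # a cut c lies between global positions c and c + 1

    # Pass 1: analyse non-overlapping pieces and translate to global positions.
    for offset in range(0, n, piece_len):
        piece = seq[offset:offset + piece_len]
        piece_end = offset + len(piece)
        for start, end in find_regions(piece):
            at_left_cut = offset > 0 and start == 1
            at_right_cut = piece_end < n and end == len(piece)
            if at_left_cut:
                touched_cuts.add(offset)
            if at_right_cut:
                touched_cuts.add(piece_end)
            # A region touching an interior cut may be truncated, so it is
            # set aside here and recovered from the re-cut window in pass 2.
            if not (at_left_cut or at_right_cut):
                kept.add((offset + start, offset + end))

    # Pass 2: re-cut a window centred on each touched cut and rerun the finder.
    half = piece_len // 2
    for cut in sorted(touched_cuts):
        lo, hi = max(0, cut - half), min(n, cut + half)
        for start, end in find_regions(seq[lo:hi]):
            g_start, g_end = lo + start, lo + end
            # Keep only regions that reach position cut or cut + 1.
            if g_start <= cut + 1 and g_end >= cut:
                kept.add((g_start, g_end))

    return sorted(kept)


def toy_finder(piece: str) -> List[Region]:
    """Toy stand-in for the real finder: runs of three or more 'a' characters."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"a{3,}", piece)]
```

With the toy finder, a run of six `a` characters that straddles the cut between two pieces of length 10 is missed or truncated in the first pass and recovered whole from the re-cut window. A region longer than the re-cut window would still be truncated, so in practice the piece length would need to exceed the longest region of interest.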