Experimental Results
Here, we summarize the performance of Rfold presented in the original paper. For more details, see the original reference.
Dataset
We extracted 151 alignments of structural RNA families from
the seed alignments in the Rfam 7.0 database.
All these alignments had annotated secondary structures that had been published in the literature.
From each family, we then selected the single representative sequence with the largest number of canonical base pairs.
From these sequences, we created four datasets (Dataset1 to Dataset4).
Dataset1 comprises these 151 RNA sequences.
Dataset2 is created from Dataset1 by appending random sequences
of length e = 100, 300, 500, and 1000 to both ends of each sequence.
Dataset3 contains a single sequence of length 172k bases,
which is obtained by alternately concatenating the sequences of Dataset1 and random sequences of length 1000.
Dataset4 comprises 10 random sequences of length 10k bases, and it is used as the control set
to estimate the false positive rate of the structure predictions.
The random sequences of these datasets were generated
by concatenating the 151 RNA sequences and shuffling the nucleotides of the sequence, while conserving the dinucleotide frequency.
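The construction of Dataset2 and Dataset3 can be sketched as follows. This is a minimal illustration, not the original pipeline: the function names are hypothetical, `random_flank` draws uniform i.i.d. nucleotides as a stand-in for the dinucleotide-preserving shuffle described above, and placing each random spacer after its RNA sequence is an assumption about the alternation order.

```python
import random

def random_flank(length, rng, alphabet="ACGU"):
    # Stand-in generator (assumption): uniform i.i.d. nucleotides.
    # The study described above used a dinucleotide-preserving shuffle instead.
    return "".join(rng.choice(alphabet) for _ in range(length))

def make_dataset2(seqs, e, rng):
    # Append a random sequence of length e to both ends of each RNA sequence.
    return [random_flank(e, rng) + s + random_flank(e, rng) for s in seqs]

def make_dataset3(seqs, spacer_len, rng):
    # Build one long sequence by alternating RNA sequences and random spacers.
    # Assumption: each spacer follows its RNA sequence.
    parts = []
    for s in seqs:
        parts.append(s)
        parts.append(random_flank(spacer_len, rng))
    return "".join(parts)
```

For example, `make_dataset2(seqs, 100, random.Random(0))` returns one flanked sequence per input, each 200 bases longer than the original.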
Accuracy Measures
To estimate the accuracy of the base pairing probabilities (BPPs) p(i, j) and the structure predictions,
we draw receiver operating characteristic (ROC) curves that represent the balance
between the sensitivity to the true base pairs
and the rate of false positives in the non-structured sequences.
In the case of BPP comparison, the sensitivity is defined by
the fraction of the true base pairs that have a BPP larger than the given threshold value p0.
The false positive rate is defined by the number of base pairs (i, j)
with p(i, j) > p0 in the non-structured sequence divided by the length of the sequence.
We draw the ROC curve by examining several values of p0.
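The two BPP measures above can be written out as a short sketch. The representation is an assumption made for illustration: BPPs are stored in a dictionary keyed by (i, j) with absent pairs treated as probability 0, and all function names are hypothetical rather than part of Rfold.

```python
def bpp_sensitivity(bpp, true_pairs, p0):
    """Fraction of true base pairs whose BPP is larger than p0."""
    if not true_pairs:
        return 0.0
    hits = sum(1 for pair in true_pairs if bpp.get(pair, 0.0) > p0)
    return hits / len(true_pairs)

def bpp_false_positive_rate(bpp_random, seq_len, p0):
    """Number of pairs with BPP > p0 in a non-structured sequence,
    divided by the length of that sequence."""
    return sum(1 for p in bpp_random.values() if p > p0) / seq_len

def roc_points(bpp, true_pairs, bpp_random, random_len, thresholds):
    """One (false positive rate, sensitivity) point per threshold p0."""
    return [
        (bpp_false_positive_rate(bpp_random, random_len, p0),
         bpp_sensitivity(bpp, true_pairs, p0))
        for p0 in thresholds
    ]
```

Note that this false positive rate is a count per nucleotide rather than a fraction of all possible pairs, so it is not bounded by 1 in principle.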
In the case of structure prediction, the sensitivity is defined by the fraction of base pairs
that are correctly predicted by the programs.
We define the false positive rate as the fraction of the non-structured sequence that is covered by inner regions (i.e., the segments enclosed by any predicted base pair).
This definition penalizes long inner regions that contain only a small number of predicted base pairs.
Furthermore, only the base pairs that satisfy the maximal span constraint are counted as true base pairs
in order to remove the trivial loss of sensitivity for distant base pairs with |i - j| > W.
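The structure-prediction measures can be sketched in the same way. The function names are hypothetical, positions are assumed to be 1-based, and counting the two paired positions themselves as part of the inner region is an assumption that the text above does not settle.

```python
def structure_sensitivity(true_pairs, predicted_pairs, max_span):
    """Fraction of true pairs satisfying |i - j| <= max_span that were predicted."""
    reachable = [(i, j) for i, j in true_pairs if abs(j - i) <= max_span]
    if not reachable:
        return 0.0
    predicted = set(predicted_pairs)
    return sum(1 for pair in reachable if pair in predicted) / len(reachable)

def inner_region_fraction(predicted_pairs, seq_len):
    """Fraction of positions enclosed by at least one predicted base pair."""
    covered = 0
    end = 0  # last position already counted (1-based, inclusive)
    # Sweep the intervals [i, j] from left to right, counting each position once.
    for i, j in sorted((min(p), max(p)) for p in predicted_pairs):
        start = max(i, end + 1)
        if j >= start:
            covered += j - start + 1
            end = j
    return covered / seq_len
```

Because nested pairs add no new covered positions, a single long-range pair that encloses little else is charged for its whole inner region, which is the penalty described above.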
Comparison of the quality of the computed local base pairing probabilities

Comparison of the accuracy of local structure predictions
We used Dataset4 for the computation of the false positive rate and Dataset3 for the sensitivity.
We examined three values of the maximal span W:
50 (circle), 100 (triangle), and 200 (square).
Since RNALfold (denoted by "Lfold" in the figure) has no parameter that controls the balance
between the sensitivity and the false positive rate,
only one point is plotted for each value of the maximal span W.
Comparison of Running Time
| Task | Program | Running time |
| BPP Computation | Rfold | 15 min |
| BPP Computation | RNAplfold | 12 min |
| Structure Prediction | Rfold | 22 min |
| Structure Prediction | RNALfold | 30 sec |
We used Dataset3, which consists of a sequence with a length of 172k bases.
The maximal span W was set to 100.