Program name:
Rfold
Author:
Hisanori Kiryu
Computational Biology Research Center,
The National Institute of
Advanced Industrial Science and Technology (AIST), Japan.
E-mail: kiryu-h AT aist.go.jp
License:
The source files in the ./src/vienna/ directory
contain the energy parameters extracted
from the Vienna RNA package (version 1.5) (see Reference below).
Please follow the ./src/vienna/COPYING file,
which describes the copyright notice of the Vienna RNA package.
The rest of the source code is provided as free software.
It is distributed in the hope that it will be useful
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Permission is granted for research, educational, and commercial use
and modification so long as
1) the package and any derived works are not
redistributed for any fee, other than media costs,
2) proper credit is given to the author and the
Computational Biology Research Center, AIST, Japan.
If you want to include this software in a commercial product, please contact
the author.
Citation:
Rfold: An exact algorithm for computing local base pairing probabilities
Hisanori Kiryu; Taishin Kin; Kiyoshi Asai
Bioinformatics 2007; doi: 10.1093/bioinformatics/btm591
Reference:
Hofacker, I. (2003) Vienna RNA secondary structure server.
Nucleic Acids Res., 31, 3429-3431.
Install:
The program was tested with
gcc 4.0 on Linux machines and gcc 3.4 on Cygwin.
Some gcc-specific features are currently used.
cd ./src/
make
cd ../
Usage:
run_rfold [options] seqfile
seqfile:
sequence file in fasta format.
options:
-command=<COMPUTE_MEA_FOLD|COMPUTE_PROB>
COMPUTE_MEA_FOLD predicts local secondary structures by using
the maximal expected accuracy (MEA) method.
COMPUTE_PROB computes only the base pairing and loop probabilities.
(default: COMPUTE_MEA_FOLD)
-max_pair_dist=<integer>
set the maximal allowed span W of base pairs.
set to -1 to compute without a constraint on the
maximal span. (default: -1)
-outfile=<filename>
set the output file for structure predictions. (default: rfold_out.txt)
-mea_outer_loop_coeff=<float>
set the compositional weight Co for outer unpaired bases. (default: 1.25)
The compositional weight Cp for base pairs is always set to Cp=1.0
-mea_inner_outer_ratio=<float>
set the ratio Ci/Co of the compositional weights Ci, Co for the inner
and outer unpaired bases, respectively. (default: 0.75)
-print_prob=<bool>
set to true in order to print out the base pair probabilities
when -command=COMPUTE_MEA_FOLD (default: false)
-print_loop_prob=<bool>
set to true to print out loop probabilities Prob_L(i) (default: false)
-prob_file=<filename>
set output file for base pair and loop probabilities (default: rfold_prob.txt)
Example:
./src/run_rfold -max_pair_dist=30 -print_prob=true ./samples/RNaseP_nuc.fa
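
The same command line can be assembled from a script. The following is
a minimal Python sketch, not part of the Rfold package: the helper names
build_rfold_command and run_rfold are hypothetical, and the binary path
./src/run_rfold is assumed to be the one produced by the Install step.
Only the option names listed above are used.

```python
import subprocess


def build_rfold_command(seqfile, binary="./src/run_rfold", **options):
    """Build the argument list 'run_rfold [options] seqfile'.

    Each keyword becomes '-name=value'; booleans are written as
    'true'/'false', matching the <bool> options described above.
    """
    args = [binary]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.append(f"-{name}={value}")
    args.append(seqfile)
    return args


def run_rfold(seqfile, **options):
    """Run the program (hypothetical wrapper; needs the built binary)."""
    subprocess.run(build_rfold_command(seqfile, **options), check=True)


cmd = build_rfold_command(
    "./samples/RNaseP_nuc.fa", max_pair_dist=30, print_prob=True
)
print(" ".join(cmd))
```

Calling run_rfold with the same arguments would execute the command;
the output files are then written to the defaults given under options
unless -outfile or -prob_file is set.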
Output:
Both output files (structure predictions and base pairing
probabilities) consist of multiple lines of tab-delimited
columns.
Sequence positions are given in 1-based coordinates.
Structure prediction:
column 1) start position of the inner region
column 2) last position of the inner region
column 3) MEA score of the inner region
column 4) secondary structure prediction
Example (see ./samples/rfold_out.txt):
#start end score structure
10 26 3.529640197753906 <<<<<......>>.>>>
80 92 2.522617340087891 <<<<.....>>>>
94 108 3.849872589111328 <<<<<.....>>>>>
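
The structure file can be read with a few lines of Python. This is a
sketch under stated assumptions, not code shipped with Rfold: the
function names are hypothetical, '<' and '>' are assumed to mark the
left and right partners of a base pair and '.' an unpaired base, and
the structure string is assumed to cover positions start..end
inclusive (which holds for the three sample lines above).

```python
def parse_rfold_structures(lines):
    """Parse structure-prediction lines into (start, end, score, structure)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#start end ...' header
        start, end, score, structure = line.split()
        start, end = int(start), int(end)
        if len(structure) != end - start + 1:
            raise ValueError(f"structure length mismatch in: {line}")
        records.append((start, end, float(score), structure))
    return records


def base_pairs(start, structure):
    """Convert a bracket string into sorted 1-based (left, right) pairs."""
    stack, pairs = [], []
    for offset, ch in enumerate(structure):
        pos = start + offset
        if ch == "<":
            stack.append(pos)
        elif ch == ">":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


sample = [
    "#start end score structure",
    "10 26 3.529640197753906 <<<<<......>>.>>>",
    "80 92 2.522617340087891 <<<<.....>>>>",
]
for start, end, score, structure in parse_rfold_structures(sample):
    print(start, end, base_pairs(start, structure))
```

Splitting on any whitespace accepts both the tab-delimited files and
the space-separated listing shown above.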
Probability computation:
column 1) left base position of the base pair
column 2) right base position of the base pair
(for loop probabilities, columns 1 and 2 are identical)
column 3) probability value
column 4) probability type: 'P' for base pairing probabilities,
'L' for loop probabilities.
Example (see ./samples/rfold_prob.txt):
#left_pos right_pos prob type
316 320 2.0778e-06 P
313 317 3.64299e-05 P
313 318 0.797102 P
313 319 0.000241128 P
312 316 4.03667e-07 P
312 317 1.05298e-06 P
312 318 9.13579e-05 P
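
The probability file can be loaded in the same way. The sketch below
is an illustration only: the function name parse_rfold_probs and the
0.5 cutoff are hypothetical choices, not part of Rfold. It separates
'P' and 'L' rows using the column layout described above.

```python
def parse_rfold_probs(lines):
    """Return (pair_probs, loop_probs) from probability-file lines.

    pair_probs maps (left, right) to a base pairing probability;
    loop_probs maps a position i to its loop probability.
    """
    pair_probs, loop_probs = {}, {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#left_pos ...' header
        left, right, prob, kind = line.split()
        left, right, prob = int(left), int(right), float(prob)
        if kind == "P":
            pair_probs[(left, right)] = prob
        elif kind == "L":
            loop_probs[left] = prob  # columns 1 and 2 are identical here
        else:
            raise ValueError(f"unknown probability type: {kind}")
    return pair_probs, loop_probs


sample = [
    "#left_pos right_pos prob type",
    "313 317 3.64299e-05 P",
    "313 318 0.797102 P",
    "313 319 0.000241128 P",
]
pair_probs, loop_probs = parse_rfold_probs(sample)
# Keep only confident pairs; the 0.5 threshold is an arbitrary example.
confident = [pair for pair, p in pair_probs.items() if p >= 0.5]
print(confident)
```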
Version History:
0.1-1 28-January-2008 Changed compile flag to use 'double' numbers
(rather than 'float') in all the computations.
This reduced numerical errors
which occasionally produced base pairing probabilities > 1
for long (~100kbase) sequences.
0.1 17-June-2007 Initial Release