Program name:
  Rfold

Author:
  Hisanori Kiryu
  Computational Biology Research Center,
  The National Institute of Advanced Industrial Science and
  Technology (AIST), Japan.
  E-mail: kiryu-h AT aist.go.jp

License:
  The source files in the ./src/vienna/ directory contain the energy
  parameters extracted from the Vienna RNA package (version 1.5)
  (see Reference below). Please follow the ./src/vienna/COPYING file,
  which describes the copyright notice of the Vienna RNA package.

  The other parts of the source code are provided as free software.
  They are distributed in the hope that they will be useful but
  WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  Permission is granted for research, educational, and commercial use
  and modification so long as
    1) the package and any derived works are not redistributed for
       any fee, other than media costs,
    2) proper credit is given to the author and the Computational
       Biology Research Center, AIST, Japan.

  If you want to include this software in a commercial product,
  please contact the author.

Citation:
  Rfold: An exact algorithm for computing local base pairing
  probabilities
  Hisanori Kiryu; Taishin Kin; Kiyoshi Asai
  Bioinformatics 2007; doi: 10.1093/bioinformatics/btm591

Reference:
  Hofacker, I. (2003) Vienna RNA secondary structure server.
  Nucleic Acids Res., 31, 3429-3431.

Install:
  The program was tested using gcc 4.0 on linux machines and gcc 3.4
  on cygwin. Some gcc-specific features are currently used.

    cd ./src/
    make
    cd ../

Usage:
  run_rfold [options] seqfile

  seqfile:
    sequence file in fasta format.

  options:
    -command=
        COMPUTE_MEA_FOLD
            predicts local secondary structures by using the maximal
            expected accuracy (MEA) method.
        COMPUTE_PROB
            computes only the base pairing and loop probabilities.
        (default: COMPUTE_MEA_FOLD)
    -max_pair_dist=
        set the maximal allowed span W of base pairs.
        set to (-1) for the computation without the constraint on the
        maximal span.
        (default: -1)
    -outfile=
        set the output file for structure predictions.
        (default: rfold_out.txt)
    -mea_outer_loop_coeff=
        compositional weight Co for outer bases.
        (default: 1.25)
        The compositional weight Cp for base pairs is always set to
        Cp=1.0
    -mea_inner_outer_ratio=
        set the ratio Ci/Co of the compositional weights Ci, Co for
        the inner and outer unpaired bases, respectively.
        (default: 0.75)
    -print_prob=
        set to true in order to print out the base pair probabilities
        when -command=COMPUTE_MEA_FOLD
        (default: false)
    -print_loop_prob=
        set to true to print out loop probabilities Prob_L(i)
        (default: false)
    -prob_file=
        set the output file for base pair and loop probabilities
        (default: rfold_prob.txt)

Example:
  ./src/run_rfold -max_pair_dist=30 -print_prob=true ./samples/RNaseP_nuc.fa

Output:
  Both output files, for structure predictions and for base pairing
  probability computations, consist of multiple lines of tab-delimited
  columns. Sequence positions are represented in 1-based coordinates.

  Structure prediction:
    column
      1) start position of the inner region
      2) last position of the inner region
      3) MEA score of the inner region
      4) secondary structure prediction

    Example (see ./samples/rfold_out.txt):
      #start  end  score              structure
      10      26   3.529640197753906  <<<<<......>>.>>>
      80      92   2.522617340087891  <<<<.....>>>>
      94      108  3.849872589111328  <<<<<.....>>>>>

  Probability computation:
    column
      1) left base position of the base pair
      2) right base position of the base pair
         In the case of loop probabilities, columns 1 and 2 are
         identical.
      3) probability value
      4) probability type:
         'P' for base pairing probabilities,
         'L' for loop probabilities.
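
  The two column layouts listed above can be read with a short
  script. The following is a minimal sketch in Python and is not part
  of the Rfold distribution. The names Prediction, Probability,
  parse_predictions and parse_probabilities are hypothetical, and the
  sample rows are copied from this README. Fields are split on any
  whitespace, which also covers the tab-delimited files.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Prediction(NamedTuple):
    """One row of the structure prediction file (hypothetical name)."""
    start: int       # column 1: start position of the inner region (1-based)
    end: int         # column 2: last position of the inner region (1-based)
    score: float     # column 3: MEA score of the inner region
    structure: str   # column 4: secondary structure prediction


class Probability(NamedTuple):
    """One row of the probability file (hypothetical name)."""
    left: int        # column 1: left base position
    right: int       # column 2: right base position (same as left for 'L')
    prob: float      # column 3: probability value
    kind: str        # column 4: 'P' (base pair) or 'L' (loop)


def _data_rows(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the fields of each data line, skipping blanks and '#' headers."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        yield fields


def parse_predictions(lines: Iterable[str]) -> Iterator[Prediction]:
    for start, end, score, structure in _data_rows(lines):
        yield Prediction(int(start), int(end), float(score), structure)


def parse_probabilities(lines: Iterable[str]) -> Iterator[Probability]:
    for left, right, prob, kind in _data_rows(lines):
        yield Probability(int(left), int(right), float(prob), kind)


# Sample rows copied from the examples in this README.
SAMPLE_PREDICTIONS = (
    "#start\tend\tscore\tstructure\n"
    "10\t26\t3.529640197753906\t<<<<<......>>.>>>\n"
    "80\t92\t2.522617340087891\t<<<<.....>>>>\n"
    "94\t108\t3.849872589111328\t<<<<<.....>>>>>\n"
).splitlines()

SAMPLE_PROBABILITIES = (
    "#left_pos\tright_pos\tprob\ttype\n"
    "313\t317\t3.64299e-05\tP\n"
    "313\t318\t0.797102\tP\n"
).splitlines()

for pred in parse_predictions(SAMPLE_PREDICTIONS):
    print(pred.start, pred.end, pred.structure)

# Keep only base pairing probabilities above an arbitrary threshold.
for rec in parse_probabilities(SAMPLE_PROBABILITIES):
    if rec.kind == "P" and rec.prob > 0.5:
        print(rec.left, rec.right, rec.prob)
```

  To read real output, pass an open file instead of the sample lists,
  for example parse_probabilities(open("rfold_prob.txt")). In the three
  sample prediction rows, the length of the structure string equals
  end - start + 1.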

    Example (see ./samples/rfold_prob.txt):
      #left_pos  right_pos  prob         type
      316        320        2.0778e-06   P
      313        317        3.64299e-05  P
      313        318        0.797102     P
      313        319        0.000241128  P
      312        316        4.03667e-07  P
      312        317        1.05298e-06  P
      312        318        9.13579e-05  P

Version History:
  0.1-2  07-December-2012
      removed duplicated probability output in prob_file
  0.1-1  28-January-2008
      Changed compile flag to use 'double' numbers (rather than
      'float') in all the computations. This reduced numerical errors
      which occasionally produced base pairing probabilities > 1 for
      long (~100kbase) sequences.
  0.1    17-June-2007
      Initial Release
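
  The 0.1-1 entry above attributes the occasional probabilities > 1 to
  single-precision ('float') arithmetic. The general effect of
  accumulating many terms at the two precisions can be illustrated
  with the sketch below. It is not Rfold code and does not reproduce
  the Rfold computation; it only emulates single-precision rounding
  with Python's struct module. The function names and the values of
  TERM and COUNT are chosen for illustration only.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest 'float' value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(term: float, count: int, single_precision: bool) -> float:
    """Add `term` to a running total `count` times.

    With single_precision=True, the term and every partial sum are
    rounded to single precision, emulating a C 'float' accumulator.
    """
    if single_precision:
        term = to_float32(term)
    total = 0.0
    for _ in range(count):
        total += term
        if single_precision:
            total = to_float32(total)
    return total


TERM = 0.1          # illustrative value, not taken from Rfold
COUNT = 100000      # illustrative value, not taken from Rfold
EXPECTED = 10000.0  # the exact decimal result of 0.1 * 100000

error_float = abs(accumulate(TERM, COUNT, single_precision=True) - EXPECTED)
error_double = abs(accumulate(TERM, COUNT, single_precision=False) - EXPECTED)

print("accumulated error with 'float' :", error_float)
print("accumulated error with 'double':", error_double)
```

  The error of the single-precision accumulator is several orders of
  magnitude larger than that of the double-precision one, which is
  consistent with the motivation given in the 0.1-1 entry.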