DAG kernels for structural RNA analysis -- profile-profile stem kernels --

Build

You can build the kernel matrix calculator for the profile-profile stem kernels as follows:

./configure && make

The configure script accepts several options. Run it with the --help option to see the details.

The configure script will find required libraries:

If MPI is installed on your system, you can pass the '--with-mpi' option to the configure script to build an executable that can run in parallel.

Learning a Model

A typical command line for the stem kernel is:

% stem_kernel_lite -n km.dat +1 pos.fa -1 neg.fa

km.dat is the resulting precomputed kernel matrix, which can be passed to LIBSVM (e.g. svm-train -t 4 -b 1 km.dat km.model). pos.fa and neg.fa contain the positive and negative sequences, respectively, in FASTA format. The -n option normalizes km.dat to avoid sequence length bias.
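To make the shape of a precomputed kernel file concrete, here is a minimal Python sketch. It rests on two assumptions: that normalization means the usual cosine normalization K'(x,y) = K(x,y) / sqrt(K(x,x) K(y,y)), which this README does not confirm, and that the file follows LIBSVM's documented precomputed-kernel layout, "<label> 0:<serial> 1:K(i,1) ... n:K(i,n)". The function names and the kernel values are made up for illustration. They are not part of stem_kernel_lite.

```python
import math

def normalize_kernel(K):
    """Cosine-normalize a square kernel matrix (assumed meaning of -n):
    K'[i][j] = K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]

def to_libsvm_precomputed(labels, K):
    """Format each row as '<label> 0:<serial> 1:K(i,1) ... n:K(i,n)'.
    Serial numbers and column indices start at 1."""
    lines = []
    for i, (label, row) in enumerate(zip(labels, K), start=1):
        cols = " ".join(f"{j}:{v:.6g}" for j, v in enumerate(row, start=1))
        lines.append(f"{label:+d} 0:{i} {cols}")
    return lines

# Toy kernel values (hypothetical): two positive sequences, one negative.
labels = [+1, +1, -1]
K = [[4.0, 2.0, 1.0],
     [2.0, 9.0, 3.0],
     [1.0, 3.0, 16.0]]

Kn = normalize_kernel(K)
rows = to_libsvm_precomputed(labels, Kn)
print("\n".join(rows))
```

After normalization every diagonal entry is 1, so a long sequence no longer gets a large self-similarity simply because it is long.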

The executable can accept the following sequence formats:

Furthermore, filename patterns containing the wildcard '*' are accepted, for example:

% stem_kernel_lite -n km.dat +1 'pos-*.aln' -1 'neg-*.aln'
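The quotes in the command above keep the shell from expanding the pattern, which suggests that the program expands it itself. As a rough illustration of that kind of expansion, here is a Python sketch using the standard glob module. The helper expand_patterns and the file names are hypothetical. How stem_kernel_lite orders the matched files is not stated here, so the sorting is an assumption.

```python
import glob
import os
import tempfile

def expand_patterns(patterns):
    """Expand each wildcard pattern into a sorted list of matching paths."""
    files = []
    for pattern in patterns:
        files.extend(sorted(glob.glob(pattern)))
    return files

# Create a few empty, hypothetical alignment files in a scratch directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("pos-1.aln", "pos-2.aln", "neg-1.aln"):
        open(os.path.join(d, name), "w").close()

    pos = expand_patterns([os.path.join(d, "pos-*.aln")])
    neg = expand_patterns([os.path.join(d, "neg-*.aln")])
    pos_names = [os.path.basename(p) for p in pos]
    neg_names = [os.path.basename(p) for p in neg]

print(pos_names, neg_names)
```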

Prediction by the Model

% stem_kernel_lite -n x.dat +1 pos.fa -1 neg.fa --test +1 seq1.fa -1 seq2.fa ....

Before the --test option, give the same options that were used when the model was learned. After the --test option, give the class labels and sequence files to be predicted. You can then predict the classes of these sequences with LIBSVM (e.g. svm-predict -b 1 x.dat km.model output).
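The training options must be repeated because, in LIBSVM's precomputed-kernel format, each test row holds the kernel values between one test sequence and every training sequence, in training order. The Python sketch below illustrates this layout under stated assumptions: toy_kernel is a stand-in that counts shared 2-mers and is NOT the stem kernel, build_test_rows is a hypothetical helper, and cosine normalization is assumed as the meaning of -n.

```python
import math

def toy_kernel(a, b):
    """Stand-in kernel (not the stem kernel): shared 2-mer counts."""
    def kmers(s):
        counts = {}
        for i in range(len(s) - 1):
            counts[s[i:i + 2]] = counts.get(s[i:i + 2], 0) + 1
        return counts
    ca, cb = kmers(a), kmers(b)
    return float(sum(ca[k] * cb.get(k, 0) for k in ca))

def build_test_rows(train, test, test_labels, kernel):
    """One row per test sequence, one column per training sequence."""
    rows = []
    for i, (x, label) in enumerate(zip(test, test_labels), start=1):
        kxx = kernel(x, x)
        # Normalized kernel value against each training sequence, in order.
        vals = [kernel(x, t) / math.sqrt(kxx * kernel(t, t)) for t in train]
        cols = " ".join(f"{j}:{v:.6g}" for j, v in enumerate(vals, start=1))
        rows.append(f"{label:+d} 0:{i} {cols}")
    return rows

# Hypothetical sequences: the training set must match the one used to learn.
train = ["GGCAUGCC", "AUAUAUAU"]
test = ["GGCAUGCC"]
rows = build_test_rows(train, test, [+1], toy_kernel)
print("\n".join(rows))
```

If the training sequences or their order differed from those used for learning, the columns would no longer line up with the support vectors stored in the model.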

References

Contact

Kengo SATO