fRNAdb is a comprehensive database of non-coding RNA (ncRNA) sequences. It includes known (or previously reported) ncRNAs, which are acquired from other sequence databases (listed in Table 1), and ncRNA sequences reported by the joint research groups of the Functional RNA Project, funded by the New Energy and Industrial Technology Development Organization (NEDO).
Although the version number is 3.0, the current fRNAdb is a complete rebuild of the former version described in .
Sequence Data Representation
fRNAdb stores only nucleotide sequences, which are represented with IUPAC nucleic acid symbols (see this page for a list of the symbols). Although fRNAdb is an RNA database, 'T' is used instead of 'U' because 'T' is more common than 'U'. Capital letters are used for the sequences.
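As a rough illustration of this convention, the following Python sketch normalizes an input sequence to the form described above: capital letters, 'T' in place of 'U', and IUPAC symbols only. The function name `normalize_sequence` and the exact symbol set are assumptions made for this example and are not part of fRNAdb; gap characters are deliberately rejected here.

```python
# IUPAC nucleotide codes written with T rather than U.
# This set is an assumption for the example (no gap symbols).
IUPAC_SYMBOLS = set("ACGTRYSWKMBDHVN")


def normalize_sequence(seq: str) -> str:
    """Return seq in upper case with U replaced by T (hypothetical helper)."""
    # Drop whitespace, upper-case, then convert RNA 'U' to 'T'.
    normalized = "".join(seq.split()).upper().replace("U", "T")
    invalid = sorted(set(normalized) - IUPAC_SYMBOLS)
    if invalid:
        raise ValueError(f"non-IUPAC symbols found: {invalid}")
    return normalized
```

For example, `normalize_sequence("acgu")` returns `"ACGT"`.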
Sequence and Identifier
Sequences are compared with each other to produce a set of unique sequences. Every sequence is assigned a unique identifier composed of the prefix "FR" and a six-digit serial number (e.g., FR123456). One identifier represents one unique sequence.
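The identifier scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `is_frnadb_id` and `assign_ids` are hypothetical, and the numbering order shown (first occurrence first, starting at 1) is an assumption, not the procedure fRNAdb actually uses.

```python
import re

# "FR" followed by exactly six ASCII digits, e.g. FR123456.
FR_ID_PATTERN = re.compile(r"FR[0-9]{6}")


def is_frnadb_id(text: str) -> bool:
    """Check whether text has the FR + six-digit form (hypothetical helper)."""
    return FR_ID_PATTERN.fullmatch(text) is not None


def assign_ids(sequences, first_number=1):
    """Give each distinct sequence one identifier; duplicates share it.

    The starting number and ordering are assumptions for this example.
    """
    ids = {}
    for seq in sequences:
        if seq not in ids:
            ids[seq] = f"FR{first_number + len(ids):06d}"
    return ids
```

Calling `assign_ids(["ACGT", "TTGA", "ACGT"])` yields two identifiers, because the repeated sequence maps to the identifier it was given the first time.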
Some sequences have one or more GenBank accession number(s).
A brief description is given for every sequence. For known ncRNAs, this is usually imported unchanged from the original database. However, we supply our own relatively simple description for a known ncRNA when its original description is blank. Currently, nothing indicates whether a description comes from the original database or from us. This will be addressed in the near future.
Sequence Ontology (SO)
Each sequence is classified into one of the categories defined by the Sequence Ontology (SO). Category selection is performed manually. Even though each sequence is carefully classified, other classifications may be defensible, since SO is an evolving ontology, especially for ncRNAs. We will continue improving our annotations as SO is updated.
A sequence has one or more source organism(s). This does not mean that a single sequence was acquired from multiple organisms; rather, multiple transcripts or genomic elements are represented as a single sequence.
Many fRNAdb sequences come from source databases. fRNAdb provides links to the source databases as well as the source IDs.
Every sequence is mapped to several genomes of species supported by our UCSC GenomeBrowser for Functional RNAs irrespective of the source species in order to facilitate finding homologous regions in different organisms. Mapping information is given in two ways: one is through the use of a conventional chromosome-start-end-position notation and the other is through the utilization of a cytoband representation (e.g., Xq26.2). You can search sequences using a cytoband.
By using this mapping information, some sequences are associated with protein-coding genes. For example, some ncRNAs are mapped inside an intron of a gene, or overlap a 3'UTR on the antisense strand. fRNAdb provides gene association information when such a gene exists. Currently, UCSC Gene or RefSeq IDs of the associated genes and links to the UCSC GenomeBrowser for Functional RNA are supported.
Every sequence is compared with the other sequences in this database for sequence similarity analysis. Sequence similarity information is provided for every sequence.
Every sequence is associated with one or more articles closely related to the sequence. Titles and abstracts, which are retrieved from the PubMed database, are available. You can search sequences using words contained in these titles and abstracts, and links to the full texts are also supplied. However, full-text availability depends on your subscription status. We do not provide copies of the full-text articles.
Table 1. Databases
|H-invDB rel. 5.0||http://www.h-invitational.jp/||3|
|snoRNA-LBME-db rel. 3||http://www-snorna.biotoul.fr/||8|
|Gene Expression Omnibus (GEO)|
Two major services are provided by fRNAdb: text search and blast search.
Text search allows you to retrieve your target sequences and their related information using search terms associated with the targets. Various terms are associated with each sequence in order to support multiple ways of searching for it. The terms are collected from the identifier/accession, description text, SO category, source organism, genome mapping results (cytobands), and the titles and abstracts of the reference papers. This makes fRNAdb a powerful tool for retrieving RNA sequences associated with specific medical or biological terms such as a disease, a tissue, or a locus.
When you enter several terms, they are joined with the AND operator by default. For example, the keywords B-cell lymphoma retrieve sequences associated with both B-cell AND lymphoma. If you would like results associated with either B-cell OR lymphoma, specify the OR operator explicitly, for example B-cell OR lymphoma, which returns many more hits.
To query a term consisting of several words, such as "conserved noncoding region", enclose it in double quotes, e.g., "conserved noncoding region".
More details about boolean operators are described next.
|identifier, description, accession, sequence ontology, organism, cross reference, association overlap gene, title/abstract/author of references, cytoband, length, omim|
Boolean operators are evaluated from left to right. You can change the order of the evaluation by enclosing individual terms in parentheses. Click on the Query Translation tab to see the canonical notation of your query.
|AND||Find all entries that contain BOTH terms.|
|OR||Find all entries that contain EITHER of the terms.|
|NOT||Find all entries that contain term 1 BUT NOT term 2.|
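To make the left-to-right evaluation concrete, here is a toy Python model that applies the three operators to sets of sequence identifiers in the order they appear. It is a minimal sketch, not fRNAdb's search engine: the function `evaluate_left_to_right`, the in-memory `index`, and the identifiers in the usage note are all made up for illustration, and parentheses are not handled.

```python
def evaluate_left_to_right(tokens, index):
    """Evaluate [term, op, term, op, term, ...] strictly from left to right.

    index maps a search term to the set of entry IDs that contain it
    (a stand-in for a real search index; hypothetical for this example).
    """
    result = set(index.get(tokens[0], set()))
    # Pair each operator with the term that follows it.
    for operator, term in zip(tokens[1::2], tokens[2::2]):
        hits = index.get(term, set())
        if operator == "AND":
            result &= hits      # entries containing BOTH
        elif operator == "OR":
            result |= hits      # entries containing EITHER
        elif operator == "NOT":
            result -= hits      # entries containing term 1 BUT NOT term 2
        else:
            raise ValueError(f"unknown operator: {operator}")
    return result
```

With `index = {"a": {"FR000001", "FR000002"}, "b": {"FR000003"}, "c": {"FR000002", "FR000003"}}`, the query `a OR b AND c` is evaluated as `(a OR b) AND c`, so the result contains only FR000002 and FR000003. Conventional precedence, where AND binds tighter than OR, would also have returned FR000001.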
Multiple Search Terms
Multiple terms separated by white space are implicitly joined with the AND operator. If you want them to be treated as a single term, enclose them in a pair of double quotes. For example, "enhancing RNA" is treated as a single term.
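If you assemble query strings in a script, the quoting rule can be applied automatically. The sketch below is a hypothetical helper (`join_terms` is not an fRNAdb function); it only builds the text you would type into the search box and does not contact the database.

```python
def quote_if_needed(term: str) -> str:
    """Wrap a multi-word term in double quotes so it is treated as one term."""
    return f'"{term}"' if any(ch.isspace() for ch in term) else term


def join_terms(terms, operator="AND"):
    """Join terms with an explicit AND or OR (hypothetical helper)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(quote_if_needed(t) for t in terms)
```

For instance, `join_terms(["B-cell", "lymphoma"], "OR")` produces `B-cell OR lymphoma`, and `join_terms(["enhancing RNA", "human"])` produces `"enhancing RNA" AND human`.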
Search via Organism
You can search organism information by adding the [organism] qualifier. If you would like to perform a fuzzy query, add a percent sign (%) to your term as a wildcard, e.g., "human adenovirus%".
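The exact matching rules behind the percent sign are not documented here. The Python sketch below shows one plausible reading, assuming that % stands for any run of characters (including none) and that matching ignores case; both points are assumptions, and `wildcard_matches` is a name invented for this example.

```python
import re


def wildcard_matches(term: str, organism_name: str) -> bool:
    """Test organism_name against a term in which % is a wildcard.

    Assumes % matches any run of characters and that case is ignored;
    fRNAdb's real behaviour may differ.
    """
    # Escape literal parts so characters like '.' are not treated as regex.
    literal_parts = [re.escape(part) for part in term.split("%")]
    pattern = re.compile(".*".join(literal_parts), re.IGNORECASE)
    return pattern.fullmatch(organism_name) is not None
```

Under these assumptions, `wildcard_matches("human adenovirus%", "Human adenovirus 5")` is true, while the same term without the percent sign matches only the exact name.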
Search via Mapping Information
You can search mapping information by adding the [map] qualifier. For example, the term human[map] searches entries mapped on the latest human genome. You can also specify a chromosome number (e.g., "human chr1"[map]) or a chromosome-start-end position (e.g., "human chr1:250-500"[map]). However, a chromosome number alone (e.g., chr1[map]) is not a valid search term. A cytoband can be used to specify a genomic position (e.g., "human 20q11.21"[map]). A cytoband alone (e.g., 20q11.21[map]) is not a valid search term either.
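The rule that a locus must always be accompanied by an organism can be enforced when building [map] terms programmatically. The following is a minimal sketch; `build_map_term` is a hypothetical helper that only formats the search string in the forms shown above.

```python
def build_map_term(organism, locus=None):
    """Format a [map] search term; the organism is mandatory.

    locus may be a chromosome ("chr1"), a position ("chr1:250-500"),
    or a cytoband ("20q11.21"). Hypothetical helper for illustration.
    """
    if not organism or not organism.strip():
        raise ValueError("an organism is required; a locus alone is not valid")
    value = organism.strip()
    if locus is not None:
        value = f"{value} {locus.strip()}"
    # Multi-word values must be double-quoted to stay a single term.
    if " " in value:
        return f'"{value}"[map]'
    return f"{value}[map]"
```

For example, `build_map_term("human")` gives `human[map]`, and `build_map_term("human", "20q11.21")` gives `"human 20q11.21"[map]`.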
Search via Authors
You can search author information by adding the [author] qualifier (e.g., Kent[author] or "Kent J"[author]). You do not need to use punctuation.
Default Search Fields
When no special qualifier is specified, keyword search is performed on the default search fields which include identifier/accession, description, original database, source organism, SO, title/abstract/author. Table 2 shows an extensive list of available qualifiers.
Table 2. Search Field and Qualifiers
|ID||Contains the unique identifier of a sequence.
|Accession||Contains the GenBank accession numbers associated with a sequence.
|Description||Description text that summarizes the feature of a sequence.
|Organism||Scientific and common name of organisms associated with a sequence.
|[organism] or [org]|
|Sequence Ontology name||Sequence Ontology (SO) name of a sequence.
|[soname] or [so]|
|Sequence Ontology sub-categories||SO sub-categories associated with a sequence.
|[sonames] or [sos]|
|Xref ID||Cross reference identifiers associated with a sequence.
|[xrefid] or [xid]|
|Xref DB||Source database names of a sequence.
|[xrefdb] or [xdb]|
|Association Gene Overlap 5'UTR||A gene symbol where a sequence overlaps with its 5'UTR.
|Association Gene Overlap exon||A gene symbol where a sequence overlaps with its exon.
|Association Gene Overlap intron||A gene symbol where a sequence overlaps with its intron.
|Association Gene Overlap 3'UTR||A gene symbol where a sequence overlaps with its 3'UTR.
|Association Gene Overlap sense||A gene symbol where a sequence overlaps in its sense strand.
example: %[ov_sense] AND %[ov_5utr] (sequences that overlap with 5'UTRs in sense strand)
|Association Gene Overlap antisense||A gene symbol where a sequence overlaps in its anti-sense strand.
example: %[ov_anti] AND %[ov_3utr] (sequences that overlap with 3'UTRs in anti-sense strand)
|Association Gene Neighbor UpStream||A gene symbol located upstream where a corresponding sequence is mapped.
|[upstream] or [us]|
|Association Gene Neighbor DownStream||A gene symbol located downstream where a corresponding sequence is mapped.
|[dwstream] or [ds]|
|Mapping Information||Chromosome number where a corresponding sequence is mapped.
example: "human chr12"[map]
example: "human Xq26.2"[map]
example: "human chr12 250:500"[map]
example: "human chr12:25-500"[map]
|Pubmed ID||Contains the PubMed IDs of references associated with a sequence.
|Pubmed Author||Contains the PubMed authors of a reference associated with a sequence.<family name>[auth] and <family name>,<initial of first name>[auth] are
|[author] or [auth]|
|Pubmed Title||Contains the PubMed title of a reference associated with a sequence.
|[title] or [ti]|
|Pubmed Title or Abstract||Contains the PubMed title or abstract of a reference associated with a sequence.
|[tiab] or [ta]|
|Sequence Similarity||Contains information (ID, GenBank accession number, SO name) associated with a similar sequence.
|[similar_to] or [sim]|
|Sequence Length||Number of nucleotides contained in a sequence.
|[length] or [len]|
|OMIM||OMIM ID, title and synonym associated with a sequence.
example: LYSOPHOSPHATIDIC[omim] (word match)
example: %[omim] (retrieve all sequences associated with OMIM entries)
|Affymetrix GeneChip Exon Array ver 1.0||Affymetrix array probe associated with a sequence.
example: %[affy] (retrieve all sequences associated with the Affymetrix GeneChip Exon Array)
|Invitrogen Ncode Noncoding RNA Array||Invitrogen array probe associated with a sequence.
example: %[ivgn] (retrieve all sequences associated with the Invitrogen Ncode Noncoding RNA Array)
The Blast service allows you to perform a sequence similarity search between your query sequence and fRNAdb's sequence set. We use BLASTN (NCBI version; http://www.ncbi.nlm.nih.gov/). Because fRNAdb stores a large number of short sequences, we prepared two separate Blast databases: sequences longer than 50 bases and those shorter than 50 bases. The default options for the Blast search are changed automatically when you switch the Blast database, in order to provide optimal options for shorter sequences. You can override this by replacing "auto" with your own value.
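The division into two databases can be pictured as a simple length-based partition. The Python sketch below is only an illustration: `split_by_length` is a hypothetical name, and placing sequences of exactly 50 bases in the short set is an assumption, since the description above does not say where they belong.

```python
def split_by_length(sequences, threshold=50):
    """Partition {identifier: sequence} into (long, short) dictionaries.

    Sequences longer than threshold go to the long set. Putting
    sequences of exactly threshold bases in the short set is an
    assumption made for this example.
    """
    long_set = {k: s for k, s in sequences.items() if len(s) > threshold}
    short_set = {k: s for k, s in sequences.items() if len(s) <= threshold}
    return long_set, short_set
```

Each partition could then be written to its own FASTA file and indexed separately, so that search options tuned for short sequences apply only to the short set.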
When you cite the Functional RNA Database, please cite .