fRNAdb::Help

Introduction

fRNAdb is a comprehensive database of non-coding RNA (ncRNA) sequences. It includes known (previously reported) ncRNAs acquired from other sequence databases (listed in Table 1), as well as ncRNA sequences reported by the joint research groups of the Functional RNA Project, funded by the New Energy and Industrial Technology Development Organization (NEDO).

Although the version number is 3.0, the current fRNAdb is a complete rebuild of the former version described in [1].

Sequence Data Representation

fRNAdb stores only nucleotide sequences, which are represented with IUPAC nucleic acid symbols (see this page for a list of the symbols). Although fRNAdb is an RNA database, 'T' is used in place of 'U' because 'T' is more common than 'U'. Sequences are written in capital letters.
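The conventions above can be sketched as a small normalization step. This is an illustrative sketch rather than fRNAdb's own code: the name `normalize_sequence` is hypothetical, and rejecting gap characters such as '-' is an assumption.

```python
# IUPAC nucleic acid symbols, written with 'T' rather than 'U'.
IUPAC_SYMBOLS = set("ACGTRYSWKMBDHVN")


def normalize_sequence(seq: str) -> str:
    """Upper-case a sequence, replace 'U' with 'T', and check the alphabet.

    Hypothetical helper: gap characters are treated as invalid (assumption).
    """
    normalized = seq.strip().upper().replace("U", "T")
    invalid = sorted(set(normalized) - IUPAC_SYMBOLS)
    if invalid:
        raise ValueError(f"non-IUPAC symbols: {''.join(invalid)}")
    return normalized


print(normalize_sequence("augcuuagn"))  # ATGCTTAGN
```

A query sequence written in lower case or with 'U' can be passed through such a step before it is compared with database entries.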

Sequence and Identifier

Sequences are compared with one another to build a set of unique sequences. Every sequence is assigned a unique identifier composed of the prefix "FR" followed by a six-digit serial number (e.g., FR123456). One identifier represents one unique sequence.
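The identifier format lends itself to a simple check. The following is a minimal sketch; the helper names are hypothetical, and accepting every number from 0 to 999999 is an assumption, since the page does not say where numbering starts.

```python
import re

# "FR" followed by exactly six digits, e.g. FR123456.
FRNADB_ID = re.compile(r"FR\d{6}")


def is_frnadb_id(text: str) -> bool:
    """Return True if text has the form of an fRNAdb identifier."""
    return FRNADB_ID.fullmatch(text) is not None


def format_frnadb_id(number: int) -> str:
    """Zero-pad a serial number to six digits and add the FR prefix."""
    if not 0 <= number <= 999_999:
        raise ValueError("serial number must fit in six digits")
    return f"FR{number:06d}"


print(is_frnadb_id("FR123456"))  # True
print(is_frnadb_id("FR12345"))   # False
print(format_frnadb_id(1058))    # FR001058
```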

Accession Number

Some sequences have one or more GenBank accession number(s).

Description

A brief description is given for every sequence. For known ncRNAs, this is usually imported unchanged from the original database. When the original description of a known ncRNA is blank, we supply a relatively simple description of our own. Currently, nothing indicates whether a description comes from the original database or from us. This will be addressed in the near future.

Sequence Ontology (SO)

Each sequence is classified into one of the categories defined by the Sequence Ontology (SO). Category selection is performed manually. Although each sequence is carefully classified, other classifications may be possible because SO is an evolving ontology, especially for ncRNAs. We will continue to improve our annotations as SO is updated.

Source Organism

A sequence has one or more source organisms. This does not mean that a single sequence was acquired from multiple organisms; rather, multiple transcripts or genomic elements are represented as a single sequence.

Cross Reference

Many fRNAdb sequences come from source databases. For these sequences, fRNAdb provides links to each source database together with the corresponding source IDs.

Genome Mapping

Every sequence is mapped to the genomes of several species supported by our UCSC GenomeBrowser for Functional RNAs, irrespective of the source species, to help find homologous regions in different organisms. Mapping information is given in two forms: conventional chromosome-start-end-position notation and cytoband representation (e.g., Xq26.2). You can search sequences using a cytoband.
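The two notations can be illustrated with a short parsing sketch. The textual forms assumed here ("chr1:250-500" for a position and "Xq26.2" for a cytoband) follow the examples on this page; the class and function names are hypothetical, and the cytoband pattern covers only the simple human-style form.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


class Cytoband(NamedTuple):
    chrom: str
    arm: str
    band: str


# Assumed forms: "chr1:250-500" and "Xq26.2" (simple cases only).
_REGION = re.compile(r"(chr\w+):(\d+)-(\d+)")
_CYTOBAND = re.compile(r"(\d{1,2}|[XY])([pq])(\d+(?:\.\d+)?)")


def parse_region(text: str) -> Region:
    """Split chromosome-start-end notation into its three parts."""
    match = _REGION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome position: {text!r}")
    start, end = int(match[2]), int(match[3])
    if start > end:
        raise ValueError("start must not exceed end")
    return Region(match[1], start, end)


def parse_cytoband(text: str) -> Cytoband:
    """Split a cytoband such as Xq26.2 into chromosome, arm, and band."""
    match = _CYTOBAND.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a cytoband: {text!r}")
    return Cytoband(match[1], match[2], match[3])


print(parse_region("chr1:250-500"))  # Region(chrom='chr1', start=250, end=500)
print(parse_cytoband("Xq26.2"))      # Cytoband(chrom='X', arm='q', band='26.2')
```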

Gene Association

Using this mapping information, some sequences are associated with protein-coding genes. For example, some ncRNAs map inside an intron of a gene, and others overlap the antisense strand of a 3'UTR. fRNAdb provides gene association information when such a gene exists. Currently, UCSC Gene or RefSeq IDs of the associated genes and links to the UCSC GenomeBrowser for Functional RNAs are supported.

Sequence Similarity

Every sequence is compared with the other sequences in this database for sequence similarity analysis. Sequence similarity information is provided for every sequence.

Reference Information

Every sequence is associated with one or more articles closely related to the sequence. Titles and abstracts, retrieved from the PubMed database, are available. You can search sequences using words contained in these titles and abstracts. Links to the full texts are also supplied; however, full-text availability depends on your subscription status. We do not provide copies of the full-text articles.

Table 1. Databases

Database                                        URL                                       Reference
FANTOM3                                         http://fantom3.gsc.riken.jp/              [2]
H-invDB rel. 5.0                                http://www.h-invitational.jp/             [3]
miRBase v10.0                                   http://microrna.sanger.ac.uk/             [4]
NONCODE v1.0                                    http://www.noncode.org/                   [5]
Rfam v8.1                                       http://www.sanger.ac.uk/Software/Rfam/    [6]
RNAdb v2.0                                      http://research.imb.uq.edu.au/rnadb/      [7]
snoRNA-LBME-db rel. 3                           http://www-snorna.biotoul.fr/             [8]
Gene Expression Omnibus (GEO) (partial import)  http://www.ncbi.nlm.nih.gov/geo/          [9]

Services

Two major services are provided by fRNAdb: text search and blast search.

Text Search

Text search allows you to retrieve your target sequences and their related information using search terms associated with the targets. Various terms are associated with a sequence in order to support multiple ways of searching a sequence. The terms are accumulated from the identifier/accession, description text, SO category, source organism, genome mapping results (cytobands), and titles and abstracts of the reference papers. This makes fRNAdb a powerful tool for retrieving RNA sequences associated with specific medical/biological terms such as a disease, a tissue or a locus.
When you enter several terms, they are joined with the AND operator by default. For example, the keywords B-cell lymphoma retrieve sequences associated with both B-cell AND lymphoma. To retrieve sequences associated with either term, specify the OR operator explicitly, for example B-cell OR lymphoma, which returns many more hits. To query a term consisting of several words, such as conserved noncoding region, enclose it in double quotes, e.g., "conserved noncoding region". Boolean operators are described in more detail next.

Keyword Sources
identifier, description, accession, sequence ontology, organism, cross reference, gene association, title/abstract/author of references, cytoband, length

Boolean Operators

Boolean operators are evaluated from left to right. You can change the order of the evaluation by enclosing individual terms in parentheses. Click on the Query Translation tab to see the canonical notation of your query.

AND Find all entries that contain BOTH terms.
OR Find all entries that contain EITHER of the terms.
NOT Find all entries that contain term 1 BUT NOT term 2.
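To make the left-to-right rule concrete, the sketch below evaluates a flat query over a toy index using Python sets. The index contents and identifiers are invented for illustration, the function name is hypothetical, and parentheses are not parsed; this is not how the fRNAdb server is implemented.

```python
# Toy index: search term -> set of matching identifiers (invented data).
INDEX = {
    "B-cell": {"FR000001", "FR000002"},
    "lymphoma": {"FR000002", "FR000003"},
    "tRNA": {"FR000003", "FR000004"},
}


def evaluate(tokens, index):
    """Evaluate [term, op, term, op, term, ...] strictly left to right."""
    if len(tokens) % 2 == 0:
        raise ValueError("expected: term (operator term)*")
    result = set(index.get(tokens[0], set()))
    for op, term in zip(tokens[1::2], tokens[2::2]):
        hits = index.get(term, set())
        if op == "AND":
            result &= hits
        elif op == "OR":
            result |= hits
        elif op == "NOT":
            result -= hits
        else:
            raise ValueError(f"unknown operator: {op}")
    return result


# B-cell OR lymphoma AND tRNA is read as (B-cell OR lymphoma) AND tRNA.
flat = evaluate(["B-cell", "OR", "lymphoma", "AND", "tRNA"], INDEX)
print(sorted(flat))  # ['FR000003']

# Grouping as B-cell OR (lymphoma AND tRNA) gives a different answer.
grouped = INDEX["B-cell"] | (INDEX["lymphoma"] & INDEX["tRNA"])
print(sorted(grouped))  # ['FR000001', 'FR000002', 'FR000003']
```

The two results differ, which is why parentheses matter when AND and OR are mixed in one query.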

Multiple Search Terms

Multiple terms separated by white space are implicitly joined with the AND operator. If you want them to be treated as a single term, enclose them in a pair of double quotes. For example, "enhancing RNA" is treated as a single term.
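The quoting rule can be sketched as a small query builder. The function names are hypothetical, and the sketch only assembles the query string; it does not contact fRNAdb.

```python
def quote_term(term: str) -> str:
    """Wrap a multi-word term in double quotes so it is treated as one term."""
    term = term.strip()
    if any(ch.isspace() for ch in term):
        return f'"{term}"'
    return term


def build_query(terms, operator="AND"):
    """Join terms; AND is implicit (a space), OR must be written out."""
    if operator not in {"AND", "OR"}:
        raise ValueError("operator must be AND or OR")
    quoted = [quote_term(t) for t in terms]
    separator = " " if operator == "AND" else " OR "
    return separator.join(quoted)


print(build_query(["B-cell", "lymphoma"]))        # B-cell lymphoma
print(build_query(["B-cell", "lymphoma"], "OR"))  # B-cell OR lymphoma
print(build_query(["enhancing RNA", "human"]))    # "enhancing RNA" human
```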

Search via Mapping Information

You can search mapping information by adding the [map] qualifier. For example, the term human[map] finds entries mapped to the latest human genome. You can also specify a chromosome number (e.g., "human chr1"[map]) or a chromosome-start-end position (e.g., "human chr1:250-500"[map]). However, a chromosome number alone (e.g., chr1[map]) is not a valid search term. A cytoband can also be used to specify a genomic position (e.g., "human 20q11.21"[map]). As with chromosome numbers, a cytoband alone (e.g., 20q11.21[map]) is not a valid search term.
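Because a location is only valid together with a species, a helper that builds [map] terms can make the species a required argument. This is a minimal sketch; `map_term` is a hypothetical name and the function only formats the search string.

```python
from typing import Optional


def map_term(species: str, location: Optional[str] = None) -> str:
    """Build a [map] search term; the species name is mandatory.

    location may be a chromosome ("chr1"), a position ("chr1:250-500"),
    or a cytoband ("20q11.21").
    """
    species = species.strip()
    if not species:
        raise ValueError("a species is required; a location alone is not valid")
    if location is None:
        return f"{species}[map]"
    return f'"{species} {location.strip()}"[map]'


print(map_term("human"))                  # human[map]
print(map_term("human", "chr1"))          # "human chr1"[map]
print(map_term("human", "chr1:250-500"))  # "human chr1:250-500"[map]
print(map_term("human", "20q11.21"))      # "human 20q11.21"[map]
```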

Search via Authors

You can search author information by adding the [author] qualifier (e.g., Okida[author] or "Okida H"[author]). No punctuation is needed.

Default Search Fields

When no qualifier is specified, the keyword search is performed on the default search fields: identifier/accession, description, original database, source organism, SO, and title/abstract/author. Table 2 shows an extensive list of available qualifiers.

Table 2. Search Field and Qualifiers

Search Field: ID
Definition: Contains the unique identifier of a sequence.
Qualifier: [id]
Example: FR000001[id]

Search Field: Accession
Definition: Contains the GenBank accession numbers associated with a sequence.
Qualifier: [acc]
Example: AF123456[acc]

Search Field: Description
Definition: Description text that summarizes the features of a sequence.
Qualifier: [desc]
Example: tRNA[desc]

Search Field: Xref ID
Definition: Cross reference identifiers associated with a sequence.
Qualifier: [xrefid] or [xid]
Example: PIR212436[xrefid]

Search Field: Gene Association ID
Definition: Identifiers of genes associated with a sequence.
Qualifier: [assoc]
Example: TGM2[assoc]

Search Field: Xref DB
Definition: Source database names of a sequence.
Qualifier: [xrefdb] or [xdb]
Example: RNAdb[xrefdb]

Search Field: Organism
Definition: Scientific and common names of organisms associated with a sequence.
Qualifier: [organism] or [org]
Example: human[organism]

Search Field: Sequence Ontology name
Definition: Sequence Ontology (SO) name of a sequence.
Qualifier: [soname] or [so]
Example: antisense_RNA[soname]

Search Field: Sequence Ontology sub-categories
Definition: SO sub-categories associated with a sequence.
Qualifier: [sonames] or [sos]
Example: snoRNA[sonames]

Search Field: Mapping Information
Definition: Chromosome (optionally with a position range or cytoband) to which a sequence is mapped.
Qualifier: [map]
Example: "human chr12"[map]
Example: "human Xq26.2"[map]
Example: "human chr12 250:500"[map]
Example: "human chr12:25-500"[map]

Search Field: PubMed ID
Definition: Contains the PubMed IDs of references associated with a sequence.
Qualifier: [pmid]
Example: 16683036[pmid]

Search Field: PubMed Author
Definition: Contains the PubMed authors of a reference associated with a sequence. <family name>[auth] and <family name>,<initial of first name>[auth] are supported.
Qualifier: [author] or [auth]
Example: kent[author]
Example: kent,J[author]

Search Field: PubMed Title
Definition: Contains the PubMed title of a reference associated with a sequence.
Qualifier: [title] or [ti]
Example: snoRNA[title]

Search Field: PubMed Title or Abstract
Definition: Contains the PubMed title or abstract of a reference associated with a sequence.
Qualifier: [tiab] or [ta]
Example: noncoding[tiab]

Search Field: Sequence Similarity
Definition: Contains information (ID, GenBank accession number, SO name) associated with a similar sequence.
Qualifier: [similar_to] or [sim]
Example: FR001058[similar_to]

Search Field: Sequence Length
Definition: Number of nucleotides contained in a sequence.
Qualifier: [length] or [len]
Example: 21[len]
Example: 21:23[len]
Example: :21[len]
Example: 21:[len]
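Qualified terms such as those in Table 2 can be assembled programmatically. The sketch below uses hypothetical helper names and only formats strings. It assumes that an open-ended range such as :21[len] or 21:[len] means "up to 21" or "21 and above", which the table suggests but does not state, and that qualified terms can be combined with the implicit AND like plain keywords.

```python
from typing import Optional


def qualify(value: str, qualifier: str) -> str:
    """Attach a qualifier such as 'desc' or 'org'; quote multi-word values."""
    value = value.strip()
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{value}[{qualifier}]"


def length_term(minimum: Optional[int] = None,
                maximum: Optional[int] = None) -> str:
    """Format a [len] term: 21[len], 21:23[len], :21[len], or 21:[len]."""
    if minimum is None and maximum is None:
        raise ValueError("give a minimum, a maximum, or both")
    if minimum is not None and maximum is not None:
        if minimum > maximum:
            raise ValueError("minimum must not exceed maximum")
        if minimum == maximum:
            return f"{minimum}[len]"
    low = "" if minimum is None else str(minimum)
    high = "" if maximum is None else str(maximum)
    return f"{low}:{high}[len]"


query = " ".join([
    qualify("tRNA", "desc"),
    qualify("human", "org"),
    length_term(21, 23),
])
print(query)  # tRNA[desc] human[org] 21:23[len]
```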

Blast Search

The Blast service lets you run a sequence similarity search between your query sequence and fRNAdb's sequence set. We use BLASTN (NCBI version; http://www.ncbi.nlm.nih.gov/). Because fRNAdb stores a large number of short sequences, we prepared two separate Blast databases: one for sequences longer than 50 bases and one for sequences shorter than 50 bases. When you switch the Blast database, the default search options change automatically so that options suited to shorter sequences are used. You can override this by replacing "auto" with your own value.
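When preparing queries in bulk, the choice between the two databases can be made from the query length. The following is a minimal sketch: the labels "short" and "long" are hypothetical placeholders rather than the names used on the search form, and assigning a query of exactly 50 bases to the long database is an assumption, since the text does not say where such sequences belong.

```python
SHORT_DB = "short"  # hypothetical label for the database of short sequences
LONG_DB = "long"    # hypothetical label for the database of long sequences
LENGTH_THRESHOLD = 50


def choose_blast_database(query: str) -> str:
    """Pick a database label from the query length, ignoring white space.

    Assumption: a query of exactly 50 bases goes to the long database.
    """
    length = len("".join(query.split()))
    return SHORT_DB if length < LENGTH_THRESHOLD else LONG_DB


print(choose_blast_database("TGAGGTAGTAGGTTGTATAGTT"))  # short
print(choose_blast_database("ACGT" * 30))               # long
```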

Database Citation

When you cite the Functional RNA Database, please cite [10].

References