What Is the Contents of Genomes?
The Contents of Genomes presents *basic* statistics on genome contents for human, mouse, and rat. We assigned every base in a genome to a genomic content category so that users can see at a glance how many bases correspond to exons, introns, UTRs, repetitive elements, etc. The assignment is based on the genome annotation provided by the UCSC Genome Browser.
METHODS
Here is the procedure we used for the assignment of bases in a genome.
Every base is checked for overlap with several essential tracks of the UCSC Genome Browser. A base can overlap several tracks, which would give it multiple assignments. Because we want to assign exactly one feature to each base, we defined an assignment priority for every track we use. The highest-priority track is checked first. The order of checking is as follows:
- the repetitive element is based on the RepeatMasker track
- the tandem repeat is based on the Simple Repeat track by Tandem Repeats Finder
- the pseudo gene is based on the Yale Pseudo Gene track
- the known gene is based on the UCSC Known Gene and RefSeq gene tracks
- the genscan gene is based on the Genscan track for putative genes
- the small RNA annotation is based on the RfamFull track, the Jones and Eddy track, and the Known miRNA track
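
The priority rule described above can be illustrated with a small sketch. This is not the code used to build the site; it is a minimal Python example that assumes the list order above is the priority order (highest first), that each track is available as a list of half-open `(start, end)` intervals, and that bases covered by no track get a placeholder label. The function name `assign_bases`, the label `"unannotated"`, and the toy coordinates are all hypothetical.

```python
from collections import Counter

# Priority order assumed from the list above (highest priority first).
PRIORITY = [
    "repetitive element",
    "tandem repeat",
    "pseudo gene",
    "known gene",
    "genscan gene",
    "small RNA",
]


def assign_bases(length, tracks, priority=PRIORITY, default="unannotated"):
    """Return one label per base; the highest-priority overlapping track wins.

    `tracks` maps a feature name to a list of half-open (start, end) intervals.
    """
    labels = [default] * length
    # Walk from lowest to highest priority so that higher-priority
    # tracks overwrite anything written by lower-priority ones.
    for name in reversed(priority):
        for start, end in tracks.get(name, []):
            for pos in range(max(start, 0), min(end, length)):
                labels[pos] = name
    return labels


# Hypothetical 20-base toy sequence; the coordinates are made up.
toy_tracks = {
    "repetitive element": [(0, 5)],
    "known gene": [(3, 12)],
    "genscan gene": [(10, 16)],
}

labels = assign_bases(20, toy_tracks)
counts = Counter(labels)
for feature, n in counts.most_common():
    print(f"{feature}: {n} bases")
```

In this toy input, bases 3 and 4 overlap both a repeat and a known gene and are counted as repetitive element, because that track has the higher priority. A per-base list is only practical for small examples; a whole-genome run would more likely use a compact array or an interval sweep.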