How can I download the list of noncoding RNAs and their annotations for the human genome? Does anyone have an ENCODE link or know of another database? Thanks.
The UCSC browser file:
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds
4 TCONS_00000720 chr1 - 209701799 209741018 209741018 209741018 7 209701799,209704591,209714042,209714652,209719401,209740254,209740879, 209702024,209704719,209714133,209714774,209719484,209740331,209741018,
4 TCONS_00000721 chr1 - 209701885 209720925 209720925 209720925 6 209701885,209714042,209714652,209716190,209719357,209720836, 209702024,209714133,209714774,209716263,209719484,209720925,
10 TCONS_l2_00001979 chr1 + 13629937 13635298 13635298 13635298 4 13629937,13632322,13633030,13634720, 13629979,13632634,13633609,13635298,
14 TCONS_l2_00001219 chr1 - 48226249 48260426 48260426 48260426 4 48226249,48240841,48244125,48260257, 48226952,48241111,48244216,48260426,
I believe txStart and txEnd should be the coordinates of the lincRNA? The Genome Browser gives no explanation. Am I correct?
Thanks
Check "describe table schema" in the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables). txStart is the transcription start position.
I am sorry, but then which are the coordinates of the lincRNA? Or is it that only the exons are given, and anything outside them may be intergenic?
bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds
585 TCONS_l2_00002359 chr1 - 16606 29370 29370 29370 8 16606,16853,17232,17605,17914,18267,24737,2932
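To make the column meanings concrete, here is a minimal sketch, assuming the genePred-style layout in the header above and UCSC's 0-based, half-open table coordinates. txStart/txEnd span the whole transcript, exonStarts/exonEnds are the exon blocks inside that span, and the gaps between consecutive exons are introns rather than intergenic sequence. cdsStart equal to cdsEnd means no coding region is annotated. The names `Transcript` and `parse_genepred_row` are made up for illustration, not part of any UCSC tool.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    strand: str
    tx_start: int                  # 0-based start of the whole transcript
    tx_end: int                    # end of the whole transcript (exclusive)
    has_cds: bool                  # False when cdsStart == cdsEnd (noncoding)
    exons: List[Tuple[int, int]]   # (start, end) of each exon block


def parse_genepred_row(line: str) -> Transcript:
    """Parse one whitespace-separated row:
    bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds
    """
    f = line.split()
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    return Transcript(
        name=f[1],
        chrom=f[2],
        strand=f[3],
        tx_start=int(f[4]),
        tx_end=int(f[5]),
        has_cds=int(f[6]) != int(f[7]),
        exons=list(zip(starts, ends)),
    )


# One of the rows posted earlier in this thread
row = ("10 TCONS_l2_00001979 chr1 + 13629937 13635298 13635298 13635298 4 "
       "13629937,13632322,13633030,13634720, "
       "13629979,13632634,13633609,13635298,")

tx = parse_genepred_row(row)

# Introns are the gaps between consecutive exons of the same transcript
introns = [(prev_end, next_start)
           for (_, prev_end), (next_start, _) in zip(tx.exons, tx.exons[1:])]

print(tx.name, tx.chrom, tx.tx_start, tx.tx_end, "coding" if tx.has_cds else "noncoding")
print("exons:  ", tx.exons)
print("introns:", introns)
```

So for the lincRNA locus as a whole you would use txStart and txEnd, and for the transcribed sequence itself you would use the exon blocks.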
I am using two annotations: Gencode and Cabili et al.
I still don't understand which ncRNAs you need. Is it only long intergenic ncRNAs, or are you interested in all long ncRNAs (intergenic and antisense)?
Poe,
I have another issue; excuse me for coming back with difficulties on the same topic.
When I downloaded the GTF file from Gencode / Cabili et al. (Broad), it contained a lot of information. How can I match the gene annotation with the BED file, which has unique records?
e.g. GTF file:
chr1 HAVANA exon 139790 139847 . - . gene_id XLOC_000658 ; transcript_id TCONS_00000437 ; exon_number 1 ; oId ENST00000493797 ; linc_name linc-ZNF692-6 ; tss_id TSS1017 ; class_code u ; gene_name linc-ZNF692-6 ;
chr1 HAVANA exon 140075 140339 . - . gene_id XLOC_000658 ; transcript_id TCONS_00000437 ; exon_number 2 ; oId ENST00000493797 ; linc_name linc-ZNF692-6 ; tss_id TSS1017 ; class_code u ; gene_name linc-ZNF692-6 ;
while the BED file is:
chr1 139789 140339 TCONS_00000437 0 - 139789 140339 0,0,0 2 58,265, 0,285,
chr1 141473 149707 TCONS_00000438 0 - 141473 149707 0,0,0 2 1538,3322, 0,4912,
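One way to line the two files up, as a sketch rather than a definitive recipe: the fourth BED column holds the same identifier as `transcript_id` in the GTF, so that can serve as the join key. GTF coordinates are 1-based and inclusive while BED is 0-based and half-open, which is why the GTF exon start 139790 appears as 139789 in the BED record. The code below groups the GTF exon lines by `transcript_id`, rebuilds the BED12 record, and keeps the annotation tags for lookup. The function names are hypothetical, and it assumes attributes are written as `key value ;` pairs like in the lines posted above (quotes around values are stripped if present).

```python
from collections import defaultdict


def parse_gtf_line(line):
    """Split a GTF line into its fixed fields plus a dict of attribute tags."""
    chrom, _source, feature, start, end, _score, strand, _frame, attrs = line.split(None, 8)
    tags = {}
    for item in attrs.split(";"):
        parts = item.split()
        if len(parts) >= 2:
            tags[parts[0]] = parts[1].strip('"')
    return chrom, feature, int(start), int(end), strand, tags


def gtf_exons_to_bed12(lines):
    """Return ({transcript_id: BED12 fields}, {transcript_id: attribute tags})."""
    exons = defaultdict(list)
    placement = {}
    annotation = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, feature, start, end, strand, tags = parse_gtf_line(line)
        if feature != "exon":
            continue
        tid = tags["transcript_id"]
        exons[tid].append((start - 1, end))  # 1-based inclusive -> 0-based half-open
        placement[tid] = (chrom, strand)
        annotation[tid] = tags
    bed = {}
    for tid, blocks in exons.items():
        blocks.sort()
        chrom, strand = placement[tid]
        tx_start, tx_end = blocks[0][0], blocks[-1][1]
        block_sizes = ",".join(str(e - s) for s, e in blocks) + ","
        block_starts = ",".join(str(s - tx_start) for s, _ in blocks) + ","
        bed[tid] = [chrom, tx_start, tx_end, tid, 0, strand,
                    tx_start, tx_end, "0,0,0", len(blocks), block_sizes, block_starts]
    return bed, annotation


# The two GTF exon lines posted above
gtf_lines = [
    "chr1 HAVANA exon 139790 139847 . - . gene_id XLOC_000658 ; transcript_id TCONS_00000437 ; "
    "exon_number 1 ; oId ENST00000493797 ; linc_name linc-ZNF692-6 ; tss_id TSS1017 ; "
    "class_code u ; gene_name linc-ZNF692-6 ;",
    "chr1 HAVANA exon 140075 140339 . - . gene_id XLOC_000658 ; transcript_id TCONS_00000437 ; "
    "exon_number 2 ; oId ENST00000493797 ; linc_name linc-ZNF692-6 ; tss_id TSS1017 ; "
    "class_code u ; gene_name linc-ZNF692-6 ;",
]

bed, annotation = gtf_exons_to_bed12(gtf_lines)
bed_line = " ".join(str(x) for x in bed["TCONS_00000437"])
print(bed_line)
print(annotation["TCONS_00000437"]["gene_name"], annotation["TCONS_00000437"]["gene_id"])
```

With the real files you would read the BED file, take column 4 of each record, and look it up in `annotation` to pull out `gene_name`, `gene_id`, `class_code` and so on.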
Do you want lincRNA (as the title suggests) or all non-protein-coding RNA (as the rest of the question suggests)? There are quite a lot of databases available, easily found via a simple Web search.
How about lincRNA only?
For lincRNA please see Ido Tamir and my post.