How to get the chromosomal position of a transcript from an NCBI RefSeq ncRNA ID?
5.5 years ago

I have a list of NCBI RefSeq ncRNA IDs, for example NR_046233, NR_015516, NR_102721, and I need the positions of these transcripts on the chromosome.

I have tried converting the RefSeq IDs to Ensembl IDs and using the R package ensembldb to get the chromosome location of each transcript, but many RefSeq ncRNA IDs have no corresponding Ensembl ID.

Is there any way to get the chromosome location of a transcript directly from its RefSeq ncRNA ID?

Assembly RNA-Seq sequencing gene • 3.4k views
5.5 years ago
c.chakraborty ▴ 180

Use BioMart from Ensembl:

1. Go to the Ensembl BioMart tool.
2. Choose the database; I usually choose Ensembl Genes.
3. On the left-hand side you will find Filters. Click on it and then choose Gene.
4. In the second drop-down you will find "Input External references". Tick this one. In the drop-down box next to it, select RefSeq ncRNA ID (there you have it! :)), and paste in all the RefSeq IDs of the ncRNAs you are interested in.
5. Then on the left you will see Attributes. Click on that.
6. Then select Gene. Here you can select chromosome/scaffold name, strand, transcription start site, or transcript length.

Hopefully you will then have almost all the information for your ncRNAs.
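The same filter and attribute selection can also be expressed as a BioMart XML query and sent to the Ensembl mart service from the command line. The following is only a sketch under assumptions that are not stated in the answer: the mouse dataset name `mmusculus_gene_ensembl` (the example IDs look like mouse RefSeq records), the filter name `refseq_ncrna`, and the attribute names shown. The helper name `biomart_query` is hypothetical. Check the names against the BioMart web interface (its XML button shows the query for your current selection) before relying on them.

```shell
# Hypothetical helper: print a BioMart XML query for a comma-separated
# list of RefSeq ncRNA IDs. Dataset, filter, and attribute names are
# assumptions; verify them in the BioMart web interface.
biomart_query() {
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="1" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="mmusculus_gene_ensembl" interface="default">
    <Filter name="refseq_ncrna" value="$1"/>
    <Attribute name="refseq_ncrna"/>
    <Attribute name="chromosome_name"/>
    <Attribute name="transcript_start"/>
    <Attribute name="transcript_end"/>
    <Attribute name="strand"/>
  </Dataset>
</Query>
EOF
}

# Example submission (needs network access, so it is left commented out):
# curl -s --get 'https://www.ensembl.org/biomart/martservice' \
#      --data-urlencode "query=$(biomart_query 'NR_015516,NR_046233,NR_102721')"
```

As with the web form, this can only return rows for RefSeq IDs that Ensembl has cross-referenced to one of its own transcripts.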


Thanks a lot! It still doesn't work for ncRNAs without an Ensembl transcript ID, though.


Did you choose RefSeq ncRNA ID from the drop-down menu?


Yes, I did, and it only gives results for those with Ensembl transcript IDs.

5.5 years ago
GenoMax 148k

Here are a couple of different options using Entrez Direct.

$ esearch -db nuccore -query "NR_015516" | elink -target gene | efetch -format docsum | xtract -pattern DocumentSummary -element AssemblyAccVer,ChrAccVer,ChrStart,ChrStop
GCF_000001635.24        GCF_000001635.24        GCF_000001635.23        GCF_000001635.23        GCF_000002165.2 GCF_000001635.18        GCF_000002165.2 NC_000075.6     NC_000075.6     NW_004450259.NC_000075.6      NW_004450259.2  AC_000031.1     NT_166313.1     AC_000031.1     124423250       124423250       124423250       191494  124423250       170853  124647848       53408   124648095    124424855        124424855       194625  124424855       167722  124651402       56292   124651402

or

$ esearch -db nuccore -query "NR_015516" | elink -target gene | efetch -format ft

1. 4930526I15Rik
Official Symbol: 4930526I15Rik and Name: RIKEN cDNA 4930526I15 gene [Mus musculus (house mouse)]
Other Aliases: 1600027J15Rik
Chromosome: 9; Location: 9
Annotation: Chromosome 9 NC_000075.6 (124423251..124424856)
ID: 75135
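
If you need the coordinates in tabular form, the `Annotation:` line of this report can be parsed with standard tools. The following is a minimal sketch, assuming the report keeps the `ACCESSION (START..END)` layout shown above. The helper name `parse_annotation` is hypothetical, and the handling of a trailing `, complement` marker is an assumption about how minus-strand genes are displayed.

```shell
# Hypothetical helper: read a gene report on stdin and print
# "accession<TAB>start<TAB>end" for each "Annotation:" line that
# contains coordinates of the form ACCESSION (START..END).
parse_annotation() {
  awk '/^Annotation:/ {
    for (i = 1; i < NF; i++) {
      # The field after the accession looks like (124423251..124424856)
      # or, by assumption, (START..END, followed by "complement)".
      if ($(i + 1) ~ /^\([0-9]+\.\.[0-9]+/) {
        coords = $(i + 1)
        gsub(/[(),]/, "", coords)      # strip parentheses and commas
        split(coords, se, /\.\./)      # se[1] = start, se[2] = end
        printf "%s\t%s\t%s\n", $i, se[1], se[2]
      }
    }
  }'
}
```

Piping the second command above into `parse_annotation` would print `NC_000075.6`, `124423251`, and `124424856` separated by tabs for the report shown. A report without an `Annotation:` line produces no output.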

Thanks! I tried this, and it works for some of the ncRNAs:

$ esearch -db nuccore -query "NR_102721" | elink -target gene | efetch -format ft

1. 4930526I15Rik
Official Symbol: 4930526I15Rik and Name: RIKEN cDNA 4930526I15 gene [Mus musculus (house mouse)]
Other Aliases: 1600027J15Rik
Chromosome: 9; Location: 9
Annotation: Chromosome 9 NC_000075.6 (124423251..124424856)
ID: 75135

but it shows very little information for others:

$ esearch -db nuccore -query "NR_046233"  | elink -target gene | efetch -format ft

1. Rn45s
45S pre-ribosomal RNA [Mus musculus (house mouse)]
Chromosome: 17
ID: 100861531

I guess the Gene record simply doesn't include chromosome location information for some of the ncRNAs.
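
To see which IDs in a longer list fall into each group, the same pipeline can be wrapped in a loop. This is a rough sketch, assuming Entrez Direct is installed. The function names `fetch_report`, `classify_report`, and `check_ids`, and the labels `located` and `no_coordinates`, are made up for illustration.

```shell
# The pipeline used above, wrapped in a function
# (requires Entrez Direct on PATH and network access).
fetch_report() {
  esearch -db nuccore -query "$1" | elink -target gene | efetch -format ft
}

# Hypothetical helper: read one report on stdin and print the ID plus
# whether the report has an "Annotation:" line with START..END coordinates.
classify_report() {
  if grep -Eq '^Annotation:.*\([0-9]+\.\.[0-9]+'; then
    printf '%s\tlocated\n' "$1"
  else
    printf '%s\tno_coordinates\n' "$1"
  fi
}

# Run the lookup for every ID given as an argument.
check_ids() {
  for id in "$@"; do
    fetch_report "$id" | classify_report "$id"
  done
}
```

Calling `check_ids NR_046233 NR_015516 NR_102721` would then list each ID with one of the two labels, so the IDs without coordinates can be followed up separately.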
