Hello,
I'm having trouble with the IDs in the miRBase context file. In particular, I have a list of around 300 miRNAs with miRBase IDs (e.g. hsa-mir-100). All I want to know is their genomic location: 3'UTR, intron, exon or 5'UTR. To do so, I downloaded the file miRNA_context from the miRBase FTP site. This is what it looks like:
82087 ENSACAT00000008610 + exon 11 HGNC_trans_name
82088 ENSACAT00000013598 + 3UTR 11 miRBase_trans_name
82089 ENSACAT00000003614 + exon 11 HGNC_trans_name
82092 ENSACAT00000019163 + exon 1 miRBase_trans_name
82094 ENSACAT00000020806 + exon 1 miRBase_trans_name
82096 ENSACAT00000004565 + intron 58 HGNC_trans_name
And here is a brief explanation of the file from the miRBase FAQ: "... Once open, you will see that there are numerous columns corresponding to (left to right): miRNA ID, transcript ID, + or - (according whether the sequence lies on the sense or antisense strand, respectively, of a stem-loop structure), where the miRNA originates (i.e. intron, exon, 5' UTR or 3'UTR), the number of the exon or intron from which the miRNA originates, the transcript source, and the transcript name."
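For anyone following along, here is a minimal Python sketch of reading the table according to the column order in the FAQ and grouping the miRNA IDs by where they originate. The names `ContextRow`, `parse_context` and `mirnas_by_region` are made up for illustration, and the code assumes whitespace-separated columns with at least the six fields visible in the excerpt above (the seventh column named in the FAQ does not appear in these rows, so it is ignored).

```python
from collections import defaultdict
from typing import NamedTuple


class ContextRow(NamedTuple):
    """One row of the context file (field names are illustrative)."""
    mirna_id: str       # first column, as shown in the excerpt
    transcript_id: str  # overlapping transcript
    strand: str         # "+" or "-"
    region: str         # exon, intron, 3UTR, ...
    number: str         # exon/intron number
    source: str         # transcript source


def parse_context(lines):
    """Parse whitespace-separated rows, skipping short or empty lines."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue
        rows.append(ContextRow(*fields[:6]))
    return rows


def mirnas_by_region(rows):
    """Map each region type to the set of miRNA IDs annotated with it."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row.region].add(row.mirna_id)
    return dict(grouped)


# The six rows quoted in the question.
sample = """\
82087 ENSACAT00000008610 + exon 11 HGNC_trans_name
82088 ENSACAT00000013598 + 3UTR 11 miRBase_trans_name
82089 ENSACAT00000003614 + exon 11 HGNC_trans_name
82092 ENSACAT00000019163 + exon 1 miRBase_trans_name
82094 ENSACAT00000020806 + exon 1 miRBase_trans_name
82096 ENSACAT00000004565 + intron 58 HGNC_trans_name
"""

rows = parse_context(sample.splitlines())
regions = mirnas_by_region(rows)
print(sorted(regions["intron"]))
```

Note that the first column in these rows is numeric rather than a name like hsa-mir-100, so some mapping between the two kinds of ID is presumably needed before filtering on a list of miRBase names.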
Next, I converted my miRBase IDs to "Transcript stable ID" values using BioMart. But when I try to intersect the two lists on the transcript IDs, I find only 3 in common.
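One quick way to see why two ID lists barely overlap is to compare the letter prefixes of the IDs on each side. The sketch below is only a diagnostic under stated assumptions: `id_prefix` and `compare_id_sets` are hypothetical helper names, and the two `ENST...` values are placeholders standing in for a human BioMart export, not real results. The rows quoted above start with ENSACAT, whereas human Ensembl transcript IDs start with ENST, so a prefix tally would show if the two lists come from different sets of transcripts.

```python
import re
from collections import Counter


def id_prefix(transcript_id):
    """Return the leading run of letters, e.g. 'ENST' or 'ENSACAT'."""
    match = re.match(r"[A-Za-z]+", transcript_id)
    return match.group(0) if match else ""


def compare_id_sets(context_ids, biomart_ids):
    """Report the shared IDs and the prefix counts on each side."""
    context_ids = set(context_ids)
    biomart_ids = set(biomart_ids)
    return {
        "shared": context_ids & biomart_ids,
        "context_prefixes": Counter(id_prefix(t) for t in context_ids),
        "biomart_prefixes": Counter(id_prefix(t) for t in biomart_ids),
    }


# Transcript IDs taken from the first rows of the excerpt in the question.
context_ids = [
    "ENSACAT00000008610",
    "ENSACAT00000013598",
    "ENSACAT00000003614",
]
# Placeholder values only, standing in for a BioMart export.
biomart_ids = ["ENST00000000001", "ENST00000000002"]

report = compare_id_sets(context_ids, biomart_ids)
print(report["shared"])
print(report["context_prefixes"])
print(report["biomart_prefixes"])
```

If the prefixes differ between the two lists, the small intersection is expected and the join needs to go through a different key.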
Does anyone have an idea what the problem could be?
Thank you in advance
Thank you Emily, I thought these transcript IDs from miRBase were the transcript IDs of the host genes... what you explained makes sense! Thank you again
Actually, now that I think more about it, these transcript IDs from miRBase have to be the IDs of the miRNAs... In fact, the FAQ I was referring to in the question was: "How can I get a list of intronic miRNAs?" http://www.mirbase.org/help/FAQs.shtml#How%20can%20I%20get%20a%20list%20of%20intronic%20miRNAs?
And they suggest downloading the miRNA_context file and then querying the table to identify all the miRNAs that originate from introns. Did I understand that correctly? Sorry for the confusion, I hope this is clear.