How to find all miRNA locations on GRCh37?
10.2 years ago
Aurelie MLB ▴ 360

Hello,

I am trying to get the locations of miRNAs on the human genome GRCh37 from UCSC.

I tried to use Bioconductor to do this. According to the documentation, if I understand correctly, you do:

library("TxDb.Hsapiens.UCSC.hg19.knownGene")
library(mirbase.db)
microRNAs(TxDb.Hsapiens.UCSC.hg19.knownGene)

But I get an empty GRanges object:

GRanges with 0 ranges and 1 metadata column:
   seqnames    ranges strand |    mirna_id
      <Rle> <IRanges>  <Rle> | <character>
  ---
  seqlengths:
             chr1      chr2 ... chrUn_gl000249
        249250621 243199373 ...          38502

I investigated and found that:

  • for TxDb.Hsapiens.UCSC.hg19.knownGene, the metadata says: "miRBase build ID: GRCh37"
  • for mirbase.db, supportedMiRBaseBuildValues() gives me: "Homo sapiens GRCh37.p5"

I suspect that the two build labels (GRCh37 vs. GRCh37.p5) are incompatible, and I do not know how to make this work. Do you have any ideas?

If not, do you know where I could get the locations of the microRNAs on GRCh37?

Many thanks!

miRNA Bioconductor genome • 6.3k views
10.2 years ago

You can take them from miRBase. The current release uses GRCh38, but the previous releases on GRCh37 are still available in both GFF2 and GFF3 format (the GFF3 file also contains rows for the mature sequences).

ftp://mirbase.org/pub/mirbase/20/genomes/hsa.gff2

##gff-version 2
##date 2013-05-24
#
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs                 miRBase v20
# genome-build-id           GRCh37.p5
# genome-build-accession    NCBI_Assembly:GCA_000001405.6
#
chr1 . miRNA 17369 17436 . - . ACC="MI0022705"; ID="hsa-mir-6859-1";
chr1 . miRNA 30366 30503 . + . ACC="MI0006363"; ID="hsa-mir-1302-2";
chr1 . miRNA 567705 567793 . - . ACC="MI0022558"; ID="hsa-mir-6723";
chr1 . miRNA 1102484 1102578 . + . ACC="MI0000342"; ID="hsa-mir-200b";

The GFF3 file contains both the hairpin ("gene") and the mature coordinates (MIMAT IDs):

ftp://mirbase.org/pub/mirbase/20/genomes/hsa.gff3

##gff-version 3
##date 2013-10-1
#
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs:               miRBase v20
# genome-build-id:         GRCh37.p5
# genome-build-accession:  NCBI_Assembly:GCA_000001405.6
#
# Hairpin precursor sequences have type "miRNA_primary_transcript". 
# Note, these sequences do not represent the full primary transcript, 
# rather a predicted stem-loop portion that includes the precursor 
# miRNA. Mature sequences have type "miRNA".
#
chr1    .   miRNA_primary_transcript    17369   17436   .   -   .   ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1
chr1    .   miRNA   17409   17431   .   -   .   ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705
chr1    .   miRNA   17369   17391   .   -   .   ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705
chr1    .   miRNA_primary_transcript    30366   30503   .   +   .   ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2
chr1    .   miRNA   30438   30458   .   +   .   ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363
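
If you want the coordinates in a script rather than a file, the GFF3 rows above can be read with a few lines of code. This is only a minimal sketch in Python, assuming the file is tab-separated with nine columns as in the GFF3 layout; `parse_gff3` and the dictionary keys are names made up for illustration, and the sample text is just the five rows quoted above.

```python
import io

# The five feature rows quoted above from miRBase v20 hsa.gff3 (GRCh37.p5),
# written out with explicit tabs so the example is self-contained.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\t.\tmiRNA_primary_transcript\t17369\t17436\t.\t-\t.\t"
    "ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1\n"
    "chr1\t.\tmiRNA\t17409\t17431\t.\t-\t.\t"
    "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705\n"
    "chr1\t.\tmiRNA\t17369\t17391\t.\t-\t.\t"
    "ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705\n"
    "chr1\t.\tmiRNA_primary_transcript\t30366\t30503\t.\t+\t.\t"
    "ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2\n"
    "chr1\t.\tmiRNA\t30438\t30458\t.\t+\t.\t"
    "ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363\n"
)


def parse_gff3(handle):
    """Yield one dict per feature row; comment and malformed rows are skipped."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        chrom, _source, ftype, start, end, _score, strand, _phase, attrs = cols
        # Column 9 is a ';'-separated list of key=value pairs.
        attributes = dict(
            field.split("=", 1) for field in attrs.split(";") if "=" in field
        )
        yield {
            "chrom": chrom,
            "type": ftype,
            "start": int(start),  # GFF coordinates are 1-based, inclusive
            "end": int(end),
            "strand": strand,
            "id": attributes.get("ID"),
            "name": attributes.get("Name"),
            "derives_from": attributes.get("Derives_from"),  # None for hairpins
        }


records = list(parse_gff3(io.StringIO(SAMPLE_GFF3)))
hairpins = [r for r in records if r["type"] == "miRNA_primary_transcript"]
matures = [r for r in records if r["type"] == "miRNA"]

for r in hairpins:
    print(r["name"], r["chrom"], r["start"], r["end"], r["strand"])
```

For the real file you would pass an open file handle instead of the sample string, e.g. `with open("hsa.gff3") as fh: records = list(parse_gff3(fh))`. The `derives_from` field links each mature sequence back to its hairpin accession.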

You can also obtain them programmatically with BioMart from the stable GRCh37 Ensembl site.

XML query:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >

    <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
        <Filter name = "biotype" value = "miRNA"/>
        <Attribute name = "ensembl_gene_id" />
        <Attribute name = "ensembl_transcript_id" />
        <Attribute name = "external_gene_id" />
        <Attribute name = "external_gene_db" />
        <Attribute name = "chromosome_name" />
        <Attribute name = "start_position" />
        <Attribute name = "end_position" />
        <Attribute name = "strand" />
        <Attribute name = "gene_biotype" />
        <Attribute name = "transcript_biotype" />
    </Dataset>
</Query>

Save it as biomart_mirna.xml and retrieve the results with:

curl --data-urlencode query@biomart_mirna.xml http://grch37.ensembl.org/biomart/martservice/results > mirna_genes.tsv

Note that the Ensembl results differ slightly from the miRBase files.
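
The curl call can also be made from a script. The following is a rough sketch in Python using only the standard library, assuming the XML query was saved as biomart_mirna.xml as described above; `build_request` and `fetch_mirna_table` are hypothetical helper names, and the URL is the one from the curl command. It posts the query as a form field named `query`, which is what `--data-urlencode query@file` does.

```python
import urllib.parse
import urllib.request

# Endpoint copied from the curl command above.
MART_URL = "http://grch37.ensembl.org/biomart/martservice/results"


def build_request(xml_query, url=MART_URL):
    """Build a form-encoded POST carrying the XML in a field named 'query'."""
    body = urllib.parse.urlencode({"query": xml_query}).encode("ascii")
    # Supplying data makes urllib send a POST instead of a GET.
    return urllib.request.Request(url, data=body)


def fetch_mirna_table(xml_path="biomart_mirna.xml", timeout=60):
    """Send the saved query and return the TSV response as a list of rows.

    Each row is a list of strings in the order of the <Attribute> elements
    in the XML; there is no header row because the query sets header = "0".
    """
    with open(xml_path, encoding="utf-8") as handle:
        request = build_request(handle.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [line.split("\t") for line in text.splitlines() if line]
```

Calling `rows = fetch_mirna_table()` needs network access, so it is not run here. Whether the attribute names in the query are still accepted depends on the BioMart release behind that URL.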

Hi Pablo. Thanks a lot for this!
