Retrieving structure information from miRBase
1
0
Entering edit mode
5 weeks ago
Coleman • 0

I would like to retrieve structure information from miRBase. For example, the miRBase page for the entry "hsa-mir-200c" (https://mirbase.org/hairpin/MI0000650) shows the structure both in dot-bracket notation:

cccuCGUCUUACCCAGCAGUGUUUGGgugcgguugggagucucUAAUACUGCCGGGUAAUGAUGGAgg
.(((((((((((((.((((((((.(((.((........))))).)))))))).)))))).))).))))

And as a more graphical representation:

c    -   -      A        U   u  ggu 
 ccuC GUC UUACCC GCAGUGUU GGg gc   u
 |||| ||| |||||| |||||||| ||| ||    
 ggAG UAG AAUGGG CGUCAUAA cuc ug   g
-    G   U      C        U   -  agg

Q1: I would like to retrieve the structure in dot-bracket notation, but I cannot find it in the downloadable database file "miRNA.dat". How can I retrieve this structure information?

Q2: I would also like to know whether the structure shown on the web page comes from a wet-lab experiment or from a computational prediction tool such as RNAfold.

Thanks for your attention.

4 weeks ago

Look at the source code of the HTML page https://mirbase.org/hairpin/MI0000650. It contains this note:

    How is the hairpin structure generated?
    We have generated a dot-bracket structure for each sequence using RNAfold.
    Unambiguous secondary structure.
    Parsed and ASCII art drawn.
    Want the script? Email us or visit Downloads or something.

So, since there is nothing in the Downloads section, email them :-)
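
In the meantime, since miRNA.dat carries the hairpin sequences but apparently not the structures, one option is to pull the sequences out and fold them locally. Here is a minimal sketch of the extraction step. It assumes miRNA.dat follows the usual EMBL-style flat-file layout (an `ID` line, an `AC` line, an `SQ` header followed by indented sequence lines with a running base count, and `//` closing each record). The function name `iter_hairpins` is made up for illustration.

```python
def iter_hairpins(lines):
    """Yield (name, accession, sequence) for each record in EMBL-style lines.

    Assumed layout: 'ID' and 'AC' lines, an 'SQ' header, indented sequence
    lines ending in a running base count, and '//' as the record terminator.
    """
    name = accession = None
    seq_parts = []
    in_seq = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ID"):
            name = line.split()[1]
        elif line.startswith("AC") and accession is None:
            accession = line.split()[1].rstrip(";")
        elif line.startswith("SQ"):
            in_seq = True
        elif line.startswith("//"):
            if name is not None:
                yield name, accession, "".join(seq_parts)
            name = accession = None
            seq_parts = []
            in_seq = False
        elif in_seq:
            # Keep the base blocks, drop the trailing numeric position counter.
            seq_parts.extend(tok for tok in line.split() if tok.isalpha())
```

Typical use would be `with open("miRNA.dat") as fh: records = list(iter_hairpins(fh))`, after which each sequence can be passed to a folding tool.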


Thanks for your comment. I must say they put that message somewhere very hard to find. Also, I think the hairpin drawing they generated may be wrong. For example, for hsa-let-7a-1 (https://mirbase.org/hairpin/MI0000060), the sequence and dot-bracket structure are:

ugggaUGAGGUAGUAGGUUGUAUAGUUuuagggucacacccaccacugggagauaaCUAUACAAUCUACUGUCUUUCcua
(((((.(((((((((((((((((((((.....(((...((((....)))).)))))))))))))))))))))))))))))

While the drawn structure is:

     U                     uuagg   aca    c 
uggga GAGGUAGUAGGUUGUAUAGUU     guc   ccca c
||||| |||||||||||||||||||||     |||   ||||  
aucCU UUCUGUCAUCUAACAUAUCaa     uag   gggu a
     -                     -----   --a    c

The third loop counting from the left seems to be wrong. In the dot-bracket notation, aca corresponds to "..." (unpaired). In the diagram, however, aca sits opposite "--a", whereas I would expect "---".
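
One way to check a drawing against the dot-bracket string is to compute, for every base, which position it pairs with (if any). The snippet below is a minimal sketch using the standard stack-based bracket matching. `pair_table` and `annotate` are names chosen here for illustration, not part of any miRBase or ViennaRNA tool.

```python
def pair_table(structure):
    """Map each paired 0-based index to its partner in a dot-bracket string."""
    stack = []
    pairs = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at index {i}")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    return pairs


def annotate(sequence, structure):
    """Return (1-based position, base, partner position or None) per base."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure differ in length")
    pairs = pair_table(structure)
    return [
        (i + 1, base, pairs[i] + 1 if i in pairs else None)
        for i, base in enumerate(sequence)
    ]
```

Running `annotate` on the hsa-let-7a-1 sequence and structure above lists the unpaired bases directly. In that dot-bracket string, the single `.` between `))))` and `)))` falls on an `a`, which may be what the lone `a` in the bottom row of the drawing represents.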

I also used the ViennaRNA package (https://github.com/ViennaRNA/ViennaRNA/), which has Python bindings, to compute the dot-bracket structure for the sequence.
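
For reference, a minimal sketch of that step with the ViennaRNA Python bindings looks roughly like this. It assumes the `RNA` module is installed and uses `RNA.fold` with default parameters, which may not match whatever settings miRBase used, so the result is not guaranteed to be identical to the structure on the website. The wrapper `fold_hairpin` and its optional `fold` argument are illustrative only.

```python
def fold_hairpin(sequence, fold=None):
    """Return (dot_bracket, mfe) for an RNA sequence.

    `fold` is an optional callable taking a sequence and returning
    (structure, energy); it defaults to ViennaRNA's RNA.fold.
    """
    if fold is None:
        import RNA  # ViennaRNA Python bindings, assumed to be installed
        fold = RNA.fold
    # miRBase mixes upper and lower case; normalise and use U instead of T.
    rna = sequence.upper().replace("T", "U")
    structure, mfe = fold(rna)
    return structure, mfe
```

Calling `fold_hairpin(seq)` on a hairpin sequence returns the predicted minimum free energy structure in dot-bracket notation together with its energy.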
