Need Script For: Randomization Test For Predicted miRNAs
10.9 years ago
biolab ★ 1.4k

Hi everyone,

I have a file with many predicted miRNAs. I need to perform a randomization test to identify which of these miRNAs are highly probable. The test randomizes each predicted miRNA 1000 times and calculates the MFE (minimum free energy) of each randomized sequence, which can easily be done with RNAfold. My current problem is how to generate 1000 randomized sequences for each miRNA. I give an example below.

Predicted miRNAs file

>miR1
augcgugaccguaugcuac
>miR2
uuuggugcguagucguacg
   ............
>miR100
auaugagucguacguacgu

Randomized sequences file

>miR1_1
ugcggaccguaugcuacau
>miR1_2
ugcggaccuugcuacauga
...........
>miR1_1000
ugcggaccguaugcuacua
.............
............
>miR100_1000
auaugagucgacguacu

Could anyone familiar with Perl help solve this problem? I can do the next steps, including calculating the MFE, myself. But I believe someone who knows RNAfold and miRNA prediction well could produce a pipeline for this work. Here is a link to a Nucleic Acids Research reference: http://nar.oxfordjournals.org/content/37/suppl_1/D111.full . Searching for RNAfold in the methods section will take you to the authors' method. Thank you in advance!
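
For the comparison step of the test described above, a minimal sketch follows. It assumes the MFE values have already been computed (for example with RNAfold) and uses the (R + 1) / (N + 1) form of the empirical p-value, which is one common Monte Carlo convention and not necessarily the one used in the cited paper. The helper name `empirical_p` and all numbers are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Empirical p-value for one miRNA: the share of shuffled sequences whose MFE
# is at least as low (as stable) as the native one. The +1 terms are an
# assumed convention, not taken from the cited paper.
sub empirical_p {
    my ($native_mfe, @shuffled_mfe) = @_;
    my $as_low = grep { $_ <= $native_mfe } @shuffled_mfe; # count in scalar context
    return ($as_low + 1) / (@shuffled_mfe + 1);
}

# Made-up MFE values (kcal/mol), purely for illustration.
my $native   = -32.5;
my @shuffled = (-20.1, -35.0, -18.7, -25.3, -22.9, -19.4, -27.8, -21.0, -24.6);
printf "p = %.2f\n", empirical_p($native, @shuffled);
```

A small p-value means that few shuffled sequences fold as stably as the original, which is the sense in which a predicted miRNA would be called "highly probable" here.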


Instead of looking for a script to generate random sequences, maybe you can use "off the shelf" tools to generate random sequences from your input sequences. Take a look at the biosquid package, and especially shuffle. (These are Debian/Ubuntu packages; I am not sure whether there is an alternative for other distros or Windows, and I have not tried it myself.)


Yes; I'd use EMBOSS shuffleseq, which can easily be run from a (Bio)Perl (or other) script if required.

10.9 years ago

Just a quick piece of advice: you should use the pre-miRNA sequence (the typical hairpin-like structure), not the mature miRNA, to fold the secondary structure. You can use Randfold ( http://bioinformatics.oxfordjournals.org/content/20/17/2911 ) to do that.

Download Randfold: http://bioinformatics.psb.ugent.be/supplementary_data/erbon/nov2003/


Thanks a lot for the information about Randfold.

10.9 years ago
JC 13k
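#!/usr/bin/perl

use strict;
use warnings;
use List::Util 'shuffle';

my $rep = 1000; # permutations per miRNA

$/ = "\n>"; # read one FASTA record at a time
while (<>) {
    s/>//g; # drop the record separators
    my ($id, @lines) = split (/\n/, $_);
    next unless defined $id; # skip empty records
    my $seq  = join "", @lines; # allow sequences wrapped over several lines
    my @seq  = split (//, $seq);
    my %seen = ($seq => 1); # never print the original miRNA
    my ($n, $tries) = (0, 0);
    # retry on duplicates so that $rep sequences are printed where possible;
    # the cap stops short or low-complexity sequences from looping forever
    while ($n < $rep and $tries++ < 100 * $rep) {
        my $new_seq = join "", shuffle(@seq);
        next if ($seen{$new_seq}++); # skip sequences already generated
        $n++;
        print ">$id\_$n\n$new_seq\n";
    }
}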

Save it as a Perl script (for example randMiRNA.pl) and run it as: perl randMiRNA.pl < mirna.fasta > random_mirnas.fasta


A more realistic null model might be to take the random sequence from a random spot in the genome, though it's not clear what the OP intends.
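
As a rough sketch of that alternative null model, the snippet below draws a window of a given length from a uniformly random position in a genome string. The helper name `random_window` and the toy "genome" are hypothetical; a real run would read the genome from a FASTA file and would probably use the pre-miRNA length.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a random window of $len bases from $genome (hypothetical helper).
sub random_window {
    my ($genome, $len) = @_;
    die "window longer than genome\n" if $len > length $genome;
    my $start = int rand(length($genome) - $len + 1); # 0 .. last valid start
    return substr($genome, $start, $len);
}

# Toy "genome" for illustration only.
my $genome = "augcgugaccguaugcuacuuuggugcguagucguacgauaugagucguacguacgu";
print random_window($genome, 19), "\n";
```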


I agree; I can also imagine a dimer permutation within the miRNA sites.
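
One possible reading of "dimer permutation" is to shuffle the sequence in non-overlapping two-base blocks instead of single bases. The sketch below does only that, under that assumption: it keeps the dimers at even offsets intact, but it is a simplification and not a shuffle that preserves exact dinucleotide counts. The helper name `shuffle_dimers` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util 'shuffle';

# Shuffle a sequence in non-overlapping 2-base blocks (hypothetical helper).
# For an odd-length sequence the last block is a single base.
sub shuffle_dimers {
    my ($seq) = @_;
    my @blocks = $seq =~ /(..?)/g; # "augcg" -> ("au", "gc", "g")
    return join "", shuffle(@blocks);
}

print shuffle_dimers("augcgugaccguaugcuac"), "\n";
```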


Thanks a lot for your script. brentp's suggestion is good, but I don't plan to test random sequences from intergenic and intron regions, because I am testing the secondary structure of already predicted miRNAs. Previous publications did not test random genomic sequences either. Anyway, your discussion is very helpful. JC, I have one more question: in your code you use

List::Util 'shuffle';

Would you please briefly explain shuffle? Do I need to install biosquid? Thank you very much!


List::Util is a core Perl module, so you don't need to install it. The "shuffle" function returns the elements of a list in random order. I'm guessing it uses a Fisher-Yates permutation, but you can check the algorithm in the source code: http://search.cpan.org/~pevans/Scalar-List-Utils-1.35/lib/List/Util.pm#shuffle_LIST
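
For reference, a Fisher-Yates shuffle can be sketched in a few lines of plain Perl. This only illustrates the algorithm named above; it is not claimed to be how List::Util implements shuffle, and the function name `fisher_yates_shuffle` is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle of an array reference.
sub fisher_yates_shuffle {
    my ($array) = @_;
    for (my $i = $#{$array}; $i > 0; $i--) {
        my $j = int rand($i + 1);               # random index in 0..$i
        @{$array}[$i, $j] = @{$array}[$j, $i];  # swap the two elements
    }
}

my @bases = split //, "augcgugaccguaugcuac";
fisher_yates_shuffle(\@bases);
print join("", @bases), "\n";
```

Each position is swapped with a randomly chosen position at or before it, so every permutation is equally likely as long as rand is uniform.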


Hi JC, thank you very much for your explanation. Best regards!
