Question

How To Include Biological And Technical Replicates While Using Mirdeep2 For Mirna Prediction?

2

11.0 years ago

Jordan ★ 1.3k

Hi,

I'm planning to analyze mouse RNA-Seq data. I would like to find miRNAs in it, and I realized that one of the tools for that is miRDeep2. I read through the documentation but could not find any information on how to include biological or technical replicates.

For example, to map a single FASTQ file using mapper.pl, I would use the following command:

mapper.pl -e sampleA.fastq -p mm9 -t sampleAmapping.arf

But how do I do this for biological replicates? For example, sampleA has a biological replicate called sampleB. How would I include it in the analysis?

Alternatively, I have already mapped all the samples using TopHat2, and the mappings are in BAM format. Is there a way to convert them to .arf format?

Thanks!

mirna • 6.6k views

ADD COMMENT • link updated 9.3 years ago by Biostar 20 • written 11.0 years ago by Jordan ★ 1.3k

0

Alignment Failed.

mapper.pl config.txt -d -e -h -i -j -l 18 -m -p Mus_musculus.GRCm38.74 -s reads.fa -t reads_vs_genome.arf -v -u -n

mapping reads to genome index
# reads processed: 4676885
# reads with at least one reported alignment: 8353 (0.18%)
# reads that failed to align: 4666847 (99.79%)
# reads with alignments suppressed due to -m: 1685 (0.04%)
Reported 14385 alignments to 1 output stream(s)
trimming unmapped nts in the 3' ends
Mapping statistics

#desc    total    mapped    unmapped    %mapped    %unmapped
total: 115234760    224420    115010340    0.002    0.998
CN1: 51022439    14690    51007749    0.000    1.000
CN2: 64212321    209730    64002591    0.003    0.997

config.txt:

sequence1.txt CN1
sequence2.txt CN1
sequence3.txt CL1
sequence4.txt CL1

What is a possible cause of this low alignment rate?

I have tried -k TCGTATGCCGTCTTCTGCTTGT, but the alignment doesn't improve. I'm not sure whether the adapter sequence is correct.
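One rough way to check whether a candidate adapter is actually present in the reads is to count how many reads contain its first few bases. The sketch below is only an illustration: `adapter_hits` is a hypothetical helper (not part of miRDeep2), the 12-base prefix length is an arbitrary default, and the input is assumed to be a plain 4-line-per-record FASTQ file.

```shell
# Hypothetical helper: report how many reads contain the start of an adapter.
# $1 = FASTQ file, $2 = adapter sequence, $3 = prefix length (default 12).
# Assumes exactly 4 lines per FASTQ record (sequence on line 2 of each record).
adapter_hits() {
    awk -v adapter="$2" -v n="${3:-12}" '
        BEGIN { prefix = substr(adapter, 1, n) }
        NR % 4 == 2 { total++; if (index($0, prefix) > 0) hits++ }
        END { printf "%d of %d reads contain %s\n", hits, total, prefix }
    ' "$1"
}
```

It could be called as `adapter_hits sequence1.txt TCGTATGCCGTCTTCTGCTTGT`. If only a tiny fraction of reads contain the prefix, the adapter given to -k is probably not the one used in the library.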

ADD REPLY • link updated 5.0 years ago by Ram 44k • written 10.6 years ago by bob • 0

Ram · Answer 1 · 2014-01-13

5

11.0 years ago

IV ★ 1.3k

In miRDeep you can create a config file containing multiple samples.

For instance you can make a config.txt containing the following lines (filename and 3 letter code):

wt1.fastq WT1
wt2.fastq WT2
wt3.fastq WT3
ko1.fastq KO1
ko2.fastq KO2

and you could call mapper.pl as follows (example):

mapper.pl config.txt -d -e -h -i -j -l 17 -m -p genome.index -s reads.fa -t reads_vs_genome.arf -v -o 4 -q

This is the "official" approach, and it is definitely the way to go for quantification and for miRNA differential expression (DE).
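As a rough sketch of how such a config file could be generated rather than typed by hand: the shell function below prints one line per input file, with the file name and a generated code separated by a tab (a later reply in this thread reports that mapper.pl needs a tab between the two columns). `make_config` is a hypothetical helper, not part of miRDeep2, and the `S01`, `S02`, ... codes are an assumption for illustration; any unique 3 letter code should do.

```shell
# Hypothetical helper: print a tab-separated mapper.pl config,
# one "<file><TAB><code>" line per argument.
# Codes are numbered S01, S02, ...; with more than 99 files the code
# would grow past 3 characters, so this sketch is only meant for small sets.
make_config() {
    i=1
    for f in "$@"; do
        printf '%s\tS%02d\n' "$f" "$i"
        i=$((i + 1))
    done
}
```

For example, `make_config wt1.fastq wt2.fastq wt3.fastq ko1.fastq ko2.fastq > config.txt` would write five lines, from `wt1.fastq<TAB>S01` to `ko2.fastq<TAB>S05`.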

You could also make a config.txt only with the WTs and another with the KOs. This depends on your study design actually. I'll paste below the relevant passage from the online help.

Cheers,
IV

PS: the relevant text from the miRDeep2 documentation:

The user has sequencing data from different samples e.g. different cell-types. A config.txt file has to be created in which each line designates file locations and a unique 3 letter code. For instance:

sequencing_data_sample1.fa  sd1
sequencing_data_sample2.fa  sd2
sequencing_data_sample3.fa  sd3

The user wishes then to pool these files and use the generated files reads.fa and reads_vs_genome.arf for the miRDeep2 analysis.

mapper.pl config.txt -d -c -i -j -l 18 -m -p genome_index -s reads.fa -t reads_vs_genome.arf

Since the reads_vs_genome.arf still contains the 3 letter code for each read mapped to genome the user can then later on dissect the contribution of the different samples to a predicted or known miRNA. It can also be used for example to define 'high confident' predictions if the results are filtered for miRNAs that have sequencing evidence from at least two samples.
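To illustrate how the 3 letter codes could be used afterwards, here is a minimal sketch that sums read counts per sample code from an .arf file. It assumes (this is not stated in the passage above) that the first column holds read identifiers of the form `code_index_xCOUNT`, e.g. `sd1_0_x5`, where COUNT is the number of copies of the collapsed read. `reads_per_sample` is a hypothetical helper name, not a miRDeep2 script.

```shell
# Hypothetical helper: sum read copies per 3 letter sample code in an .arf file.
# Assumes column 1 looks like "sd1_0_x5" (code, running index, xCOUNT).
# Each alignment line is counted, so a read that maps to several loci
# contributes its copy number once per reported alignment.
reads_per_sample() {
    awk '{
        n = split($1, part, "_")
        count = part[n]
        sub(/^x/, "", count)          # strip the leading "x" from "x5"
        reads[part[1]] += count
    }
    END { for (code in reads) printf "%s\t%d\n", code, reads[code] }' "$1" | sort
}
```

Running `reads_per_sample reads_vs_genome.arf` would print one line per sample code with its summed count, which gives a quick view of how much each sample contributes to the pooled mapping.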

ADD COMMENT • link updated 5.0 years ago by Ram 44k • written 11.0 years ago by IV ★ 1.3k

0

Thanks for the suggestions. Can you explain the reason for using the -b option, which is for qseq.txt format? The files given are in fastq format, aren't they?

ADD REPLY • link 11.0 years ago by Jordan ★ 1.3k

0

You're absolutely right. Sorry for that.

I pasted the code from an old file, and it was not for fastq data. I have now pasted a corrected version of the call.

ADD REPLY • link 11.0 years ago by IV ★ 1.3k

0

My config.txt file looks like this:

QC/TRIM/A1_S1_QC_TRIM.fastq S01 
QC/TRIM/A2_S2_QC_TRIM.fastq S02
QC/TRIM/A3_S3_QC_TRIM.fastq S03

I am calling mapper.pl like this:

mapper.pl config.txt -d -e -h -m -j -v -n -l 18 -p Amel4.5 -s QC/reads3.fa -t QC/rdvdb3.arf

and I get this error:

No reads file in fasta format given

I've called mapper.pl the same way but with all the files individually and it works fine.

mapper.pl QC/TRIM/A1_S1_QC_TRIM.fastq -e -h -m -j -v -n -l 18 -p Amel4.5 -s QC/A1_S1_col_read2.fa -t QC/A1_S1_rdvdb2.arf

Can't understand what is wrong. Any help appreciated.

ADD REPLY • link updated 5.0 years ago by Ram 44k • written 7.4 years ago by wcjasper • 0

0

Solution: The sequence file and the three letter code need to be tab-separated.
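A minimal sketch of how an existing space-separated config could be converted, assuming the file paths themselves contain no whitespace. `fix_config` is a hypothetical helper, not part of miRDeep2; it rewrites every two-field line with a single tab, drops trailing spaces and blank lines, and complains about anything else.

```shell
# Hypothetical helper: rewrite a mapper.pl config so that each line is
# "<file><TAB><code>". Assumes file paths contain no spaces or tabs.
# Lines that do not have exactly two fields are reported on stderr and
# make the function return a non-zero status; blank lines are skipped.
fix_config() {
    awk 'NF == 2 { printf "%s\t%s\n", $1, $2; next }
         NF != 0 { printf "line %d: expected 2 fields, got %d\n", NR, NF > "/dev/stderr"; bad = 1 }
         END { exit bad }' "$1"
}
```

For example, `fix_config config.txt > config.tab.txt` writes a cleaned copy that can then be passed to mapper.pl with -d.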

ADD REPLY • link 7.4 years ago by wcjasper • 0

0

Also, in case anyone runs into this like I did, the sequence file and the three letter code need to be tab-separated.

ADD REPLY • link 7.4 years ago by wcjasper • 0