miRNA/small RNA adapter trimming: any recommendations for adapter error rate and Phred quality cutoff?
2
2
6.1 years ago
ti&te ▴ 40

I am looking for recommendations on how to trim miRNA/small RNA sequencing data, because the trimming parameters can affect the final results. The differences are not enormous, but some miRNAs are more sensitive than others to the parameters used in the trimming step.

For trimming I use cutadapt with a minimum read length of 15, read-end quality trimming at Phred 20 (applied before adapter removal), and a maximum error rate of 0.1 for adapter detection (cutadapt -m 15 -q 20 -e 0.1). More stringent parameters (Q30, error rate 0.01) yield fewer mature miRNAs, as expected, but larger differences in differential expression (DE) in the downstream analysis.
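To make the two settings concrete, here is a minimal Python sketch of what they mean numerically. It assumes the rule described in the cutadapt documentation, where the number of tolerated errors is the error rate times the length of the matched adapter part, rounded down. The function names are hypothetical helpers, not part of cutadapt, and the adapter is simply the 3' adapter quoted in the mapper.pl answer in this thread.

```python
import math

# Example 3' adapter (the one from the mapper.pl answer in this thread).
# Substitute the adapter of your own library kit.
ADAPTER = "TCGTATGCCGTCTTCTGCTTGT"


def allowed_adapter_errors(overlap_len, error_rate):
    """Errors tolerated for an adapter match spanning overlap_len bases.

    Assumption: floor(error_rate * overlap_len), as described in the
    cutadapt documentation for the -e option.
    """
    return math.floor(overlap_len * error_rate)


def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    for e in (0.1, 0.01):
        n = allowed_adapter_errors(len(ADAPTER), e)
        print(f"-e {e}: up to {n} error(s) in a full-length adapter match")
    for q in (20, 30):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

Under that assumption, a full-length match of this 22-nt adapter tolerates up to 2 errors at -e 0.1 and none at -e 0.01, so the stricter setting effectively demands an exact adapter match. Likewise, Q20 corresponds to a 1% base-call error probability and Q30 to 0.1%.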

It is known that miRNA data need less stringent parameters because of higher sequencing noise compared with other RNA and DNA data, so I would be grateful if you could share your experience.

rna-seq sequencing next-gen • 4.0k views
1

I work with small RNA data. I think your value of -m is very short, especially if you are only looking at miRNAs. We typically use a minimum length cutoff of 18 and a maximum of 34. Otherwise, the -q and -e values that you used are reasonable. Can you share some references that discuss the need for less stringent parameters when processing miRNA data?
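As a rough illustration of the length window suggested here, the following Python sketch keeps only trimmed reads of 18 to 34 nt. The function names and the example reads are made up for illustration. In cutadapt itself the same window would, as far as I know, be set with -m 18 -M 34.

```python
def in_length_window(seq, min_len=18, max_len=34):
    """True if a trimmed read falls inside the small RNA length window."""
    return min_len <= len(seq) <= max_len


def filter_by_length(reads, min_len=18, max_len=34):
    """Keep only reads whose length lies within [min_len, max_len]."""
    return [r for r in reads if in_length_window(r, min_len, max_len)]


if __name__ == "__main__":
    # Hypothetical trimmed reads of 15, 22 and 40 nt.
    reads = ["A" * 15, "ACGT" * 5 + "AC", "G" * 40]
    for r in filter_by_length(reads):
        print(len(r), r)
```

With the original setting of -m 15 the 15-nt read would be kept as well, which is the practical difference between the two cutoffs.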

0

Thank you for your reply. The reported and recommended miRNA lengths differ between sources, and some miRNA analysis tools sort aligned reads into mature miRNAs, isomiRs and miRNA hairpins, so the shorter minimum length should not be a problem for downstream analysis. See http://bioinfo2.ugr.es/presentaciones/biocomputacion/microRNA_NGS.pdf (Prof. Hackenberg's presentation).

As you have probably seen, the FastQC report for small RNA data differs from that of experiments with longer reads. That is the result of the variable length of the small RNA library products, remaining adapter dimers, and low library diversity caused by highly expressed miRNAs in the samples.

Here are some publications that use Q20 trimming before further processing: https://www.ncbi.nlm.nih.gov/pubmed/28934507 https://www.ncbi.nlm.nih.gov/pubmed/26027894

3
6.1 years ago
ahmad mousavi ▴ 800

Hi

I have used miRDeep2 for pre- and post-processing. You could use the following command to remove adapters:

mapper.pl reads_qseq.txt -b -h -i -j -k TCGTATGCCGTCTTCTGCTTGT -l 18 -m -s reads_collapsed.fa

-a              input file is seq.txt format
-b              input file is qseq.txt format
-c              input file is fasta format
-e              input file is fastq format
-d              input file is a config file (see miRDeep2 documentation).
                options -a, -b or -c must be given with option -d.
Preprocessing/mapping:
-g              three-letter prefix for reads (by default 'seq')
-h              parse to fasta format
-i              convert rna to dna alphabet (to map against genome)
-j              remove all entries that have a sequence that contains letters
                other than a,c,g,t,u,n,A,C,G,T,U,N
-k seq          clip 3' adapter sequence
-l int          discard reads shorter than int nts
-m              collapse reads
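For readers unfamiliar with the collapse step (-m), here is a minimal Python sketch of the idea: identical read sequences are merged into one entry that carries the read count. The ID layout used below (prefix, running index, then x and the count) is only modelled loosely on miRDeep2-style identifiers and is an assumption. The function is hypothetical and is not the miRDeep2 implementation.

```python
from collections import Counter


def collapse_reads(seqs, prefix="seq"):
    """Merge identical sequences and record how often each was seen.

    Returns (read_id, sequence) pairs, most abundant first.
    The ID layout prefix_index_xCOUNT is illustrative only.
    """
    counts = Counter(seqs)
    # Sort by descending count, then by sequence for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (f"{prefix}_{i}_x{n}", seq)
        for i, (seq, n) in enumerate(ordered)
    ]


if __name__ == "__main__":
    # Hypothetical clipped reads.
    reads = ["TGAGGTAGTAGGTTGTATAGTT"] * 3 + ["TAGCTTATCAGACTGATGTTGA"]
    for read_id, seq in collapse_reads(reads):
        print(f">{read_id}\n{seq}")
```

Collapsing keeps the abundance information in the read ID, which is why the quality and adapter trimming settings still matter: reads that differ by a single untrimmed base end up as separate collapsed entries.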
0
6.1 years ago
ti&te ▴ 40

Thank you for the suggestion, but I still can't find any data on how Phred end trimming and the error rate in adapter recognition affect the results.
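For reference, this is a minimal Python sketch of how 3'-end quality trimming works according to the algorithm described in the cutadapt documentation (the same idea as in BWA): subtract the cutoff from every quality, accumulate the sums from the 3' end, and cut where the running sum is lowest. It is a sketch under that assumption, not cutadapt's actual code, and the function name is hypothetical.

```python
def quality_trim_index(quals, cutoff):
    """Return the length to keep after 3'-end quality trimming.

    quals  : list of Phred scores, 5' to 3'
    cutoff : quality threshold (the value given to -q)
    """
    running = 0
    lowest = 0
    keep = len(quals)  # default: keep the whole read
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            # Good bases now outweigh the bad 3' tail: stop scanning.
            break
        if running < lowest:
            lowest = running
            keep = i
    return keep


if __name__ == "__main__":
    # Qualities from the worked example in the cutadapt documentation.
    quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
    for cutoff in (10, 20):
        keep = quality_trim_index(quals, cutoff)
        print(f"cutoff {cutoff}: keep first {keep} bases -> {quals[:keep]}")
```

Because the sums are accumulated rather than checked base by base, a single better base inside a poor 3' tail (the 11 in the example) does not stop the trimming. A higher cutoff makes each base contribute a more negative value, so more of the read end is removed.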
