RSEM error
2.5 years ago
glady ▴ 320

Hello,

I'm trying to use RSEM with the STAR aligner to map paired-end data against a human reference. After trimming the adapter sequences with cutadapt, I align the reads with RSEM using the following command:

rsem-calculate-expression -p 6 --paired-end --phred64-quals --output-genome-bam --star --star-path /home/Downloads/tools/STAR-2.5.3a/source --estimate-rspd --append-names sample1-R1_trimmed.fastq sample1-R2_trimmed.fastq ref_genome/human_ref sample1.expression

I get this error:

Warning: Read A00902:595:HC27FDRX2:2:2226:26476:24236 is ignored due to at least one of the mates' length < seed length (= 25)!

How can I solve this issue? Is it because the adapter trimming was not done correctly? Or is there a problem with the RSEM options I used?
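
On the question about the options: one thing that can be checked independently of RSEM is whether `--phred64-quals` matches the data. Illumina software since CASAVA 1.8 writes Phred+33 qualities, and RSEM's default is `--phred33-quals`. The sketch below is only a heuristic, and the function names are made up for illustration: any quality character with an ASCII code below 59 cannot occur in a +64 encoding, so seeing one means the file is Phred+33. Whether a wrong offset explains the low alignment rate here is not established.

```python
from itertools import islice


def quality_lines(fastq_lines):
    """Return every 4th line of a FASTQ stream, starting at the quality line."""
    return islice(fastq_lines, 3, None, 4)


def guess_quality_encoding(fastq_lines, max_reads=10000):
    """Heuristic guess of the quality offset from the first max_reads reads.

    Returns "phred33" when a character below ASCII 59 is seen (impossible in
    a +64 encoding), otherwise "ambiguous" (could be +64, or +33 with only
    high-quality bases).
    """
    lowest = None
    for qual in islice(quality_lines(fastq_lines), max_reads):
        for ch in qual.rstrip("\r\n"):
            code = ord(ch)
            if lowest is None or code < lowest:
                lowest = code
    if lowest is None:
        raise ValueError("no quality characters found")
    return "phred33" if lowest < 59 else "ambiguous"
```

With a real file, the call would look like `guess_quality_encoding(open("sample1-R1_trimmed.fastq"))`. If it returns "phred33", dropping `--phred64-quals` from the command is worth trying.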

Thanks!

mapping transcriptomics RSEM genome alignment • 1.9k views

It's a warning, not an error. Since you trimmed adapters, you could also filter the reads by minimum length (cutadapt has a `--minimum-length` option for this, and other trimming tools have equivalents).
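
To illustrate what length filtering means for paired-end data, here is a minimal sketch in plain Python. It is not how cutadapt is implemented; the function names are hypothetical, and the threshold of 25 is simply the seed length quoted in the warning. A pair is dropped when either mate is too short, so the two files stay in sync.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, separator, quality) tuples from FASTQ lines."""
    stripped = (line.rstrip("\r\n") for line in lines)
    while True:
        record = tuple(islice(stripped, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def filter_pairs(mates1, mates2, min_len=25):
    """Yield only the pairs in which both mates have at least min_len bases."""
    for rec1, rec2 in zip(mates1, mates2):
        if len(rec1[1]) >= min_len and len(rec2[1]) >= min_len:
            yield rec1, rec2
```

In practice it is simpler to let the trimmer do this in the same pass, for example by adding `--minimum-length 25` to the existing paired-end cutadapt call.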


Okay, thank you for the reply. So is it okay to ignore this warning and proceed?

I ask because there is another issue: the diagnostic plot shows 98% of the reads as unaligned. But if you open the genome.bam file in FastQC, you can see that most of the reads are present.


98% unaligned is bad. The reads are there but they're not aligned so it's pretty useless.


How can I solve the issue? Is it because of the warning from RSEM, or because the adapter trimming was not done properly?


Could be several things. Try manually BLASTing some unmapped reads to see where they fall (non-coding regions, rRNA, another genome, etc.). Run FastQC after trimming to see if you have adapters left (although that is less likely).
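
A small sketch of how a handful of unmapped reads could be pulled out for BLAST. It assumes the alignments are available as SAM text (for example, the output of `samtools view` on the BAM) and that the file actually retains unmapped reads. The function name is hypothetical. Unmapped reads are identified by FLAG bit 0x4, and `/1` or `/2` is appended to tell the mates apart.

```python
def unmapped_to_fasta(sam_lines, limit=10):
    """Return up to `limit` unmapped reads from SAM text as a FASTA string."""
    entries = []
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if not flag & 0x4 or seq == "*":  # mapped, or no sequence stored
            continue
        if flag & 0x40:
            name += "/1"  # first mate of the pair
        elif flag & 0x80:
            name += "/2"  # second mate of the pair
        entries.append(f">{name}\n{seq}")
        if len(entries) >= limit:
            break
    return "\n".join(entries)
```

The resulting FASTA text can be pasted into the BLAST web form. Ten or twenty reads are usually enough to spot an obvious pattern such as rRNA or a different species.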


I don't see any adapters after trimming with cutadapt.


The issue seems to be the adapters and Illumina indices: in the FastQC report, some sequences are reported as TruSeq Adapter, Index 15.

But even after trimming these sequences with cutadapt, the mapping percentage is still poor!

What could be the problem? Should I try a different tool instead of cutadapt?
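
One way to double-check the trimming outside FastQC is to count how many reads still contain the start of the adapter. The sketch below assumes the common Illumina TruSeq adapter prefix `AGATCGGAAGAGC` and takes plain read sequences as input. The function name is hypothetical, and an exact substring match will miss adapters with sequencing errors, so treat the result as a rough lower bound.

```python
ADAPTER_PREFIX = "AGATCGGAAGAGC"  # assumed: common start of Illumina TruSeq adapters


def adapter_fraction(sequences, adapter=ADAPTER_PREFIX):
    """Return the fraction of sequences containing `adapter` as an exact substring."""
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if adapter in seq.upper():
            hits += 1
    return hits / total if total else 0.0
```

If the fraction is close to zero on the trimmed files, leftover adapters are unlikely to be the cause of the low mapping rate, and a different trimming tool would probably not change much.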
