RSEM ignores quality scores?
1
0
Entering edit mode
7.3 years ago
AW ▴ 350

Dear all,

I want to quantify gene expression for my Trinity transcriptome assembly using RSEM. I am getting confusing results depending on how I specify quality scores of the Illumina paired end reads. They are Illumina 1.5 and have phred 64 quality scores.

The RSEM manual says that phred+33 is default:

--phred33-quals Input quality scores are encoded as Phred+33. (Default: on)

--phred64-quals Input quality scores are encoded as Phred+64 (default for GA Pipeline ver. >= 1.3). (Default: off)

a. When I specify the following commands I get the following alignment stats

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity.fasta Trinity --phred64-quals --bowtie2 -p 32

19618532 reads; of these: 19618532 (100.00%) were paired; of these: 3453455 (17.60%) aligned concordantly 0 times 3684847 (18.78%) aligned concordantly exactly 1 time 12480230 (63.61%) aligned concordantly >1 times 82.40% overall alignment rate

bowtie2 -q --phred64 --sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 1000 --no-mixed --no-discordant -p 10 -k 200 -x Trinity. fasta -1 1.out_paired.fastq -2 2.out_paired.fastq | samtools view -S -b -o temp/ bam -

b. However, I get exactly the same stats when I don’t specify phred 64. How can this be if RSEM defaults to phred33?

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity. fasta Trinity --bowtie2 -p 32

19618532 reads; of these: 19618532 (100.00%) were paired; of these: 3453456 (17.60%) aligned concordantly 0 times 3684845 (18.78%) aligned concordantly exactly 1 time 12480231 (63.61%) aligned concordantly >1 times 82.40% overall alignment rate

I checked in the RSEM standard output and it is specifying phred33. So phred33 is default but how can it give the same alignment?

bowtie2 -q --phred33 --sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1 -I 1 -X 1000 --no-mixed --no-discordant -p 10 -k 200 -x Trinity. fasta -1 1.out_paired.fastq -2 2.out_paired.fastq | samtools view -S -b -o temp/ bam -

c. When I directly specify phred33 I also get the same alignment stats

rsem-calculate-expression --paired-end 1.out_paired.fastq 2.out_paired.fastq Trinity.fasta Trinity --phred33-quals --bowtie2 -p 32

19618532 reads; of these: 19618532 (100.00%) were paired; of these: 3453456 (17.60%) aligned concordantly 0 times 3684845 (18.78%) aligned concordantly exactly 1 time 12480231 (63.61%) aligned concordantly >1 times 82.40% overall alignment rate

Any help would be much appreciated.

A

RSEM phred • 2.3k views
ADD COMMENT
1
Entering edit mode
7.3 years ago
AW ▴ 350

Response from Bo Li

Hi Alison,

RSEM does not ignore quality scores. It uses the quality scores to help it allocate multi-mapping reads. But when RSEM calls Bowtie2, we set the Bowtie2 parameters in a way that ignores quality scores. In particular, we set '--mp 1,1', which implies a mismatch penalty of 1 regardless of the quality score.

Hope it helps, Bo

https://groups.google.com/forum/#!topic/rsem-users/21DFXodwWzA

ADD COMMENT

Login before adding your answer.

Traffic: 2647 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6