Can Different Read Length Samples Be Used For Expression Analysis?
12.1 years ago
pinkiii1984v ▴ 20

Hi,

I am working on RNA-seq (HiSeq PE) data, and my current focus is expression analysis (genes and isoforms). A few of the samples have poor-quality bases at the ends of the reads; the other samples are fine. If I trim the poor-quality bases (say, 10 bases), should I apply the trimming to all the samples, or is it all right to apply it only to the samples that have these poor-quality bases? And can samples with different read lengths be used together for expression analysis?

Thank you

rna-seq gene expression data • 3.0k views

Why not trim the reads from all your samples based on their quality scores? Reads of good quality will not be affected by quality-based trimming. For example, try Trimmomatic, which is available as a binary or as source.
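To illustrate why quality-based trimming can safely be applied to every sample, here is a minimal Python sketch of trimming from the 3' end. It is not Trimmomatic's implementation; the function name `trim_trailing`, the threshold of Q20, the Phred+33 encoding and the two example reads are all assumptions made for illustration.

```python
def trim_trailing(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q.

    Assumes Phred+33 encoded quality strings (an assumption; check your data).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical reads: 'I' encodes Phred 40, '#' encodes Phred 2.
good_read = ("ACGTACGTAC", "IIIIIIIIII")
poor_read = ("ACGTACGTAC", "IIIIIII###")

print(trim_trailing(*good_read))  # returned unchanged
print(trim_trailing(*poor_read))  # last three bases removed
```

A read whose bases all pass the threshold comes back at full length, so running the same trimming step over the good samples costs nothing.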

12.1 years ago

It is fine to apply read trimming only to those samples whose reads have low-quality bases at the end. Samples with different read lengths can be used for expression analysis because in the end you will be using RPKM-normalised values: the number of reads mapped to a gene is divided by the length of its coding region (in kilobases) and by the total number of reads mapped for that sample (in millions). So even if trimming the low-quality bases reduces the number of reads mapped (a trimmed read may become so short that it cannot be confidently aligned against the genome and is discarded, though this will rarely happen with long-read HiSeq PE data), the RPKM will take care of it.
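As a rough sketch of the RPKM calculation described above, the following Python snippet computes it for one gene in two samples. The function name `rpkm` and all of the counts are made-up illustrative values, not real data.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of gene per Million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: the trimmed sample lost some reads at the mapping
# step, both for this gene and across the whole library.
untrimmed = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=20_000_000)
trimmed = rpkm(read_count=450, gene_length_bp=2000, total_mapped_reads=18_000_000)

print(untrimmed, trimmed)
```

In this example the loss of reads is proportional across the library, so the two values agree. That is the situation in which the normalisation compensates for trimming.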
