Hello! I am new to RNA-seq. I quality-trimmed my FASTQ reads with the FASTX-Toolkit using a Phred score threshold of 30 (bases with a Phred score < 30 were removed). I would like to figure out the read lengths after the quality trim. Any help would be much appreciated.
This kind of trimming is very severe and likely unnecessary. I assume you are aligning to a reference genome. If you lost a lot of bases after trimming, you are probably throwing away a lot of good data.
You might want to consult www.rnaseq.wiki for a really nice description of the steps involved in processing RNA-seq data. IIRC, they cover read trimming in the appropriate section.
Thank you very much! One other question. Before trimming, the reported sequence length was 100. After trimming, the reported sequence length is 3-100. Can you make any sense of this? I am having a hard time figuring out which length it is measuring.
That means some reads were trimmed down to a length of 3 (97 bases removed), and the remaining reads now range in length from 3 to 100. See my comment on your original question.
Chris, the fourth line of FASTQ is the quality score, not sequence (but it should be trimmed to the same length as the sequence string, so results should be the same).
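If you want to check the length distribution yourself rather than rely on a summary range like 3-100, here is a minimal sketch in Python. It assumes a plain (uncompressed) FASTQ with exactly four lines per record and no wrapped sequence lines; the function names `read_length_counts` and `summarize` are made up for illustration and are not part of the FASTX-Toolkit. It measures the second line of each record (the sequence), which should have the same length as the fourth (quality) line.

```python
from collections import Counter
from itertools import islice


def read_length_counts(lines):
    """Count reads per sequence length.

    `lines` is any iterable of text lines (for example an open file).
    Assumes four lines per FASTQ record: header, sequence, '+', quality.
    """
    it = iter(lines)  # make sure islice consumes, rather than restarts, the input
    counts = Counter()
    while True:
        record = list(islice(it, 4))
        if not record:
            break  # end of input
        if len(record) < 4 or not record[0].startswith("@"):
            raise ValueError("malformed or truncated FASTQ record")
        sequence = record[1].rstrip("\r\n")  # second line holds the bases
        counts[len(sequence)] += 1
    return counts


def summarize(counts):
    """Return (shortest, longest, mean) read length from the counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads found")
    mean = sum(length * n for length, n in counts.items()) / total
    return min(counts), max(counts), mean
```

To use it, open your trimmed file (say, a hypothetical `trimmed.fastq`) and pass the file handle to `read_length_counts`, then print the counts sorted by length. That shows how many reads ended up very short (for example 3 bases) versus untouched at 100, which is more informative than the range alone.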
See this thread, which references a paper on quality-based trimming: Which Phred value to use in trimming