Hello! I am new to RNA-seq. I am trying to assemble a transcriptome de novo from 100 bp paired-end reads. I am having a hard time finding a good guide to paired-end de novo assembly. I have been taking bits and pieces from different guides, but I am starting to get confused. I would like to explain the steps I have done so far, and any advice on things I have done wrong, or still need to do, would be greatly appreciated!
I used the FASTX-Toolkit quality trimmer to trim reads wherever the Phred score was below 30.
I know I am supposed to clip adapter sequences; however, I do not think I have any on my reads. Is this possible? Is there a way to check?
General question: I have paired end reads... am I supposed to quality trim them separately? By separately, I mean quality trim the left reads and then quality trim the right? No matter how much reading I do on PE reads, I am still confused on this.
I used a left read file and its corresponding right read file to assemble transcripts with Trinity. Does anyone know how to tell whether I needed to set a strand-specific parameter, such as RF or FR, and how I would figure this out if need be?
I know this is many MANY questions. Any help would be so so so much appreciated!
Thank you,
Nikelle
It appears that you have chosen to ignore the advice from a previous question that you posted here: Generating Read Length?
You should be scanning/trimming your reads in pairs so as not to lose their order in respective files.
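If you want a rough check of whether two files are still in the same order, you can compare the read names record by record. The sketch below is only an illustration: it assumes well-formed 4-line FASTQ records, and the function names (read_names, first_mismatch) and toy reads are made up for this example.

```python
from itertools import zip_longest


def read_names(fastq_lines):
    """Yield the read name from every 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:
            # Header looks like "@name comment"; drop "@" and the comment.
            name = line.strip()[1:].split()[0]
            # Older headers mark the mate with a /1 or /2 suffix; drop it.
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def first_mismatch(r1_lines, r2_lines):
    """Return the 0-based index of the first record whose names differ,
    or None if the two inputs are in sync."""
    pairs = zip_longest(read_names(r1_lines), read_names(r2_lines))
    for idx, (n1, n2) in enumerate(pairs):
        if n1 != n2:
            return idx
    return None


# Toy data (hypothetical reads, not from any real run).
r1 = ["@read1/1", "ACGT", "+", "IIII", "@read2/1", "ACGT", "+", "IIII"]
r2 = ["@read1/2", "TTTT", "+", "IIII", "@read2/2", "GGGG", "+", "IIII"]

print(first_mismatch(r1, r2))      # files in sync
print(first_mismatch(r1, r2[4:]))  # R2 lost its first read
```

The functions accept any iterable of lines, so open file handles for your own R1 and R2 files can be passed in place of the toy lists.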
I am not sure what you mean by "in pairs". Would I need to combine each left and right read file together somehow and then trim?
Paired-end-aware trimming programs (e.g. Trimmomatic, cutadapt and BBDuk) will accept a file pair for a sample (R1/R2) and then process the two files together.
If a read gets trimmed and becomes a candidate for elimination based on criteria that you set (e.g. shorter than a certain length), then the program will also remove the corresponding read from the OTHER sequence file (even though that read may still meet the passing criteria). This keeps the reads in the two files in sync. Most aligners expect to find the two reads of a fragment at corresponding positions in the two files. If the files are not in sync, the reads may still be used in alignment, but they may be reported as aligning discordantly or not aligning at all.
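To make the idea concrete, here is a minimal sketch of pair-aware trimming. It is not how any of the programs above is implemented; it only illustrates the rule "drop both mates if either one fails". The function names, the 3' end trimming rule, the thresholds and the toy reads are all assumptions made for this example, and quality strings are assumed to be Phred+33 encoded.

```python
def quality_trim(seq, qual, min_q=30, offset=33):
    """Trim low-quality bases from the 3' end; return trimmed (seq, qual)."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_pairs(r1_records, r2_records, min_q=30, min_len=3):
    """Trim each mate, then keep a pair only if BOTH mates are long enough."""
    kept1, kept2 = [], []
    for (n1, s1, q1), (n2, s2, q2) in zip(r1_records, r2_records):
        s1, q1 = quality_trim(s1, q1, min_q)
        s2, q2 = quality_trim(s2, q2, min_q)
        if len(s1) >= min_len and len(s2) >= min_len:
            kept1.append((n1, s1, q1))
            kept2.append((n2, s2, q2))
        # Otherwise both mates are dropped, so the outputs stay in sync.
    return kept1, kept2


# Toy data: "I" encodes Phred 40 and "#" encodes Phred 2 (Phred+33).
r1 = [
    ("readA", "ACGTAC", "IIIIII"),
    ("readB", "ACGTAC", "I#####"),  # only 1 good base survives trimming
    ("readC", "GGGTTT", "IIII##"),  # trimmed to 4 bases, still long enough
]
r2 = [
    ("readA", "TTGACA", "IIIIII"),
    ("readB", "TTGACA", "IIIIII"),  # fine on its own, dropped with its mate
    ("readC", "CCCAAA", "IIIIII"),
]

out1, out2 = trim_pairs(r1, r2)
print([name for name, _, _ in out1])
print([name for name, _, _ in out2])
```

In this toy run readB is removed from both outputs even though its R2 mate passed, which is the behaviour described above. Trimming R1 and R2 in two independent runs would have kept readB only in R2 and shifted every later read out of position.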
Thank you very much, this was informative.