Dear all,
I have paired-end (PE) sequencing data (R1.fq and R2.fq). First, I removed the adapters, which left some read pairs with unequal lengths. Then I removed reads shorter than 18 nt, which left R1.fq and R2.fq with different numbers of reads. Do you think it is OK to continue with tophat2 mapping? (I am trying this, and it seems OK.)
I appreciate any of your comments. Thanks!
For normal fragment libraries, adapter-trimming should leave both reads of a pair the same length. It's important to process paired reads together with a pair-aware program; that prevents both the differing lengths after adapter-trimming and the mismatched reads between the two files.
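To illustrate what "pair-aware" means for the length filter, here is a minimal Python sketch. It is not what any particular trimming tool does internally; the function names (parse_fastq, filter_pairs) and the toy reads are made up for illustration, and it assumes plain 4-line FASTQ records with R1 and R2 still in the same order. The point is only that a pair is kept or dropped as a unit, so the two outputs always have the same number of reads.

```python
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: (header, sequence, plus line, quality string)
Record = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records (assumes unwrapped, well-formed input)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        yield header, seq, plus, qual


def filter_pairs(
    r1: Iterable[Record], r2: Iterable[Record], min_len: int = 18
) -> Tuple[List[Record], List[Record]]:
    """Keep a pair only if BOTH mates are at least min_len long.

    Assumes r1 and r2 are in the same order (mate i of r1 pairs with
    mate i of r2), i.e. the files have not yet been filtered separately.
    """
    kept1: List[Record] = []
    kept2: List[Record] = []
    for rec1, rec2 in zip(r1, r2):
        if len(rec1[1]) >= min_len and len(rec2[1]) >= min_len:
            kept1.append(rec1)
            kept2.append(rec2)
    return kept1, kept2


# Toy data (hypothetical reads): pair 2 has a short R1, pair 3 a short R2.
r1_lines = [
    "@read1/1", "A" * 20, "+", "I" * 20,
    "@read2/1", "A" * 10, "+", "I" * 10,
    "@read3/1", "A" * 25, "+", "I" * 25,
]
r2_lines = [
    "@read1/2", "C" * 22, "+", "I" * 22,
    "@read2/2", "C" * 30, "+", "I" * 30,
    "@read3/2", "C" * 5, "+", "I" * 5,
]

kept_r1, kept_r2 = filter_pairs(parse_fastq(r1_lines), parse_fastq(r2_lines))
print(len(kept_r1), len(kept_r2))
```

Filtering each file on its own would have kept read3 in R1 and read2 in R2, which is exactly how the two files end up with different read counts and out of order.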
Thank you very much, Brian. So, do you think one way is to do separate mapping for R1 and R2? Thanks!
Yes, that is one way to do it, but the reason to run paired reads is that paired mapping is more accurate, so mapping R1 and R2 separately will give you inferior results. Also, trimming the two files separately leaves some adapters untrimmed. I recommend deleting your trimmed files and following this approach, starting from the raw fastqs. Specifically:
...where adapters.fa is in bbmap/ref/
Thanks a lot Brian. It's very helpful. I will try that.