Dear Community, :)
I am using Trimmomatic for quality trimming and adapter removal on my RNA-seq reads before I continue with mapping (via GSNAP). Unfortunately, the transcript of the sequence I'm overexpressing is so highly overrepresented that reads from it map to my reference, even though I added the FASTA of my overexpression plasmid to my reference FASTA so that both are mapped simultaneously.

My idea was to remove the reads that map to my overexpression plasmid before mapping. Is it possible to do this with Trimmomatic? I thought it should be possible, since Trimmomatic also cuts adapter sequences. How would I have to adjust the command?

My plan B would be to map to the plasmid sequence first and then use the file of unmapped reads for mapping to the actual reference. But I worry that this might cause reads from other transcripts to be mismapped to my overexpression plasmid sequence as well...

Plan C is to find an option for removing the OE sequence within the GSNAP mapping process itself. I know that it has some quality filtering options as well. How would you proceed?
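
To make plan B more concrete, this is roughly the filtering step I have in mind, as a minimal Python sketch. It assumes I already have the IDs of the reads that mapped to the plasmid (for example, pulled from the alignment output of the first mapping round). The function names and the example read names are made up for illustration and are not part of Trimmomatic or GSNAP.

```python
from typing import Iterable, Iterator, Set, Tuple

# One FASTQ record: header line, sequence, separator ("+") line, quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of lines.

    Blank lines are skipped; an incomplete record at the end is dropped.
    Assumes plain 4-line FASTQ (no wrapped sequences).
    """
    stripped = (line.rstrip("\n") for line in lines)
    non_empty = (line for line in stripped if line)
    # zip over the same iterator four times groups consecutive lines.
    return zip(non_empty, non_empty, non_empty, non_empty)


def read_id(header: str) -> str:
    """Turn a header like '@read1/1 some comment' into 'read1'."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def drop_plasmid_reads(
    records: Iterable[FastqRecord], plasmid_ids: Set[str]
) -> Iterator[FastqRecord]:
    """Keep only records whose read ID is NOT in plasmid_ids."""
    for record in records:
        if read_id(record[0]) not in plasmid_ids:
            yield record


def write_fastq(records: Iterable[FastqRecord]) -> str:
    """Serialise records back to FASTQ text."""
    return "".join("\n".join(record) + "\n" for record in records)
```

I would call it as `drop_plasmid_reads(read_fastq(open_file), plasmid_ids)` and write the result to a new FASTQ file for the second mapping round. For paired-end data I would apply the same ID set to both mate files so that the pairs stay in sync.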
Thanks a lot for any kind of suggestion or help. :) :)
Have a nice weekend,
Ella
Little addition: I found the post "removing overrepresented sequences from rna-seq". There it was stated that adding overrepresented sequences to the Trimmomatic adapter FASTA file would only work for sequences at the 5' end of the reads. In my case the complete read would hit the overexpression plasmid FASTA; is that a problem? And is it a problem that the overexpression plasmid FASTA is much longer than the read length?
Thanks again :)