Hi,
I ran FastQC and it flagged a possible 50 bp Illumina Single End PCR Primer 1 sequence in my reads:
AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA (100% over 30bp)
I checked my reads and found that this 50 bp sequence sits at the 5' end of the affected reads, which account for 0.25% of all reads. (In some of these reads, a short sequence such as GCGCA, GCTCAG, AACCG, or AACAAAAGG precedes the 50 bp sequence.)
Since all my reads are 88 bp long, I do not want to keep these reads even after trimming off the 50 bp sequence.
Does anyone know of a tool that can discard reads containing this 50 bp sequence? Or does anyone have a script or another way to do this?
EDIT:
My aim is to get rid of the reads that contain the AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA sequence. Because I want to discard these reads entirely rather than trim them (they make up only 0.25% of all reads), the Fastq toolkit trimmer and similar trimming tools do not help.
I need to remove the reads containing this 50 bp noise sequence from my library before I map them with BWA.
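To make the request concrete, here is a minimal sketch of the kind of script I have in mind. It assumes a plain, uncompressed FASTQ file with exactly four lines per record and only drops reads that contain the full 50 bp sequence as an exact match; the function name and the file names in the usage line are made up for illustration.

```python
import sys
from itertools import islice

# The 50 bp sequence reported by FastQC (from the question above).
ADAPTER = "AGTTGATCCGGTCCTAGGCAGTGTAGATCTCGGTGGTCGCCGTATCATTA"


def filter_fastq(lines, adapter=ADAPTER):
    """Yield 4-line FASTQ records whose sequence line does not contain `adapter`.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        if adapter not in record[1]:
            yield record


# Only runs when called as: python script.py input.fastq output.fastq
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fin, open(sys.argv[2], "w") as fout:
        for rec in filter_fastq(fin):
            fout.writelines(rec)
```

Usage would be something like `python drop_adapter_reads.py reads.fastq reads.clean.fastq`. Since FastQC reports the match as 100% over 30 bp, some reads may carry only part of the sequence; passing a shorter string as `adapter` (for example the first 30 bases) would catch those too, at the cost of a slightly higher chance of removing genuine reads.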
Please edit your original question rather than adding 'answers'.
Cross-posted on SEQanswers: http://seqanswers.com/forums/showthread.php?t=33414