Hi Community,
I am completely new to all of this and have not been able to find a solution to my problem.
I have paired-end (PE) NGS data as R1 and R2 reads, which I merged with BBmerge.sh. I then extracted all reads that match a pattern using seqkit grep --by-seq --ignore-case --use-regexp --pattern XXXX
The pattern is in the centre of the read: [variable 35bps]-[constant XXX]-[variable 40bps].
Now I want to remove the constant region and fuse the two variables:
before: [variable 35bps]-[constant XXX]-[variable 40bps]
after: [variable 35bps]-[variable 40bps]
I have tried trimming tools, but as far as I can tell they remove everything up to and including the adaptor, which in my case is the constant region.
Is there a tool that can cut out only the constant region and keep both flanks?
Thank you for the help!
Do you mean seqkit replace? It could be used to remove the constant subsequence.
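If the merged reads are FASTQ, the quality string has to be cut at the same positions as the sequence, otherwise the two end up with different lengths. As a minimal sketch of that idea in plain Python (not a seqkit command): the function names `remove_constant` and `strip_constant_fastq` are made up for illustration, and the constant `CCGG` in the usage lines is a hypothetical stand-in for your real XXXX pattern. The constant is treated as a case-insensitive regular expression, mirroring the --ignore-case --use-regexp options used in the grep step.

```python
import re


def remove_constant(seq, qual, constant):
    """Cut the first match of `constant` out of a read and fuse the flanks.

    `constant` is interpreted as a regular expression and matched
    case-insensitively. Returns (sequence, quality) with the matched span
    removed from both, or None if the constant is not found.
    """
    match = re.search(constant, seq, flags=re.IGNORECASE)
    if match is None:
        return None
    start, end = match.span()
    return seq[:start] + seq[end:], qual[:start] + qual[end:]


def strip_constant_fastq(lines, constant):
    """Apply remove_constant to 4-line FASTQ records given as a list of lines.

    Records without the constant are dropped; a trailing incomplete record
    is ignored. Returns the output as a list of lines without newlines.
    """
    records = [line.rstrip("\n") for line in lines]
    out = []
    for i in range(0, len(records) - 3, 4):
        header, seq, plus, qual = records[i:i + 4]
        fused = remove_constant(seq, qual, constant)
        if fused is None:
            continue
        out.extend([header, fused[0], plus, fused[1]])
    return out


# Toy read: 4 bp flank + hypothetical constant "ccgg" + 5 bp flank.
example = ["@read1", "AAAAccggTTTTT", "+", "IIII####JJJJJ"]
print("\n".join(strip_constant_fastq(example, "CCGG")))
```

Only the first occurrence of the constant is removed. If the constant could also appear by chance inside a variable region, anchoring the expression on the known flank lengths (35 and 40 bases in your layout) would be a safer variant.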