Is there an efficient tool to remove substring reads or duplicate reads from an NGS data set? I know that readjoiner can remove duplicate reads, but it does not seem to work on substring reads. Thanks.
Example:

Duplicates:
read1: AGTCAT
read2: AGTCAT
In this case, only one read will be kept.

Substring:
read1: GTCA
read2: AGTCAT
In this case, read1 will be removed.
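To make the expected behavior concrete, here is a minimal Python sketch of the filtering described in the two examples. It is only a naive reference, not an efficient tool: the function name `remove_duplicate_and_substring_reads` is made up for illustration, the containment check is quadratic in the number of reads, and it assumes reads are plain strings on the same strand (reverse complements are not considered).

```python
def remove_duplicate_and_substring_reads(reads):
    """Return the reads with exact duplicates and contained (substring) reads removed.

    Hypothetical helper for illustration only; quadratic time, forward strand only.
    """
    # Drop exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(reads))

    kept = []
    # Visit longer reads first, so any read that could contain the current
    # one has already been considered. sorted() is stable, so reads of equal
    # length stay in their input order.
    for read in sorted(unique, key=len, reverse=True):
        # Keep the read only if it is not a substring of a read already kept.
        if not any(read in longer for longer in kept):
            kept.append(read)
    return kept


# The reads from the examples above: one duplicate pair and one substring.
print(remove_duplicate_and_substring_reads(["GTCA", "AGTCAT", "AGTCAT"]))
# prints ['AGTCAT']
```

On a real NGS data set this approach would be far too slow, which is why I am looking for a dedicated tool that does the same thing at scale.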
Thanks for your help. I have added examples to the question.