We are interested in a way of trimming adapters of unknown length from short reads.
The adapter is composed of a FIXED part, which is always at the beginning of the reads, and a long sequence ADAPTER, which is not necessarily present or complete.
All of the reads should be trimmed so that only "mysequence" remains.
We have evaluated cutadapt and fastx, but neither seems to include an option that handles this situation.
Do you have any idea of the best way to approach this?
Classic adapter contamination looks more like what JC describes. Your case is not handled out of the box, I believe, not even by bbduk, the Swiss Army knife of adapter trimming. However, a pragmatic solution would be an iterative approach (note that this is pseudocode):
for fq in fq_files_to_trim:
    trim fixed   < fq              > fq_wo_fixedpart
    trim adapter < fq_wo_fixedpart > clean.fq
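To make the two-pass idea concrete, here is a minimal Python sketch working on plain read strings. It is not how cutadapt or bbduk work internally. It assumes exact matching (no mismatches or indels) and that an incomplete adapter is truncated on its 5' side, so what remains in the read after FIXED is a suffix of ADAPTER. The function names and the `min_overlap` parameter are hypothetical and only serve the illustration.

```python
def trim_fixed(read: str, fixed: str) -> str:
    """Pass 1: drop the FIXED part if the read starts with it."""
    return read[len(fixed):] if read.startswith(fixed) else read


def trim_partial_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Pass 2: drop a complete or partial ADAPTER from the start of the read.

    Assumption: a partial adapter is a suffix of ADAPTER. Suffixes are tried
    from longest to shortest, and anything shorter than min_overlap is ignored
    to limit chance matches against the real sequence.
    """
    for k in range(len(adapter), min_overlap - 1, -1):
        if read.startswith(adapter[-k:]):
            return read[k:]
    return read


def trim_read(read: str, fixed: str, adapter: str, min_overlap: int = 3) -> str:
    """Run both passes in order: FIXED first, then the variable-length ADAPTER."""
    return trim_partial_adapter(trim_fixed(read, fixed), adapter, min_overlap)


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    fixed, adapter, insert = "ACGT", "TTGACCA", "GGGCCCAAAT"
    for read in (fixed + adapter + insert,       # complete adapter
                 fixed + adapter[3:] + insert,   # partial adapter
                 fixed + insert):                # no adapter
        print(trim_read(read, fixed, adapter))
```

A lower `min_overlap` removes shorter adapter remnants but also increases the risk of clipping bases from reads whose real sequence happens to start like the end of the adapter.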
I'd be thrilled to know, too. To me, the variable-length adapter between the fixed part and the sequence should pose a major challenge to out-of-the-box adapter trimming strategies, including bbduk's.
Cutadapt can be used by defining the minimum overlap. Also, your example looks more like this, doesn't it?:
What about pandaseq?
Look into bbduk.sh and this option. A guide is available here.