I'm working on a project where each read consists of a 6-mer ligated to the 5' end of the sequence of interest. These 6-mers are important, but they must be excluded when aligning the reads to which they are attached.
My current method of evaluating these experimental data consists of trimming off the 6-mers, aligning the trimmed reads, and then using the read query-name to go back and retrieve the 6-mer from the original FASTQ data.
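For concreteness, the trim-and-look-up workflow described above might be sketched roughly as follows. This is only an illustration: the function names (`parse_fastq`, `split_barcodes`) are hypothetical, and it assumes well-formed four-line FASTQ records where the query name is the header text up to the first whitespace.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

BARCODE_LEN = 6  # length of the 5' 6-mer described in the post


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        # Assumption: the query name is the header minus '@' and any description.
        name = header[1:].split()[0]
        yield name, seq, qual


def split_barcodes(lines: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return a {query name: 6-mer} lookup and the trimmed FASTQ records."""
    barcodes: Dict[str, str] = {}
    trimmed: List[str] = []
    for name, seq, qual in parse_fastq(lines):
        barcodes[name] = seq[:BARCODE_LEN]
        trimmed.append(
            f"@{name}\n{seq[BARCODE_LEN:]}\n+\n{qual[BARCODE_LEN:]}\n"
        )
    return barcodes, trimmed
```

The trimmed records go to the aligner, and the dictionary is what lets the 6-mer be recovered later from the query name in the BAM output.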
This method currently works just fine. However, it would be far more efficient if I could use an aligner with the ability to soft-clip (not trim!) the reads, such that the clipped bases from each read aren't considered during the alignment, but will show up in the BAM data with the appropriate CIGAR soft-clipping designation.
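To make the desired output concrete, here is a rough sketch of what such a record would look like if the 6-mer were put back as a soft clip after aligning the trimmed read. It is a post-processing illustration, not a feature of any aligner; `restore_soft_clip` is a hypothetical helper, and it assumes the record stores SEQ and QUAL (not `*`) and has no hard clips. Per the SAM format, soft-clipped bases do not change POS, and reads mapped to the reverse strand (FLAG 0x10) are stored reverse-complemented, so the 5' 6-mer ends up at the right-hand end of the record.

```python
import re

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def restore_soft_clip(flag, cigar, seq, qual, kmer, kmer_qual):
    """Re-attach a 5' k-mer to an aligned, trimmed read as a soft clip.

    Returns (cigar, seq, qual). POS is unaffected by soft clipping.
    Assumes no hard clips and that seq/qual are present in the record.
    """
    reverse = bool(flag & 0x10)
    n = len(kmer)
    if reverse:
        # Stored SEQ is reverse-complemented, so the read's 5' end is on the right.
        new_seq = seq + revcomp(kmer)
        new_qual = qual + kmer_qual[::-1]
    else:
        new_seq = kmer + seq
        new_qual = kmer_qual + qual
    if cigar == "*":
        # Unmapped read: no CIGAR to adjust.
        return cigar, new_seq, new_qual
    ops = [(int(length), op)
           for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    idx = -1 if reverse else 0
    if ops[idx][1] == "S":
        # Merge with a soft clip the aligner already placed at that end.
        ops[idx] = (ops[idx][0] + n, "S")
    elif reverse:
        ops.append((n, "S"))
    else:
        ops.insert(0, (n, "S"))
    new_cigar = "".join(f"{length}{op}" for length, op in ops)
    return new_cigar, new_seq, new_qual
```

For a forward-strand read aligned as `6M`, this yields `6S6M` with the 6-mer restored at the start of SEQ, which is the kind of record I'd like the aligner to produce directly.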
Does anyone know of an aligner that supports this option? I've looked through the documentation for several (bwa, bowtie, SOAP), but unless I'm misunderstanding it, none of them supports what I'm trying to do.
Very elegant solution. Thank you!