Hi Biostars,
I am trying to align my paired-end reads to my assembly with bwa mem (to be used for polishing with Racon afterwards). However, the downstream application (Racon) requires every read to have a unique ID, whereas bwa in paired-end mode requires both mates of a pair to share the same ID. So, in order to use all my reads, I treated my paired-end reads as single reads for alignment with bwa mem (by deleting the ID content after the space in the header line '@' and merging both fastq files).
Now I am wondering whether this approach was a good call. If not, what would be problematic here? How significant would the difference in alignment quality be between reads treated this way and reads aligned properly as paired-end? Could this alignment cause spurious polishing later when using Racon?
I would much appreciate your thoughts. Thanks!
Thanks, Genomax. For Racon, no two reads should have the same identifier up to the first whitespace, so Racon would accept this happily. However, BWA-mem would not accept it, because in PE mode it requires the identifiers of both mates to be the same up to the first whitespace. This is the discrepancy I am dealing with when using Racon for polishing with a SAM file generated by BWA-mem. So, to be compatible, I ran my PE reads through BWA-mem as single reads (though I am not sure how good this alignment is; maybe it's just fine). Alternatively, it would be great to know how to run BWA-mem in PE mode with identifiers that differ before the first whitespace. Thanks again.
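To make the identifier rule above concrete, here is a minimal Python sketch that lists read IDs occurring more than once when only the text up to the first whitespace is compared. It assumes plain 4-line FASTQ records; the function names `read_ids` and `duplicated_ids` are made up for illustration and are not part of Racon or BWA.

```python
from collections import Counter

def read_ids(fastq_lines):
    """Yield each record's ID: header text up to the first whitespace, without '@'."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # assumption: strict 4-line FASTQ records
            yield line[1:].split()[0]

def duplicated_ids(fastq_lines):
    """Return the sorted IDs that appear more than once."""
    counts = Counter(read_ids(fastq_lines))
    return sorted(name for name, n in counts.items() if n > 1)

# Two mates that differ only after the whitespace.
r1 = ["@read1 1:N:0:ACGT", "ACGT", "+", "IIII"]
r2 = ["@read1 2:N:0:ACGT", "TGCA", "+", "IIII"]
print(duplicated_ids(r1 + r2))  # ['read1']
```

Running a check like this on the merged file should show whether any IDs still collide before the SAM file and reads are handed to Racon.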
What addcolon= does is add a standard 1:N:0 after the first whitespace, so these reads should work for both. You could also do the following as an alternative to addcolon=. This will create old-style Illumina read headers, which should make the reads unique without the whitespace.
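For anyone who wants to see what such a header rewrite amounts to, here is a rough Python sketch of the idea. It is not the BBTools implementation, only an illustration under stated assumptions: plain 4-line FASTQ records, the description after the first whitespace is discarded, and the function name `add_slash_suffix` is hypothetical.

```python
def add_slash_suffix(fastq_lines, mate):
    """Rewrite each header to '@<id>/<mate>' (old-style Illumina naming).

    Assumption: strict 4-line FASTQ records; everything after the first
    whitespace in the header is dropped.
    """
    out = []
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:  # header line of each record
            read_id = line[1:].split()[0]
            line = f"@{read_id}/{mate}"
        out.append(line)
    return out

# One pair whose mates share the ID 'read1' before the whitespace.
r1 = ["@read1 1:N:0:ACGT", "ACGT", "+", "IIII"]
r2 = ["@read1 2:N:0:ACGT", "TGCA", "+", "IIII"]

merged = add_slash_suffix(r1, 1) + add_slash_suffix(r2, 2)
print(merged[0], merged[4])  # @read1/1 @read1/2
```

After the rewrite, the two mates carry distinct identifiers with no whitespace in the header, which is the property described above.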