Hi everyone,
I have some Illumina MiSeq data (sequenced in early 2015). The reads are paired-end 2x250 bp. I need to cut 100 bp off the end of each read, so I ran fastx_trimmer from the FASTX-Toolkit with the following command:
fastx_trimmer -t 100 -Q33 -i file.fastq -o file_trimmed.fastq
Although the files were sequenced quite recently, I had to pass the -Q33 parameter; otherwise fastx_trimmer did not work.
Now I want to map these data using bwa-mem. However, I get the following error message:
[mem_sam_pe] paired reads have different names: "M01441:126:000000000-ADL98:1:1104:10392:3758", "M01441:126:000000000-ADL98:1:1104:23060:3763"
So bwa-mem ended with an error. Does anyone know why this is happening and how to solve it?
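The error means that the records at the same position in the two files no longer carry the same read name, which can happen if a trimming step changed one file differently from the other. Here is a minimal sketch for finding the first place where the files go out of step. The function name first_name_mismatch is made up for illustration, and the sketch assumes plain 4-line FASTQ records whose names match up to the first space or "/":

```shell
# Hypothetical helper: compare read names of two FASTQ files record by record.
# Assumes uncompressed, 4-line FASTQ records.
first_name_mismatch() {
    awk -v r2="$2" '
        # Keep only the part of the header before the first space or "/".
        function name(s,    a) { split(s, a, "[ /]"); return a[1] }
        NR % 4 == 1 {
            rec = (NR + 3) / 4
            if ((getline other < r2) <= 0) {
                printf "record %d: missing in second file\n", rec
                bad = 1
                exit
            }
            if (name($0) != name(other)) {
                printf "record %d: %s vs %s\n", rec, name($0), name(other)
                bad = 1
                exit
            }
            # Skip the sequence, "+" and quality lines of the second file.
            for (i = 0; i < 3; i++) getline junk < r2
        }
        END {
            if (!bad && (getline other < r2) > 0) {
                print "second file has extra records"
                bad = 1
            }
            if (!bad) print "names in sync"
        }
    ' "$1"
}
```

Called as first_name_mismatch file_R1_trimmed.fastq file_R2_trimmed.fastq (placeholder file names), it prints either "names in sync" or the first record where the names differ.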
I have seqtk installed. Would this work to trim the last 100 bases?
You're saying to process both reads simultaneously ... but how can I do that in seqtk?
I have used the above command on both the R1 and R2 .fastq files separately. BWA-mem now finishes successfully. However, GATK is now ending with an error:
Ahh! Sorry, my mistake. I could have sworn that seqtk processed files as pairs, but according to the manual, I don't see any option for that. You can do it with BBDuk, though:
By default, BBDuk will always leave a minimum of 1bp in a read to prevent the above problem. However, you can use the flag
minlen=30
to, for example, throw away all read pairs that end up shorter than 30bp after trimming.
And can you also use it in the following situation:
I want to remove the first 4 bases & then keep the next 120?
Yes, in that case:
ftl=X
means trim the leftmost X bases;
ftr2=X
means trim the rightmost X bases; and
ftr=X
means trim all the rightmost bases after position X, 0-based.
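To make the 0-based arithmetic concrete for the "remove the first 4 bases, then keep the next 120" case, here is a small sketch. It assumes that ftr counts positions in the original read (before ftl is applied). The file names are placeholders, and the variable names skip and keep are only for illustration:

```shell
# Values from the question: drop the first 4 bases, then keep the next 120.
skip=4
keep=120

# ftl=4 removes 0-based positions 0..3.
ftl=$skip
# The last base to keep sits at 0-based position 4 + 120 - 1 = 123,
# so everything after position 123 is trimmed.
ftr=$((skip + keep - 1))

# Placeholder file names; in1/in2 and out1/out2 keep the pair together.
cmd="bbduk.sh in1=R1.fastq in2=R2.fastq out1=R1_trimmed.fastq out2=R2_trimmed.fastq ftl=$ftl ftr=$ftr"
echo "$cmd"
```

Under that assumption, the kept stretch is positions 4 to 123 inclusive, which is 120 bases.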