I mapped reads with
bwa mem -M -t 40 allCombinedFinalSet.fa Seq.R1.fastq Seq.R2.fastq > aln.sam
Extracted the mapped reads
samtools view -f 0x2 -b aln.sam > output.bam
Extracted the fastq
bamToFastq -i output.bam -fq R1.fq -fq2 R2.fq
grep @HISEQ578:1035:HJ2KCBCXX:1:1104:14672:39678/1 R1.fq
@HISEQ578:1035:HJ2KCBCXX:1:1104:14672:39678/1
@HISEQ578:1035:HJ2KCBCXX:1:1104:14672:39678/1
@HISEQ578:1035:HJ2KCBCXX:1:1104:14672:39678/1
I notice that the same read appears several times.
I think this is because the read was mapped more than once by BWA-MEM.
I tried FastUniq, but it does not remove the duplicated reads.
Can you please help me remove the duplicated reads from the FASTQ files?
What exactly are you trying to do?
I am trying to map all of the reads against a "filtered" genome and extract the mapped reads as FASTQ for re-assembly.
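If the repeated names come from extra alignment records for the same read (for example secondary or supplementary alignments; that is an assumption, not something confirmed in this thread), one possible workaround is to keep only the first FASTQ record for each read name. A minimal sketch, assuming strict 4-line FASTQ records with no wrapped sequence lines; the function name dedup_fastq and the output file names are made up for illustration:

```shell
# Keep only the first FASTQ record seen for each read name.
# Assumption: every record is exactly 4 lines (header, sequence, +, quality).
dedup_fastq() {
    # On each header line (lines 1, 5, 9, ...) decide whether this read name
    # is new; the decision then applies to all 4 lines of the record.
    awk 'NR % 4 == 1 { keep = !seen[$1]++ } keep'
}

# Hypothetical usage on the files from the question:
#   dedup_fastq < R1.fq > R1.dedup.fq
#   dedup_fastq < R2.fq > R2.dedup.fq
```

This only stays in sync between R1 and R2 if both files contain the duplicates in the same order, so it is worth checking that the two outputs have the same number of records. It may also be possible to avoid the duplicates upstream by excluding secondary (0x100) and supplementary (0x800) alignments when filtering, e.g. by adding -F 0x900 to the samtools view command.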