Hello all. I am trying to do a paired-end analysis. I aligned the two FASTQ files with hisat2:
hisat2 -p 10 -x '..index_hg19/indexed' -1 R1_001.fastq.gz -2 R2_001.fastq.gz -S hisat2output.sam
Then:
samtools view -@ 10 -bS hisat2output.sam > hisat2.bam
samtools sort -@ 12 -n hisat2.bam > testSort.bam
samtools fixmate -m -@ 12 testSort.bam testSort_fixmate.bam
Then I try to remove duplicates:
samtools markdup -r -@ 12 testSort_fixmate.bam > testSort_fixmate-markdup.bam
but it returns an error:
ERROR: queryname sorted, must be sorted by coordinate.
When I don't use the -n option in samtools sort, samtools markdup returns a different error:
ERROR: Coordinate sorted, require grouped/sorted by queryname.
I can't find where the problem is. Any help?
I formatted your post (again) with code / quote markdown, to improve readability. Please use the formatting buttons in the future.
What does
samtools view -H testSort.bam | head -5
show?
Likely not the solution to your problem, but you can make these commands a lot shorter and avoid intermediate files by piping directly into samtools sort:
hisat2 -p 10 -x '..index_hg19/indexed' -1 R1_001.fastq.gz -2 R2_001.fastq.gz | samtools sort -n -o testSort.bam
This requires a reasonably recent version of samtools.
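On the header check asked about above: the sort order that samtools records for a BAM file lives in the SO: field of the @HD header line (queryname, coordinate, unsorted or unknown), and that is the value to look for in the output. A minimal sketch of pulling it out, assuming the header has an @HD line; the function name sort_order is made up for illustration:

```shell
# Hypothetical helper: read SAM header text on stdin and print the value
# of the SO: field from the @HD line. Prints nothing if there is no @HD
# line or no SO: field.
sort_order() {
    awk -F'\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/) { print substr($i, 4); exit }
            }
        }'
}

# Usage (needs samtools on the PATH):
#   samtools view -H testSort.bam | sort_order
```

A file written by samtools sort -n should report queryname here, and one written by samtools sort without -n should report coordinate.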
Yes, I do use them as you suggested, but right now I am trying to figure out where the problem is. That's why I was running the commands one by one.
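For what it's worth, the two error messages seem to point at two different steps. fixmate wants its input grouped by read name, while markdup wants its input sorted by coordinate, so the "require grouped/sorted by queryname" message most likely comes from fixmate rather than markdup. The order described in the samtools markdup documentation is: name sort, fixmate -m, coordinate sort, markdup. The commands in the question skip the second sort. Note also that markdup takes the output file as a second positional argument rather than via a redirect. A minimal sketch of that order, wrapped in a hypothetical function dedup_bam (the function name, the intermediate file suffixes and the default thread count are assumptions for illustration):

```shell
# Hypothetical helper: name sort -> fixmate -> coordinate sort -> markdup.
#   $1 = input BAM (any sort order)
#   $2 = prefix for the files this function writes
#   $3 = number of extra threads (optional, defaults to 4)
dedup_bam() {
    in_bam=$1
    prefix=$2
    threads=${3:-4}

    # 1. fixmate needs the reads of each pair next to each other
    samtools sort -n -@ "$threads" -o "$prefix.namesort.bam" "$in_bam" || return 1

    # 2. -m adds the mate score tags that markdup relies on
    samtools fixmate -m -@ "$threads" "$prefix.namesort.bam" "$prefix.fixmate.bam" || return 1

    # 3. markdup needs coordinate order, so sort again, this time without -n
    samtools sort -@ "$threads" -o "$prefix.possort.bam" "$prefix.fixmate.bam" || return 1

    # 4. -r removes duplicates instead of only flagging them
    samtools markdup -r -@ "$threads" "$prefix.possort.bam" "$prefix.markdup.bam" || return 1
}

# Usage (needs samtools on the PATH):
#   dedup_bam hisat2.bam testSort 12
```

Running the steps one at a time like this, rather than in a pipe, keeps each intermediate file around, which makes it easier to see which step rejects its input.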