Hi,
I have a BAM file whose chromosome IDs look like NC_00000*.
I sorted it using the samtools sort function.
I wanted to remove duplicates, so I'm using MarkDuplicates.jar from Picard tools to get the job done. But it gives me the following error:
Exception in thread "main" net.sf.picard.PicardException: 13_0501.sorted.bam is not coordinate sorted.
at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:273)
at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:117)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:158)
at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:101)
But I think my BAM file is sorted. This is the header of my BAM file:
@HD VN:1.0 SO:unsorted
@SQ SN:NC_000001 LN:249250621
@SQ SN:NC_000002 LN:243199373
@SQ SN:NC_000003 LN:198022430
@SQ SN:NC_000004 LN:191154276
@SQ SN:NC_000005 LN:180915260
@SQ SN:NC_000006 LN:171115067
@SQ SN:NC_000007 LN:159138663
@SQ SN:NC_000008 LN:146364022
@SQ SN:NC_000009 LN:141213431
@SQ SN:NC_000010 LN:135534747
@SQ SN:NC_000011 LN:135006516
@SQ SN:NC_000012 LN:133851895
@SQ SN:NC_000013 LN:115169878
@SQ SN:NC_000014 LN:107349540
@SQ SN:NC_000015 LN:102531392
@SQ SN:NC_000016 LN:90354753
@SQ SN:NC_000017 LN:81195210
@SQ SN:NC_000018 LN:78077248
@SQ SN:NC_000019 LN:59128983
@SQ SN:NC_000020 LN:63025520
@SQ SN:NC_000021 LN:48129895
@SQ SN:NC_000022 LN:51304566
@SQ SN:NC_000023 LN:155270560
@SQ SN:NC_000024 LN:59373566
@PG ID:0 PN:clcgenomicswb VN:7.0
The header seems to suggest that it's unsorted. That's what is bothering me.
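For anyone who wants to check this without reading the header by eye, here is a minimal sketch in Python that pulls the SO value out of the @HD line. It assumes the header has been dumped to text first (for example with samtools view -H); the function name header_sort_order and the variable pasted_header are made up for illustration and are not part of samtools or Picard.

```python
def header_sort_order(header_text):
    """Return the SO value from the @HD line of a SAM/BAM text header."""
    for line in header_text.splitlines():
        # Real headers are tab-separated; the header pasted above uses spaces,
        # so split on any whitespace to accept both.
        fields = line.split()
        if fields and fields[0] == "@HD":
            for field in fields[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    # The SAM specification treats a missing SO tag as "unknown".
    return "unknown"


# First lines of the header shown in this post.
pasted_header = """@HD VN:1.0 SO:unsorted
@SQ SN:NC_000001 LN:249250621
@SQ SN:NC_000002 LN:243199373"""

print(header_sort_order(pasted_header))  # prints: unsorted
```

A result other than "coordinate" would be consistent with the Picard error above, although this sketch only reads the header and says nothing about the actual order of the reads.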
Did you sort by read names rather than chromosomal coordinates in samtools (the -n flag)? If so, this is the problem.
No, I did not do that. I checked the lines manually in the BAM file. They all seem to be sorted by coordinates, not read names.
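The manual check described here can also be scripted. The following is a rough sketch, assuming the file has been converted to SAM text with the header included (for example with samtools view -h) and that coordinate order means: references in the order of the @SQ lines, then ascending POS, with reads that have no reference ("*") at the end. The function name is_coordinate_sorted and the sample records are invented for illustration.

```python
def is_coordinate_sorted(sam_lines):
    """Return True if alignments follow @SQ order, then ascending POS."""
    ref_index = {}       # reference name -> position in the @SQ list
    previous = (-1, 0)   # (reference index, POS) of the last alignment seen
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: remember the order of the @SQ entries.
            if fields[0] == "@SQ":
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        ref_index[field[3:]] = len(ref_index)
            continue
        rname, pos = fields[2], int(fields[3])
        # Reads with no reference ("*") are placed after all references.
        index = len(ref_index) if rname == "*" else ref_index[rname]
        if (index, pos) < previous:
            return False
        previous = (index, pos)
    return True


# Hypothetical records, for illustration only.
header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:NC_000001\tLN:249250621",
    "@SQ\tSN:NC_000002\tLN:243199373",
]
read_a = "r1\t0\tNC_000001\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
read_b = "r2\t0\tNC_000001\t250\t60\t4M\t*\t0\t0\tACGT\tIIII"
read_c = "r3\t0\tNC_000002\t50\t60\t4M\t*\t0\t0\tACGT\tIIII"

print(is_coordinate_sorted(header + [read_a, read_b, read_c]))  # prints: True
print(is_coordinate_sorted(header + [read_b, read_a, read_c]))  # prints: False
```

Note that this looks only at the records themselves. A file can pass this check and still carry SO:unsorted in its header, which is the situation described in this thread.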
Hello, today I ran into the same problem, which is very troublesome. Could you give me some tips? Thanks!
I did get it fixed. What kind of error do you get?