coverage bed error but produces output file
2
0
Entering edit mode
9.2 years ago
bioguy24 ▴ 230

The below command produces a file (674 MB).

coverageBed \
  -d \
  -sorted \
  -a /home/cmccabe/Desktop/NGS/bed/sorted_unix_5column_xgen_targets.bed \
  -b /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_newheader.bam \
  > /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_output.txt

Output

chr1    955542    955763    +    AGRN:exon.1    1    0
chr1    955542    955763    +    AGRN:exon.1    2    0
chr1    955542    955763    +    AGRN:exon.1    3    0
chr1    955542    955763    +    AGRN:exon.1    4    1

My question is even-though an output file results I get the below error:

cmccabe@DTV-A5211QLM:~/Desktop/NGS$ coverageBed -d -sorted -a /home/cmccabe/Desktop/NGS/bed/sorted_unix_5column_xgen_targets.bed -b /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam > /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_counts.txt
ERROR: chromomsome sort ordering for file /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam is inconsistent with other files. Record was:
chr10   68734   68755   X28LU:04418:11727   6   +

I checked the bam and it is coordinate sorted:

cmccabe@DTV-A5211QLM:~/Desktop/NGS/pool_I_090215$ samtools view -H IonXpress_008_150902_newheader.bam | grep SO
@HD    VN:1.4    GO:none    SO:coordinate

I just want to make sure that I can use the data output of if I am missing something. I can not find that record in the bam file, is that a sequencing error, a bug, or something unique to Ion Torrent bam files. Thank you :).

Sorted bed file

chr1    955542    955763    +    AGRN:exon.1
chr1    957570    957852    +    AGRN:exon.2
chr1    976034    976270    +    AGRN:exon.2;AGRN:exon.3;AGRN:exon.4

I also verified the sort order in the bam

cmccabe@DTV-A5211QLM:~/Desktop/NGS/pool_I_090215$ samtools view -H IonXpress_008_150902_newheader.bam | grep SQ | cut -f 2 | awk '{ sub(/^SN:/, ""); print;}'
chr1
chr2
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr20
chr21
chr22
chrX
chrY
chrM
ngs sort bedtools • 5.0k views
ADD COMMENT
2
Entering edit mode

Is it a lexical vs numerical sort thing: 1 10 .. 2 20 ..?

ADD REPLY
1
Entering edit mode

You've double-checked the order of your bed file too, especially the chromosome ordering, and the naming convention? The fact that the "mis-ordering" occurs relatively early in chromosome 10 makes me think of the formal possibility that one of your files (maybe the bed file?) is sorted chr1 chr10 chr11 .. chr2 chr20 .. instead of chr1 chr2 .. chr9 chr10..

I know you know awk, but one could awk '{print $1}' aBed.bed | sort -u to quickly check

I'm guessing your output that you showed is not comprehensive, in which case you might check the tail-end of the output file to see if it also ends on chr1, or if it covered the entire genome.

ADD REPLY
0
Entering edit mode

Looks like I just need to sort -k1,1 -k2,2n

awk '{print $1}' sorted_unix_5column_xgen_targets.bed | sort -u
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrX
chrY
ADD REPLY
4
Entering edit mode
9.2 years ago
thackl ★ 3.0k

According to the bedtools manual -sorted needs to be done with sort -k1,1 -k2,2n, which would do a lexical sort and put chr10,11.. before chr2. You BAM file however, is sorted in a "human" order...

ADD COMMENT
0
Entering edit mode

-sorted just lets bedtools know that the input files are already sorted by some consistent rule. But you're right, that the bed file (sorted by the recommended -k1,1) would give a lexical sort.

Here's a post describing how to use Unix sort to sort in "human" order: How To Sort Bed Format File

ADD REPLY
1
Entering edit mode
9.2 years ago
James Ashmore ★ 3.5k

As the error says it look as though either your BAM or BED file are not sorted correctly. First inspect the sort order of the chromosomes in both the sorted BED and BAM files by running the following commands:

cut -f1 your_file.bed | uniq
samtools view your_file.bam | cut -f3 | uniq

If the order of the chromosomes is different then you will have to provide a genome file to Bedtools with the order your chromosomes are in the BED file.

ADD COMMENT
0
Entering edit mode

Thank you everybody :).

ADD REPLY
0
Entering edit mode

I mis-read the post and my bed was sorted correctly but my bam is in "human" sorting so I need to re-sort the bam in "lexical"? Thank you :).

The output file that results is (with the chromosome ordering):

cmccabe@DTV-A5211QLM:~/Desktop/NGS/bed$ awk '{print $1}' IonXpress_008_150902_counts.txt | sort -u
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
ADD REPLY
0
Entering edit mode

Yep, a simple samtools sort your.bam your.sorted should suffice.

ADD REPLY
1
Entering edit mode

Beware samtools sort does not sort your alignments by chromosome, it sorts them by position in each chromosome. For instance, if your FASTA index which you aligned your reads to starts with chr12, chr13, chr15 e.t.c then your BAM file with start with chr12, chr13, chr15 e.t.c. Since it is much harder to sort the BAM file then the BED file, I suggest you sort your BED file by the order of the chromosomes in the FASTA index file.

ADD REPLY

Login before adding your answer.

Traffic: 1742 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6