7.7 years ago
victoria_aleks
Dear all, I am running the command: bedtools intersect -sorted -g [GENOME_FILE] -abam [BAM_FILE] -b [BED_FILE] > [OUTPUT_FILE] and am getting an error about wrong sorting. Apparently my input BAM files have a strange sort order, which goes: chr1 chr2 ... Chr9 X Y Chr11 chr12 ...
While the BED and GENOME files are ordered: chr1 chr2 ... chr22 X Y
As I understand it, it is easier to re-sort the BED file than the BAM file. Still, I am having trouble doing this :) Could anyone advise on this? Thank you!
How did you sort the bam file?
I did not. I sorted the BED and genome files to match the BAM file (or so I thought). Only later did I find out that the BAM file has this strange ordering of the X and Y chromosomes, and, honestly, I don't know how to sort a BAM file :)
This is what you're looking for if you don't know how to sort a bam file: http://www.htslib.org/doc/samtools.html
I still don't understand how I can change the order of the chromosomes so that X and Y are not between chr9 and chr10, but at the beginning or at the end...
Depending on your error, you may need to sort the BAM file. You will need to add more information to the question if you want help.
The error is that the genome file and BED file are sorted "1 2 3 4 5 6 7 8 9 10 11... x y" while the BAM file is sorted "1 2 3 4 5 6 7 8 9 x y 11 12..."
You need to sort your bed_file and your genome_file appropriately. Bedtools has a 'sort' function that's rather slow, but you can do this to fix the problem:
I actually sorted the BED file in this manner, but it puts X and Y at the end, while my BAM file has the X and Y chromosomes between the single-digit and double-digit chromosomes (i.e. between chr9 and chr10).
So you did LC_COLLATE?
Yes, and the X and Y chromosomes are placed at the end :)
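Since a plain lexical sort cannot place X and Y between chr9 and chr10, one possible approach is to sort the BED file by an explicit chromosome list instead. The following is only a minimal sketch, not anything posted in this thread: the function name sort_bed_by_order and the order file (one chromosome name per line, in the order the BAM file uses) are hypothetical, and it assumes a tab-separated BED file and standard awk, sort and cut.

```shell
# Hypothetical helper: reorder a BED file to follow an explicit chromosome order.
#   $1  order file: one chromosome name per line, in the desired order
#       (assumption: it lists the names in the same order as the BAM file)
#   $2  tab-separated BED file to reorder
sort_bed_by_order() {
    order_file=$1
    bed_file=$2
    tab=$(printf '\t')
    # Step 1 (awk): prefix every BED line with the rank of its chromosome.
    # Step 2 (sort): order by rank, then numerically by start coordinate.
    # Step 3 (cut): drop the temporary rank column again.
    awk -v OFS='\t' '
        NR == FNR { rank[$1] = FNR; next }   # first file: chromosome name -> rank
        !($1 in rank) {                      # names missing from the order file are skipped
            print "skipping unknown chromosome: " $1 > "/dev/stderr"
            next
        }
        { print rank[$1], $0 }
    ' "$order_file" "$bed_file" |
        sort -t "$tab" -k1,1n -k3,3n |
        cut -f2-
}
```

It would be called as sort_bed_by_order order.txt input.bed > sorted.bed. The chromosome names for the order file could be read from the SN: fields of the @SQ lines that samtools view -H prints for the BAM file. The genome file passed to -g would presumably need its lines in that same order as well.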