Hello,
I'm trying to use samtools mpileup to generate a SNP table from individual samples aligned to a reference genome. I want to use the -l flag so that only certain positions in the reference are used to call the SNPS.
Here's what I'm trying:
samtools mpileup -g -b ~/Data/alignments/bamfilenames.txt -f /Data/alignments/reference.fa -l ~/Data/alignments/positions_to_include.txt > fish_snps.bcf
This returns "segmentation fault (core dumped)." It also segfaults if I just pass it a single bam file, rather than a list of bam files. I think that the list of positions file is the issue. I want to convert it to a BED file, to see if that helps.
So any suggestions for efficiently converting a list of positions to the BED format?
My current file looks like this
chrUn 33
chrUn 34
chrUn 35
chrUn 49
chrUn 50
..
In other words, the positions are not entirely sequential because I've excluded positions that are potential paralogs.
Thanks for any insight!
Thanks! That's a great start, but that still gives me runs of only a single base, like so...
when what I need is something like this...
I think I need some sort of conditional statement, where if $2-1 = 1, then go to the next line, but if it doesn't then put that number in the second column. Further suggestions welcome!
Jorge solution works if you want to generate bed for SNPs (this is what I understood from your question). If you need to convert list of positions into bed (as you asked 1 512) you will have to provide list of intervals for each bed line.
Where are the lengths coming from?