Hello
I'm trying to use Bio:: DB::Sam to get list of reads covering certain position .Sam file using pileup method using this code
use strict;
use warnings;
use Bio::DB::Sam;
`samtools view -Sb $ARGV[0].sam > $ARGV[0].bam`;
`samtools sort $ARGV[0].bam $ARGV[0].sorted`;
`mv $ARGV[0].sorted.bam $ARGV[0].bam`;
`samtools index $ARGV[0].bam`;
my $sam = Bio::DB::Sam->new(
-bam => "$ARGV[0].bam",
-fasta => "$ARGV[1]",
);
my $callback = sub {
my ($seqid,$pos,$p) = @_;
if ($pos eq "1240") {
for my $pileup (@$p) {
my $b = $pileup->alignment;
print $b->qname,"\n";
}
}
};
$sam->pileup("GBA",$callback);
exit;
So this code only get names of reads which cover position 1240 in reference from $ARGV[0].sam
file. The problem is that the output list of read names is definitely not complete. I.e. pileup ignores some reads.
Trying to find the root of problem I analyzed reads which alignment starts from position 1209 (just for example) and found that only the first N of such reads assessed by pileup and the last M of reads are ignored. But after those not ignored N reads it doesn't ignore all reads starting from other positions (i.e. 1150,1048,1231 and any others). So as i understand the problem must be in some kind of sorting, but I haven't found any information about sorting should be done in manual and I actually do sort .bam file.
Also I found that using $sam->pileup("GBA:1239-1241",$callback)
instead of $sam->pileup("GBA",$callback)
makes another list of reads, which differs from the previous, but I could found any logic in these differences
So what's wrong?