Hi ,
I am a new bioperl user. I made a script to count the nucleotide composition in each column of the alignment. I am able to get the count for the nucleotides, but slice doesn't allow the count of gaps as it seems to exclude them. I would greatly appreciate any suggestions to solve this.
#!/bin/perl -w
use strict;
use warnings;
use List::Util 'max';
use Bio::SimpleAlign;
use Bio::Align::AlignI;
use Bio::AlignIO;
use Bio::SeqIO;
my $in= Bio::AlignIO->new( -file => "seq.fst", -format => "fasta");
my $align = $in->next_aln();
print "column\tA's\tT'st\C's\tG's\n";
for (my $i = 1; $i <= $align->length; $i++) {
my %count;
my $seqs = $align->slice($i,$i);
my $gap_char = $seqs->gap_char();
my $count_A=0;
my $count_C=0;
my $count_T=0;
my $count_G=0;
my $count_N=0;
my $count_gap=0;
foreach my $seq ($seqs->each_seq) {
my $col=$seq->seq;
if ($col eq 'A') {
$count_A++;
} elsif ($col eq 'C'){
$count_C++;
} elsif ($col eq 'T'){
$count_T++;
} elsif ($col eq 'G'){
$count_G++;
} elsif ( $col eq 'N'){
$count_N++;
} elsif ($col =~ m/^\Q$gap_char\E$/){
$count_gap++;
}
$count{$seq->seq} += 1;
}
print"$i\t$count_A\t$count_T\t$count_C\t$count_G\n";
}
I get a warning like this: MSG: Slice [5-5] of sequence [3/1-6] contains no residues. Sequence excluded from the new alignment". and doesn't seem to count the gaps in the column.
Do you mean that the gap counting itself is not working or the result is not being printed out? You don't appear to be printing out the gap count anywhere in your code, so if it's not being printed out I would imagine that's the problem.