Dear all, I have one problem, I would like to count the content of GC nucleotids from bam file for each read. I think I have to count it from $10, but I do not know how? Could you help me please? Thank you
Dear all, I have one problem, I would like to count the content of GC nucleotids from bam file for each read. I think I have to count it from $10, but I do not know how? Could you help me please? Thank you
awk '{ n=length($10); print $10, gsub(/[GCCgcs]/,"",$10)/n;}' your.sam
samtools view [bam] | perl -lane '$N =()= $F[9] =~ /GC/gi; print $N;'
The odd looking =()= operator instantly converts the match from array to scalar context.
Perl one liner 'a' splits by tabs, 'l' prints a new line, 'n' operates on all lines and 'e' indicates expression.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
I am not sure if it is good, because every time I receive the error
awk: (FILENAME=filip.sam FNR=1) fatal: division by zero attempted
, than there is the problem with "n" or I do not know.I tested my awk script without the sam header.
Oh ok, because I use to convert
samtools view -h
(with header). Without header is working, than thank you so much for your help.