Hi all!
I'm using twoBitToFa to retrieve a specific region of a chromosome in hg19. The command I'm using is:
twoBitToFa http://hgdownload.cse.ucsc.edu/gbdb/hg19/hg19.2bit myfile.fa -seq=chr1 -start=891021 -end=125030866
I would expect there to be 125030866 - 891021 = 124139845 nucleotides. However, when I count the number of nucleotides with a bash command there seem to be 60 more positions.
So, my question is: how can I find those extra positions and delete them? Or are they supposed to be there, and I'm doing something wrong or failing to take something into account?
Thanks in advance!
Are there any N's?
Thanks to both of you, you were both right!! It turns out that my bash command was counting both the N's and the header line of the sequence!!
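For anyone who lands here with the same mismatch, here is a minimal sketch of a count that skips the FASTA header line and the line breaks, plus a separate count of N's. The function names count_bases and count_n are made up for illustration, and the sketch assumes a standard FASTA file in which header lines start with ">".

```shell
# count_bases FILE: print the number of sequence characters in a FASTA file,
# ignoring header lines (those starting with '>') and line terminators.
count_bases() {
    grep -v '^>' "$1" | tr -d '\r\n' | wc -c | tr -d ' '
}

# count_n FILE: print how many of those sequence characters are N or n.
count_n() {
    grep -v '^>' "$1" | tr -cd 'Nn' | wc -c | tr -d ' '
}
```

Running count_bases myfile.fa should then give a number you can compare directly with end minus start, and count_n myfile.fa shows how many of those positions are N's.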