Hi all,
I'm trying to view my aligned sequences using the samtools tview command and have some questions about interpreting the output.
First, what does the first line mean? I found that it shows genome coordinates, and I believe that N = nucleotides and * = no Ns? What symbols can appear there other than N or *?
As far as I know, the second line is the reference sequence, and from the third line on, reads should appear as "dots" or "commas" if my data match the reference sequence. But all my data appear as ATGCatgc. What should I do to make matching bases show up as dots or commas? And is upper/lower case related to strand?
Thanks!
$ samtools tview sample1.recalrealndup.bam ucsc.hg19.fasta -p chr6:132421907
132421911 132421921 132421931 132421941 132421951 132421961 132421971 132421981 132421991 132422001
N**NNNNNN**NNNN**NNNNN***N*NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
C AAAAAA AAAA AAAAA G TTTTTATGTGCACTTAAATTTGGGGACAATTTTATGTATCTGTGTTAAGGATATGTTTA
C**AAAAAA**AAAA**AAAAAAATG*TTTTTTTGTGCAATTAAATTTGGGGACAATTTTTTTTATCTGTGTTAAGGATATTTTTA
c**AAAAAA**AAAA**AAAAA***G*ATTTTATGT ACTTAAATTTGGGGACAATTTTATGTATCTGTGTTAAGGATATGTTTA
c**AAAAAA**AAAA**AAAAA***T*TTTTTTTTTGAAATTAAATTTGGGGAAAATTTTAT gtgttaaggatatgttta
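For anyone reading along, the dot/comma convention asked about above can be sketched in a few lines of Python. This is only an illustration of the convention, not samtools code: the function name `render_read` and its arguments are made up for this example, and it assumes an ungapped read that is already lined up with the reference. It treats a match as "." on the forward strand and "," on the reverse strand, and prints mismatches as upper-case (forward) or lower-case (reverse) letters.

```python
def render_read(ref: str, read: str, reverse: bool) -> str:
    """Render an ungapped read against a reference, dot-view style.

    Hypothetical helper for illustration only; not part of samtools.
    """
    out = []
    for ref_base, read_base in zip(ref, read):
        if read_base.upper() == ref_base.upper():
            # Match: '.' for forward-strand reads, ',' for reverse-strand reads
            out.append("," if reverse else ".")
        else:
            # Mismatch: show the base, with case encoding the strand
            out.append(read_base.lower() if reverse else read_base.upper())
    return "".join(out)


ref = "ACGTAC"
print(render_read(ref, "ACGTAC", reverse=False))       # ......
print(render_read(ref, "ACCTAC", reverse=True))        # ,,c,,,
print(render_read("NNNNNN", "ACGTAC", reverse=False))  # ACGTAC
```

In this sketch a reference made only of N never produces dots or commas, because no read base equals N. That would be consistent with the output above, where the N/* row is all N and every read is shown as letters, but whether that is what happened here is an assumption.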
Since this thread has resurfaced... Shameless plug: if you are interested in samtools tview, maybe also have a look at ASCIIGenome.