Hi, everyone! How can I call a consensus sequence that fills the gaps in a reference? Can samtools do this? PacBio long reads can be used to scaffold contigs, but the scaffolding leaves the gaps between contigs as runs of N (NNNNN). The PacBio long reads themselves do not contain any N, so why is the original sequence replaced with NNNN?
Here is an example:
Contigs:
contig1: AAAAAAAAAA
contig2: TTTTTTTTTTT
contig3: CCCCCCCCCCCC
Subreads:
GGGAAAAAAAAAAGGGGGGGGGGGGGTTTTTTTTTTTGGGGGGGGGGGGGGGGCCCCCCCCCCCCGGG
Scaffold:
AAAAAAAAAANNNNNNNNNNNNNTTTTTTTTTTTNNNNNNNNNNNNNNNNCCCCCCCCCCCC
So I am wondering whether samtools or another tool can call a consensus sequence like this:
Consensus:
GGGAAAAAAAAAAGGGGGGGGGGGGGGTTTTTTTTTTTGGGGGGGGGGGGGGGCCCCCCCCCCCCGGG
How can I call this consensus? Thanks!
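To make the goal concrete, here is a minimal Python sketch of the operation I have in mind, using a shortened version of the toy layout above. It is only an illustration under strong assumptions: a single error-free read spans the whole scaffold, and each contig occurs in the read in scaffold order. The function `fill_gaps` is a hypothetical helper, not part of samtools or any other tool; real data would need read alignment and a proper consensus step.

```python
import re


def fill_gaps(scaffold: str, read: str) -> str:
    """Replace each run of N in `scaffold` with the read bases found
    between the two flanking contigs (toy sketch, exact matching only)."""
    # The contigs are the scaffold pieces separated by runs of N.
    contigs = [piece for piece in re.split(r"N+", scaffold) if piece]
    filled = []
    pos = 0  # position in the read just after the previous contig
    for i, contig in enumerate(contigs):
        start = read.find(contig, pos)
        if start == -1:
            raise ValueError(f"contig {i + 1} not found in read")
        if i > 0:
            # Read bases between the previous contig and this one fill the gap.
            filled.append(read[pos:start])
        filled.append(contig)
        pos = start + len(contig)
    return "".join(filled)


# Shortened toy data: three contigs joined by runs of N, plus one spanning read.
toy_scaffold = "AAANNTTTNNNCC"
toy_read = "GAAAGGTTTGGGGCCG"
print(fill_gaps(toy_scaffold, toy_read))
```

In this sketch the filled gap can be longer or shorter than the run of N it replaces, and the read bases that extend beyond the first and last contig are not added to the result.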
I changed this post to a "Question" because it is a question. There is no need to convert it back to a "Tool", which is only to be used for announcing new tools. See also: How to Use Biostars, Part II: Post types, Deleting, (Un)Subscribing, Linking and Bookmarking.
@WouterDeCoster Thanks for your modification!