How to interpret N's in contigs
8.4 years ago
Redmar ▴ 20

I'm using ABySS to assemble paired-end data, and then BLAST to pick out a set of genes I'm interested in. Sometimes a contig contains a stretch of N characters, and I'm not sure how to interpret it.

gene TTTGC----------------CGGTGC
midl |||||                ||||||
cont TTTGCCGGTNNNNNNNNNNNNCGGTGC

BLAST counts this as 16 gaps, since it has to insert 16 gap characters into the gene to span the 4 base pairs (CGGT) and the 12 N's in the contig. How certain is ABySS that there should be exactly 12 N's there, and how is that number determined? Based on the BLAST result and on similar samples I have run, I would say the 12 positions where an N aligns to a "-" are wrong. But if ABySS is confident that there should be 12 N's, then I don't want to discount them.
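
To make the arithmetic in the alignment above easy to check, here is a minimal Python sketch that locates the run of N's in the contig and splits the 16 gap columns into those opposite an N and those opposite a real base. The helper name `find_n_runs` is hypothetical and not part of ABySS or BLAST. The script only inspects the two strings, so it says nothing about how ABySS decides on the number of N's.

```python
import re

def find_n_runs(seq):
    """Return (start, length) for every maximal run of N in seq (0-based start)."""
    return [(m.start(), len(m.group())) for m in re.finditer(r"[Nn]+", seq)]

# Sequences copied from the alignment in the question.
gene_aln = "TTTGC" + "-" * 16 + "CGGTGC"    # gene row, with gap characters
contig = "TTTGCCGGT" + "N" * 12 + "CGGTGC"  # contig row

n_runs = find_n_runs(contig)

# Count the gap columns, then see how many of them sit opposite an N.
total_gaps = gene_aln.count("-")
gaps_vs_n = sum(1 for g, c in zip(gene_aln, contig) if g == "-" and c in "Nn")
gaps_vs_bases = total_gaps - gaps_vs_n

print(n_runs)                                # [(9, 12)]
print(total_gaps, gaps_vs_n, gaps_vs_bases)  # 16 12 4
```

So of the 16 gaps, 12 are opposite N's and only 4 are opposite called bases (CGGT).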

abyss assembly • 1.8k views

Is it a contig or a scaffold?


It is a contig, taken from the contigs.fa output file produced by ABySS.
