After de novo assembly
8.6 years ago
lamlam ▴ 10

After assembling a Mycobacterium tuberculosis sequence with a de novo assembler, I found that the assembly contains more base pairs than the reference strain. Is this logical?

Assembly • 1.4k views

Is this logical?

Yes.

And computational biology is a quantitative science. Please tell us exactly how much greater your assembly is and which reference genome you have used.

I used Mycobacterium tuberculosis H37Rv. This reference is 4.4 Mb and my assembly is 7.3 Mb.
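
For scale, here is the quick arithmetic on those two figures. This is only a sketch: the sizes are the rounded values quoted in this post, and the variable names are illustrative.

```python
# Sizes as quoted in the thread, in megabases (rounded values).
reference_mb = 4.4  # M. tuberculosis H37Rv reference
assembly_mb = 7.3   # the assembly in question

# Absolute and relative excess of the assembly over the reference.
excess_mb = assembly_mb - reference_mb
ratio = assembly_mb / reference_mb

print(f"Excess sequence: {excess_mb:.1f} Mb")
print(f"Assembly / reference: {ratio:.2f}x")
```

An assembly roughly two thirds larger than the reference is a big enough gap that it is worth checking where the extra sequence comes from.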

Have you tried to compare the two to see how the assemblies are different? Use Mauve to compare.
Did you have an excess of sequence (> 100x gross coverage) that went into this assembly?
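
If it helps, gross coverage can be estimated as total sequenced bases divided by the expected genome size. The sketch below assumes fixed-length reads; the function name and the read count and read length in the example are hypothetical placeholders, not values from this thread.

```python
def gross_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Estimate gross coverage as total read bases / expected genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Hypothetical example: 5 million 100 bp reads against a 4.4 Mb genome.
example = gross_coverage(n_reads=5_000_000, read_length=100, genome_size=4_400_000)
print(f"Gross coverage: {example:.1f}x")
```

Substitute the real read count and read length from your own run to see whether you are above the 100x mark.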

Yes, you should map your contigs to the RefSeq reference and then identify the contigs that do NOT map. BLAST these contigs to identify their origin. Furthermore, you may plot the GC content of your contigs versus coverage. Do you see more than one cluster in the scatter plot?
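
As a rough starting point for the GC half of that plot, per-contig GC content can be computed with plain Python. This is a minimal sketch, not a full pipeline: the helper names and the toy contigs are made up for illustration, and the coverage values would still have to come from your own read mapping.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {contig_name: sequence}."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            header = line[1:].split()
            name = header[0] if header else ""
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous (A/C/G/T) bases."""
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


# Toy input standing in for a real contig file (hypothetical sequences).
toy = [">contig_1 toy", "GGCCGGCCAT", ">contig_2 toy", "ATATATATGC"]
gc_by_contig = {name: gc_fraction(seq) for name, seq in read_fasta(toy).items()}
for name, gc in gc_by_contig.items():
    print(f"{name}\t{gc:.2f}")
```

For a real assembly, pass an open file handle to `read_fasta` instead of the toy list. M. tuberculosis is GC-rich (roughly 65%), so contigs that sit far from that value are reasonable first candidates to BLAST.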

8.6 years ago
Ram 44k

There could be any number of reasons for an increase in the number of bases: duplication at various levels, insertions, etc. Why is comparing base count any kind of metric?

Why is comparing base count any kind of metric?

Tuberculosis genomes are extremely well conserved, much more so than those of any other bacterial species, so a large difference in total size from the reference is itself a warning sign.

Ah, I see. Thank you :)
