What is the best way to assemble a bacterial genome from low-quality data?
2
0
8.5 years ago

Hi everyone. I have a data set with quality scores greater than 20. What is the best way to assemble this data? Thanks.

Assembly • 1.3k views
2

If your input is of low quality, you can't expect that your output will be of good quality...

If you believe in alchemy and want to try it anyway, you could try something like MIRA or Velvet (I don't know what platform you used).

1

I imagine that for low-quality data any overlap-layout-consensus (OLC) assembler is better than the k-mer/de Bruijn graph approaches.
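A rough way to see why k-mer methods are sensitive to base quality: if errors were independent with the same per-base probability p, a k-mer would be error-free with probability (1 - p)^k. The sketch below only illustrates that arithmetic. The independence assumption, the function name, and k = 31 are illustrative choices, not something stated in this thread, and it does not by itself show that an OLC assembler would do better.

```python
def error_free_kmer_fraction(q: float, k: int) -> float:
    """Expected fraction of k-mers with no sequencing error.

    Assumes every base has the same Phred quality q and that errors
    are independent (a simplification, real error profiles vary).
    """
    p = 10 ** (-q / 10)      # per-base error probability from Phred score
    return (1 - p) ** k      # all k bases must be correct


# k = 31 is just an illustrative k-mer size
for q in (10, 20, 30):
    print(f"Q{q}: {error_free_kmer_fraction(q, 31):.3f} of 31-mers error-free")
```

Under these assumptions roughly three quarters of 31-mers are still clean at Q20, while at Q10 almost none are.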

0

First of all, the information in the title and in the question body don't match.

The title says "low quality" data, but the body of the post says "greater than 20" (which I will assume means Q20 or greater). If it is the latter, then there is no problem, and even if it is the former, assembly may still be fine.
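For reference, a Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means about 1 wrong base call in 100 and Q30 about 1 in 1000. A minimal sketch of the conversion (the function names are made up for illustration):

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```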

We don't have enough reliable information here to say anything about the final quality of assembly.

0

Sorry, my English is not good and I can't express the question exactly :D. I'm a beginner. I've read some papers and I see that most data is Q30 or greater, while my data is Q20, so I think its quality is low. By the way, can you tell me the quality threshold for assembly? Thanks.

0

I will assume that you have Illumina data, since you have not told us what kind. If it is not Illumina data, then the following may not apply.

Since this is a de novo assembly, you may want to trim data that is Q10 or below. If trimming does not leave more than 10-15x raw coverage (total bases relative to the genome size you expect), you could still try an assembly, but be warned that the results may be poor and you may need to start over. Use SPAdes as recommended below.
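To make the trimming and coverage check concrete, here is a toy sketch in plain Python. It assumes Phred+33 encoded quality strings and trims only from the 3' end. The function names, the example reads, and the tiny genome size are all made up for illustration, and in practice you would use a dedicated read trimmer rather than this.

```python
def decode_phred33(qual_string: str) -> list[int]:
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(ch) - 33 for ch in qual_string]


def trim_3prime(seq: str, qual_string: str, max_bad_q: int = 10) -> str:
    """Remove trailing bases whose quality is max_bad_q or below.

    Assumes seq and qual_string have the same length.
    """
    quals = decode_phred33(qual_string)
    end = len(seq)
    while end > 0 and quals[end - 1] <= max_bad_q:
        end -= 1
    return seq[:end]


def estimated_coverage(reads: list[str], genome_size: int) -> float:
    """Total bases in the reads divided by the expected genome size."""
    return sum(len(r) for r in reads) / genome_size


# Toy input: (sequence, quality string) pairs; 'I' is Q40, '+' is Q10,
# '#' is Q2 and '!' is Q0 in Phred+33.
reads = [("ACGTACGT", "IIIII+#!"), ("GGGTTTAA", "IIIIIIII")]
trimmed = [trim_3prime(seq, qual) for seq, qual in reads]
print(trimmed)                                    # ['ACGTA', 'GGGTTTAA']
print(estimated_coverage(trimmed, genome_size=4)) # 3.25 (toy genome size)
```

With real data you would plug in the expected genome size of your organism and compare the result against the 10-15x mentioned above.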

If there is a related genome available at NCBI, you can always try to align your data to it and see what you get. As long as the organisms are reasonably closely related, you may be able to map 80%+ of your data. Hope this helps.

2
8.5 years ago

I agree with the sentiment of @b.nota; however, SPAdes might be worth looking into.

1
8.5 years ago

If you want your assembly to be of high quality, start over and generate better data. You can't expect that magic or some bioinformatic hocus-pocus will solve your problem.

0

Thanks. I will do this.
