Taxonomy of bins is not correct
2.8 years ago
salmon ▴ 10

Hello,

I have the DNA sequence of a pure culture. I trimmed the reads with sickle and did a de novo assembly with MEGAHIT. I then ran BBMap and MetaBAT and obtained 3 bins. I checked these bins for contamination using CheckM (results shown below).

Bin Id   Marker lineage             # genomes   # markers   # marker sets   0     1      2   3   4   5+   Completeness   Contamination   Strain heterogeneity
bins_1   g__Pseudomonas (UID4550)   78          1044        368             12    1025   7   0   0   0    98.46          0.65            14.29
bins_2   k__Bacteria (UID203)       5449        102         57              91    11     0   0   0   0    14.39          0.00            0.00
bins_3   k__Bacteria (UID203)       5449        103         58              101   2      0   0   0   0    3.45           0.00            0.00

When I checked the lineage of the bins, I was alarmed to see that one bin was assigned to Gammaproteobacteria, whereas my strain belongs to Betaproteobacteria according to NCBI.

Also, I am not sure why the other two bins are assigned to Planktothrix.

bins_1              43                   0         k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas  
bins_2              4                    0         k__Bacteria;p__Cyanobacteria;c__Oscillatoriales;o__Oscillatoriales;f__Planktothrix;g__Planktothrix          
bins_3              1                    0         k__Bacteria;p__Cyanobacteria;c__Oscillatoriales;o__Oscillatoriales;f__Planktothrix;g__Planktothrix          

Any suggestions will be very helpful. Thank you so much

2.8 years ago
Mensur Dlakic ★ 28k

First, I think you need to make an effort to format your post properly if you expect us to look at it seriously. Look at your post and ask yourself: could you figure out what is what if you didn't already know how it looks when formatted? As you write a post, there is a preview underneath that shows how the post will look once you push the "Save" button. It will help if you edit the post, select the parts where the spacing needs to be preserved, and click the editor button labeled "101 010", which formats the selection correctly and shows it that way in the preview. Alternatively, select the text and hit CTRL+K. This is how part of your post looks when properly formatted:

bins_1 43 0 k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas
bins_2 4 0 k__Bacteria;p__Cyanobacteria;c__Oscillatoriales;o__Oscillatoriales;f__Planktothrix;g__Planktothrix
bins_3 1 0 k__Bacteria;p__Cyanobacteria;c__Oscillatoriales;o__Oscillatoriales;f__Planktothrix;g__Planktothrix

I seem to recall that you were using MetaBAT2 for binning; the problem with that program is that it gives no visual output of its clusters. You may want to try another program, such as VizBin, that displays the clusters so you can inspect them visually. Even if you truly have contamination, it may not be what CheckM says it is, because CheckM is not reliable in determining taxonomy at all levels. It does a good job for some taxonomic groups, but for others the assignments will be incorrect. Instead, you may want to use GTDB-Tk (the GTDB toolkit), which is specifically meant for taxonomic classification. All this is to say that you may simply have a plasmid, or something like that, and BLASTing the sequence against the nt database may give you an answer. If the contamination is real, you will have to figure out on your own how it happened.
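Before BLASTing, a quick per-contig look at length and GC content can hint at whether a small bin is a plasmid or foreign DNA. Below is a minimal sketch in plain Python, not part of any of the tools mentioned above; the function names `read_fasta` and `contig_summary` and the demo sequences are made up for illustration.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_summary(lines):
    """Return a list of (contig name, length, GC percent) tuples."""
    rows = []
    for name, seq in read_fasta(lines):
        seq = seq.upper()
        gc = seq.count("G") + seq.count("C")
        acgt = gc + seq.count("A") + seq.count("T")
        # Ambiguous bases (N etc.) are ignored when computing GC percent.
        gc_pct = 100.0 * gc / acgt if acgt else 0.0
        rows.append((name, len(seq), round(gc_pct, 2)))
    return rows


# Made-up demo records; replace with the lines of a real bin FASTA file.
demo = [">contig_1 demo", "ATGCGC", "GGCC", ">contig_2", "ATATATAT"]
summary = contig_summary(demo)
for name, length, gc_pct in summary:
    print(name, length, gc_pct)
```

To run it on a real bin, pass an open file handle instead of `demo`, for example `contig_summary(open("bins_2.fa"))`, where `bins_2.fa` is a hypothetical file name. Contigs whose GC content differs sharply from the main bin are reasonable first candidates for BLAST.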

Even assuming that you have contamination, your main genome seems to be >98% complete. As long as it is cleanly separated from the other two bins, you can simply set them aside and keep using only the main bin.
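Picking out the main bin can also be done programmatically from the CheckM table. The following is a minimal sketch using the three rows from the question; the cutoffs (completeness of at least 90% and contamination of at most 5%) are assumptions chosen for illustration, and `parse_checkm` and `keep_bins` are hypothetical helper names, not CheckM functions.

```python
# Rows copied from the CheckM table in the question.
CHECKM_ROWS = """\
bins_1 g__Pseudomonas (UID4550) 78 1044 368 12 1025 7 0 0 0 98.46 0.65 14.29
bins_2 k__Bacteria (UID203) 5449 102 57 91 11 0 0 0 0 14.39 0.00 0.00
bins_3 k__Bacteria (UID203) 5449 103 58 101 2 0 0 0 0 3.45 0.00 0.00
"""


def parse_checkm(text):
    """Return {bin_id: (completeness, contamination)} from CheckM rows."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # The last three columns are completeness, contamination and
        # strain heterogeneity, so index from the end of the row.
        stats[fields[0]] = (float(fields[-3]), float(fields[-2]))
    return stats


def keep_bins(stats, min_completeness=90.0, max_contamination=5.0):
    """Return the sorted bin ids that pass both (assumed) cutoffs."""
    return sorted(
        bin_id
        for bin_id, (completeness, contamination) in stats.items()
        if completeness >= min_completeness and contamination <= max_contamination
    )


stats = parse_checkm(CHECKM_ROWS)
kept = keep_bins(stats)
print(kept)
```

With these cutoffs only bins_1 passes, which matches the advice above to continue with the main bin alone.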

Thank you for pointing that out. I have updated my post; I hope it is okay now. I will try VizBin and GTDB-Tk. Thank you for your advice.
