Metagenomic assembly, removing redundant contigs.
3.2 years ago

What is the purpose of the step described in the following excerpt from a paper that uses metagenomic assemblies?

Redundancies of sequences from the same organism within the metagenome were removed by clustering all contigs at 95% identity with CD-hit v4.6.6 (72), and only the longest contig per cluster was kept
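
To make the step concrete, here is a minimal sketch of "cluster at 95% identity, keep the longest contig per cluster". It is not CD-HIT itself: the function names `ungapped_identity` and `dereplicate` are hypothetical, and identity is approximated by sliding the shorter sequence along the longer one without gaps, which is an assumption made only to keep the example short. The longest-first greedy ordering mirrors the general idea of taking the longest sequence as the cluster representative.

```python
def ungapped_identity(short, long_):
    """Best fraction of matching bases when sliding `short` along `long_` without gaps.

    Toy stand-in for a real alignment; assumes `short` is non-empty
    and no longer than `long_`.
    """
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        matches = sum(a == b for a, b in zip(short, long_[offset:]))
        best = max(best, matches)
    return best / len(short)


def dereplicate(contigs, threshold=0.95):
    """Greedy clustering of contigs, longest first.

    `contigs` maps contig id -> sequence. Returns a dict mapping each
    representative (the longest contig of its cluster) to its member ids.
    """
    ordered = sorted(contigs.items(), key=lambda kv: len(kv[1]), reverse=True)
    clusters = {}
    for name, seq in ordered:
        for rep in clusters:
            # Join the first existing cluster whose representative is similar enough.
            if ungapped_identity(seq, contigs[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative matched: this contig starts a new cluster.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    example = {
        "a": "ACGT" * 10,                   # 40 bp
        "b": "ACGTACGTACTTACGTACGTACGTA",   # 25 bp, one mismatch against "a"
        "c": "T" * 30,                      # unrelated
    }
    for rep, members in dereplicate(example).items():
        print(rep, members)
```

The keys of the returned dict are the contigs that would be kept; everything else in each member list would be discarded as redundant.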

I understand what is being done, but I don't know why. I could see it being useful before binning, perhaps to reduce computation time, but otherwise I am not sure. Would it reduce annotation time, or something similar?

The paper in question (see the methods section "Metagenome sequencing and assembly."): https://journals.asm.org/doi/10.1128/mSphere.00165-19

metagenome assembly • 1.2k views

Clustering at 95% similarity sounds to me like they're removing real diversity from the assembly, not "same organism redundancy".


This was my thought as well. At a minimum you would cluster at 97%, would you not?


I don't think any metagenome assembler (be it OLC or de Bruijn graph-based) would produce such redundancy anyway.
