What is the best way to merge multiple fasta files containing contigs?
5.8 years ago

Hello,

I have been working with some metagenomic samples, and I have assembled each of them individually with SPAdes. Now I would like to merge the contigs.fasta files from my samples (I want to proceed with mapping and binning using anvi'o). What is the most appropriate way of doing that?

Assembly Anvi’o fasta • 3.4k views

cat?


But the OP might have identical sequence names in different files (SPAdes names its contigs the same way in every run). Try sed -i 's/^>/>lname_/' lname.fasta on each output file (lname here stands for that file's label) to make each sequence name unique, and then cat them.
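For anyone who prefers to do the renaming and the concatenation in one pass, here is a minimal Python sketch of the same idea. The function name `merge_fastas`, the sample labels, and the file paths are hypothetical and only for illustration; the sketch simply prefixes each header with a per-sample label and refuses to continue if two headers still collide.

```python
def merge_fastas(inputs, output):
    """Concatenate FASTA files, prefixing every header with its sample label.

    inputs: dict mapping a sample label (hypothetical, e.g. "s1") to a FASTA path.
    output: path of the merged FASTA file to write.
    Returns the number of sequences written.
    """
    seen = set()
    with open(output, "w") as out:
        for label, path in inputs.items():
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if line.startswith(">"):
                        # Header line: add the label so names stay unique across samples.
                        name = f"{label}_{line[1:]}"
                        if name in seen:
                            raise ValueError(f"duplicate sequence name: {name}")
                        seen.add(name)
                        out.write(f">{name}\n")
                    elif line:
                        # Sequence line: copy unchanged, always newline-terminated.
                        out.write(line + "\n")
    return len(seen)


# Example call (paths are placeholders):
# merge_fastas({"s1": "s1/contigs.fasta", "s2": "s2/contigs.fasta"}, "merged.fasta")
```

Writing every line with an explicit newline also avoids the case where a file without a trailing newline glues its last sequence line to the next file's first header, which plain cat would do.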


I think what the OP is actually asking is how to merge-assemble the different FASTA files. Is that correct, anastasia.gs17?


The real problem with merging contigs from different assemblies into a single FASTA file is that you will likely end up with multiple contigs that match homologous parts of identical or very closely related population genomes. Read recruitment against a reference with that kind of redundancy dilutes the short reads across the near-duplicate contigs, which makes it impossible to reliably reconstruct genomes later.

You can try to reassemble these contigs to get a final non-redundant list of contigs for read recruitment, or do binning on individual samples and then collapse redundancy using a tool like dRep, or start over with a co-assembly if your experimental design and/or your system permits that.
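As a rough first look at how much redundancy a merged file contains, the sketch below lists contigs whose sequence is exactly identical to an earlier contig, on either strand. This is only an illustration under a strong assumption: it catches exact duplicates, not the near-identical or partially overlapping contigs described above, so it is not a substitute for reassembly, dRep, or a co-assembly. The function names are hypothetical.

```python
# Translation table for complementing DNA (assumes only A, C, G, T, N characters).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def canonical(seq):
    """Return the smaller of a sequence and its reverse complement."""
    seq = seq.upper()
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, reverse_complement)


def exact_duplicates(path):
    """Return (duplicate_name, first_name) pairs for exactly repeated contigs."""
    first_seen = {}
    duplicates = []
    for name, seq in read_fasta(path):
        key = canonical(seq)
        if key in first_seen:
            duplicates.append((name, first_seen[key]))
        else:
            first_seen[key] = name
    return duplicates
```

An empty result does not mean the merged file is non-redundant; contigs from closely related populations usually differ by a few bases or in length, and this check will not see them.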

Best wishes, Meren.

PS: I created an account when I saw cat as an answer, but I will not be able to follow the discussion any further. If you have more specific questions regarding your system and best practices for genome-resolved metagenomics, feel free to try the anvi'o Slack (you can use the Slack button on the anvi'o web page to get an invitation).


If you want to merge overlapping sequences into one, try scaffolding software such as SSPACE or SOAPdenovo. If you only want them in the same FASTA file, then use cat, as @ATpoint and @Asaf just said.
