merge two multifasta files
5.8 years ago
erick_rc93 ▴ 30

I have two multi-FASTA files whose headers are almost the same. For example:

file1.fasta

    >header_1
    dnasequenceoffastafile1
    >header_2
    dnasequenceoffastafile1
    >header_3
    dnasequenceoffastafile1

file2.fasta

    >header_1_f2
    dnasequencefastafile2
    >header_2_f2
    dnasequencefastafile2
    >header_4_f2
    dnasequencefastafile2

and I would like the following output:

merged.fasta

    >header_1_header_1_f2
    dnasequenceoffastafile1dnasequencefastafile2
    >header_2_header_2_f2
    dnasequenceoffastafile1dnasequencefastafile2

There are two solutions in "Combining two fasta sequences into one". Does either of them work for you?


And why would you want to do that?

5.8 years ago
Brice Sarver ★ 3.8k

Here's a quick R solution using Bioconductor. There are analogous examples using a variety of different tools and languages (see Biopython, etc.). For quick manipulations, I like to use Biostrings due to how efficiently it handles long strings once in memory.

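library(Biostrings)
a <- readDNAStringSet("file1.fasta", format="fasta")
b <- readDNAStringSet("file2.fasta", format="fasta")
# pair records by header; file2 headers carry an "_f2" suffix in the example
key_b <- sub("_f2$", "", names(b))
common <- intersect(names(a), key_b)
a2 <- a[match(common, names(a))]
b2 <- b[match(common, key_b)]
# concatenate each matched pair of sequences
d <- DNAStringSet(paste0(as.character(a2), as.character(b2)))
# build the merged headers, e.g. header_1_header_1_f2
names(d) <- paste(names(a2), names(b2), sep="_")
writeXStringSet(d, "merged.fasta")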

Hope this helps.
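
For anyone who would rather not use R, here is a rough plain-Python sketch of the same idea, using only the standard library. It assumes that a file2 header is the matching file1 header plus a fixed `_f2` suffix, as in the example in the question, and that records found in only one file should be dropped. The function names `read_fasta` and `merge_fasta` and the `suffix` parameter are made up for illustration; they do not come from any library.

```python
def read_fasta(path):
    """Read a FASTA file into a dict of header (without '>') -> sequence.

    Multi-line sequences are joined; insertion order follows the file.
    """
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                records[header] = ""
            elif header is not None:
                records[header] += line
    return records


def merge_fasta(path1, path2, out_path, suffix="_f2"):
    """Concatenate sequences whose headers match once `suffix` is removed.

    Assumption: a file2 header equals the file1 header plus `suffix`.
    Records without a partner in the other file are skipped.
    """
    first = read_fasta(path1)
    second = read_fasta(path2)

    # Index file2 records by their header with the suffix stripped.
    by_key = {}
    for header2, seq2 in second.items():
        key = header2[:-len(suffix)] if header2.endswith(suffix) else header2
        by_key[key] = (header2, seq2)

    # Walk file1 in order so the output keeps file1's record order.
    merged = {}
    for header1, seq1 in first.items():
        if header1 in by_key:
            header2, seq2 = by_key[header1]
            merged[f"{header1}_{header2}"] = seq1 + seq2

    with open(out_path, "w") as out:
        for header, seq in merged.items():
            out.write(f">{header}\n{seq}\n")
    return merged
```

Calling `merge_fasta("file1.fasta", "file2.fasta", "merged.fasta")` on the files from the question should write only the `header_1` and `header_2` pairs, since `header_3` and `header_4_f2` have no partner. Each sequence is written on a single line, so add wrapping if you need fixed-width output.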
