Concatenate two FASTA files that have the same header names but different sequences
23 months ago
skdv2522 • 0

Hi everyone, I have two FASTA files with the same headers but different sequence content, and I need to merge them. I would like to write a script in Perl or BioPerl.

file1:

sang123 ATGCGTA

file2:

sang123 ATTTGGCCC

Fixed spacer: 10 Ns between the two sequences.

Expected result:

sang123 ATGCGTANNNNNNNNNNATTTGGCCC

I need help: I have been trying to solve this problem for the past week but could not find a solution. Please help me.

fasta • 809 views
23 months ago
5heikki 11k

Assuming that 1) the records appear in the same order in both files (e.g. both are sorted by header); 2) every header has a partner in the other file; 3) sequences are not wrapped across multiple lines:

cat file1.fa
>sang123
ATGCGTA
>another
ABCD

cat file2.fa
>sang123
ATTTGGCCC
>another
EFGH

paste -d $'\t' <(paste -d $'\t' - - <file1.fa) <(paste -d $'\t' - - <file2.fa) | awk 'BEGIN{FS="\t";OFS="\n"}{print $1,$2"NNNNNNNNNN"$4}'
>sang123
ATGCGTANNNNNNNNNNATTTGGCCC
>another
ABCDNNNNNNNNNNEFGH
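
If those three assumptions do not hold, a rough awk sketch along the following lines might help. It is not the one-liner above but a different approach: it joins wrapped sequence lines, matches records by the first word of the header instead of by position, and silently skips headers that appear in only one file. The function name `merge_fasta`, the output file `merged.fa`, and the sample records (a wrapped `sang123` in file1, a reordered file2) are made up for illustration, and a non-empty first file is assumed.

```shell
#!/bin/sh
# Illustrative sample data (hypothetical): file1 has a wrapped sequence,
# file2 lists the same records in a different order.
printf '>sang123\nATGC\nGTA\n>another\nABCD\n' > file1.fa
printf '>another\nEFGH\n>sang123\nATTTGGCCC\n' > file2.fa

# merge_fasta FILE1 FILE2: print each record of FILE1 that also occurs in
# FILE2 as "sequence1 + 10 Ns + sequence2", in FILE1 order.
merge_fasta() {
    awk -v spacer="NNNNNNNNNN" '
        /^>/ {
            id = $1                       # match on the first word of the header
            # remember the order of headers as seen in the first file
            if (FNR == NR && !seen[id]++) order[++n] = id
            next
        }
        FNR == NR { first[id] = first[id] $0; next }   # sequence line, first file
        { second[id] = second[id] $0 }                 # sequence line, second file
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                # headers missing from the second file are skipped
                if (id in second) print id "\n" first[id] spacer second[id]
            }
        }
    ' "$1" "$2"
}

merge_fasta file1.fa file2.fa > merged.fa
cat merged.fa
```

With the sample files above this prints the same four lines as the paste/awk pipeline. Both files are held in memory, so this is only a sketch for inputs of moderate size.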
23 months ago
skdv2522 • 0

Thank you, but I need a Perl script for this. I would really appreciate it if you could help.
