Question

Fasta to axt - KaKs Calculator

10.1 years ago

tlorin ▴ 370

Dear BioStars Users,

I am using KaKs Calculator to estimate selection over whole sequences, and I need to do this for many multi-FASTA alignments. Here is an example file:

>ref
AAAAAAA
>seq1
BBBBBBB
>ref
AAAAAAA
>seq2
CCCCCCC
>ref
AAAAAAA
>seq3
DDDDDDD

And here is what I would like to get (an AXT formatted file):

seq1
AAAAAAA
BBBBBBB


seq2
AAAAAAA
CCCCCCC


seq3
AAAAAAA
DDDDDDD

I already tried this script, but it does not work properly. Would anyone have a pipeline (in bash, perl, R, python...) to automate this, or an idea of how to proceed? I could certainly do it by hand for a few files, but I have far too many for that :)

Thanks a lot!

bash perl R python KaKs • 8.2k views

ADD COMMENT • link updated 3.6 years ago by Ram 45k • written 10.1 years ago by tlorin ▴ 370

If the FASTA file is structured like your example file (each sequence on a single line, no blank lines), you can do something like this with awk. Just replace example-file.fasta with the name of your own file.

awk '$0 ~ ">" {c=substr($0,2,length($0))} NR == 2 {ref=$0} NR % 4 == 0 {print c"\n"ref"\n"$0"\n"}' example-file.fasta
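For files that may contain blank lines or sequences wrapped over several lines, the same idea can be sketched in Python. This is only a minimal sketch, not a translation of the one-liner above: the function names `read_fasta` and `fasta_pairs_to_axt` are made up for illustration, and it assumes the records always come in (reference, query) pairs as in the example file. Unlike the awk command, it takes the reference from each pair rather than from line 2 of the file.

```python
def read_fasta(text):
    """Yield (name, sequence) tuples from FASTA text, joining wrapped lines."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines anywhere in the file
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_pairs_to_axt(text):
    """Turn consecutive (ref, query) FASTA records into AXT-style blocks.

    Assumption: records alternate reference, query, reference, query, ...
    Each block is: query name, reference sequence, query sequence.
    Blocks are separated by one blank line.
    """
    records = list(read_fasta(text))
    if len(records) % 2:
        raise ValueError("expected an even number of records (ref/query pairs)")
    blocks = []
    for (_ref_name, ref_seq), (qry_name, qry_seq) in zip(records[0::2], records[1::2]):
        blocks.append(f"{qry_name}\n{ref_seq}\n{qry_seq}\n")
    return "\n".join(blocks)


# Example input with a blank line and one wrapped sequence.
example = ">ref\nAAAAAAA\n>seq1\nBBBBBBB\n\n>ref\nAAAA\nAAA\n>seq2\nCCCCCCC\n"
print(fasta_pairs_to_axt(example))
```

The parser keeps whole header lines as names, so headers containing spaces would need trimming before being passed to KaKs Calculator.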

ADD REPLY • link 10.1 years ago by Cytosine ▴ 460

Thanks @Cytosine!!! Perfect! :)

ADD REPLY • link 10.1 years ago by tlorin ▴ 370

score 4 · Accepted Answer · 2015-04-23

10.1 years ago

Pierre Lindenbaum 166k

cat  file.fa | paste - - - - | awk '{print  substr($3,2)"\n"$2"\n"$4}'
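Here `paste - - - -` joins every four input lines into one tab-separated row, so `$2` is the reference sequence, `$3` the query header and `$4` the query sequence. Since the question mentions many alignment files, the same four-lines-at-a-time grouping could be applied to a whole directory along these lines. This is a rough sketch under stated assumptions: the names `four_lines_to_axt` and `convert_all`, the `*.fa` pattern and the `.axt` output suffix are all hypothetical, each sequence is assumed to sit on a single line, and a blank line is written after every record to match the desired output shown in the question (the one-liner above does not print one).

```python
from pathlib import Path


def four_lines_to_axt(lines):
    """Group non-blank lines in fours: ref header, ref seq, query header, query seq."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    if len(lines) % 4:
        raise ValueError("expected a multiple of 4 non-blank lines")
    out = []
    for i in range(0, len(lines), 4):
        _ref_header, ref_seq, qry_header, qry_seq = lines[i:i + 4]
        # Drop the leading '>' from the query header, then end the record
        # with a blank line (assumption: the record separator wanted here).
        out.append(f"{qry_header[1:]}\n{ref_seq}\n{qry_seq}\n\n")
    return "".join(out)


def convert_all(in_dir, out_dir, pattern="*.fa"):
    """Convert every file matching `pattern` in in_dir; return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for fasta in sorted(Path(in_dir).glob(pattern)):
        target = out_dir / (fasta.stem + ".axt")
        target.write_text(four_lines_to_axt(fasta.read_text().splitlines()))
        written.append(target)
    return written
```

A call such as `convert_all("alignments", "axt")` (both directory names are placeholders) would then write one `.axt` file per input file.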

ADD COMMENT • link 10.1 years ago by Pierre Lindenbaum 166k