Fasta and csv file reformatting
3.8 years ago
MSRS ▴ 590

Hi,

  1. My FASTA file looks like this:

>AAM01497_1 Glutamate-1-semialdehyde aminotransferase [Methanopyrus kandleri AV19]
MGYEDEFPESLELFKRAERVMPGGVSSPVRRFDPYPFYVERAEGSRLYTVDGHVLIDYCLAFGPLILGHAHPEVVEAVVERVR
>AAM04025_1 glutamate-1-semialdehyde 2,1-aminomutase [Methanosarcina acetivorans C2A]
MVSEVTLDKSRQMYEKAKTLIPGGVSSPVRAIKPYPFYTASADGSKIRDLDGNEYIDYCLAYGPAVLGHNHPVIKAAIKEQLD

I want the output to look like this:

>acc|AAM01497|Glutamate-1-semialdehyde aminotransferase [Methanopyrus kandleri AV19]
MGYEDEFPESLELFKRAERVMPGGVSSPVRRFDPYPFYVERAEGSRLYTVDGHVLIDYCLAFGPLILGHAHPEVVEAVVERVR
>acc|AAM04025|glutamate-1-semialdehyde 2,1-aminomutase [Methanosarcina acetivorans C2A]
MVSEVTLDKSRQMYEKAKTLIPGGVSSPVRAIKPYPFYTASADGSKIRDLDGNEYIDYCLAYGPAVLGHNHPVIKAAIKEQLD

Please suggest a sed or awk command.

  2. My CSV file looks like this:

    MK0280,GCA_000007185.1,AAM01497.1,430,1-430,430,COG0001,COG0001,0,570.0,1.0e-200,432,3-431
    MA_0581,GCA_000007345.1,AAM04025.1,424,1-424,424,COG0001,COG0001,0,574.0,1.0e-200,432,3-426

I want the output to look like this:

MK0280,GCA_000007185.1,AAM01497,430,1-430,430,COG0001,COG0001,0,570.0,1.0e-200,432,3-431
MA_0581,GCA_000007345.1,AAM04025,424,1-424,424,COG0001,COG0001,0,574.0,1.0e-200,432,3-426

This time, only the third column needs to change. I think sed/awk will work here. Thank you for your assistance.

sed awk fasta csv • 1.2k views

I think sed/awk will work here.

Right, go ahead and try something.


I was unable to figure it out and need expert assistance. Thank you.


There are many Python modules, such as Biopython, that you can use to modify this.


To modify the FASTA file:

  1. Add acc| after >

    perl -p -e 's/^>(.*)/>acc|$1/' in.fa > output.fasta

  2. Replace "_1 " with |

    sed '/^>/s/_1 /|/' output.fasta > out.fasta

  3. Replace "_2 " with |

    sed '/^>/s/_2 /|/' out.fasta > out2.fasta

Any better way?
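
For what it is worth, the three steps above could probably be collapsed into a single sed call, and the CSV from the original question could be handled with awk. This is only a minimal sketch under some assumptions: the accession has no whitespace and ends in an underscore plus digits followed by one space, and the CSV has no quoted fields containing commas. The function names reformat_fasta and strip_csv_version are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: reformat_fasta and strip_csv_version are hypothetical helper names.

# FASTA: on header lines only, turn ">ACC_N description" into ">acc|ACC|description".
# Assumes the accession has no whitespace and ends in "_<digits>" followed by one space,
# which covers both the "_1 " and "_2 " cases from the steps above.
reformat_fasta() {
    sed -E '/^>/ s/^>([^[:space:]]+)_[0-9]+ />acc|\1|/' "$@"
}

# CSV: remove a trailing ".<digits>" version suffix from the third column only.
# Assumes plain comma-separated fields (no quoted commas).
strip_csv_version() {
    awk 'BEGIN { FS = OFS = "," } { sub(/\.[0-9]+$/, "", $3); print }' "$@"
}
```

Both functions read the files given as arguments (or standard input), so something like `reformat_fasta in.fa > out.fasta` and `strip_csv_version in.csv > out.csv` should work, where in.csv is a placeholder file name. Sequence lines are left alone because the sed substitution is restricted to lines starting with >, and the second CSV column keeps its .1 because only field 3 is edited.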


Any better way?

You don't need any better way. Anything that gives you the output you want is good enough. You are asking a trivial question here, one that can be answered many different ways and using many different tools. Heck, you can load this into any editor or spreadsheet and do a simple search-and-replace. I don't think you should be expecting us to put together a highly optimized algorithm for an obscure problem that seems very easy to solve. It is one thing to come asking for help for a difficult problem that you have tried to solve yourself and failed. This feels like you are asking us to pick black olives out of your salad, which you can do fine on your own with some effort.


Thank you. As newcomers to this area, we are learning a lot from experienced people like you. Sorry.


I don't see how a winking emoji is appropriate in the forum or in this context.
