Edit fasta header for TSA submission
13 months ago
sumitra.20 • 0

Hi everyone,

I've been trying to edit the headers of my FASTA file, which I intend to upload to NCBI TSA. I can't seem to upload the file successfully, and if I'm not mistaken it could be because of the header format. The headers of my FASTA file look like this:

>TRINITY_DN1078649_c1_g1_i1 len=235 path=[0:0-234]

>TRINITY_DN1078643_c0_g1_i1 len=204 path=[0:0-203]

I would like to change them to the format below, as suggested by the guidelines:

>TRINITY_DN1078696_c0_g1_i1 [modifier=soil metatranscriptome] [moltype=mRNA] [tech=TSA]

I tried the command below, but it failed:

sed ‘s/len.*$/[modifier=soil metatranscriptome] [moltype=mRNA] [tech=TSA]/g' trinity_OUT.Trinity.fsa' > trinity_OUT2.Trinity.fsa

I'm new to coding and I can't seem to understand where I'm going wrong. Any help would be appreciated.

Thank you

NCBI fasta unix TSA RNA • 634 views

Have a look at SEDA (https://www.sing-group.org/seda/). Under the rename header operation (https://www.sing-group.org/seda/manual/operations.html#rename-header) you have several options to achieve what you want: first keep only the fields you need (using multipart header) and then add the suffix you want (using add prefix/suffix).


Something is off with the quotes: right after sed you have a curly opening quote (‘), but the expression is closed with a straight quote (') after /g. They should be the same character, a straight single quote. Moreover, you do not need the extra ' after trinity_OUT.Trinity.fsa. Copy-pasting the error message would also help in understanding the nature of the failure.
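With the quotes fixed, the command could look roughly like the sketch below. It is only an illustration: sample.fsa and sample_renamed.fsa are made-up file names standing in for trinity_OUT.Trinity.fsa and trinity_OUT2.Trinity.fsa, and the /^>/ address is an extra precaution I added so that only header lines are touched. Whether the resulting headers satisfy TSA is something to check against the submission guidelines.

```shell
# Build a tiny sample input so the sketch is self-contained.
# (Hypothetical file; in practice this would be trinity_OUT.Trinity.fsa.)
printf '%s\n' \
  '>TRINITY_DN1078649_c1_g1_i1 len=235 path=[0:0-234]' \
  'ACGTACGT' \
  '>TRINITY_DN1078643_c0_g1_i1 len=204 path=[0:0-203]' \
  'TTGACCAA' > sample.fsa

# Straight single quotes open and close the sed expression.
# /^>/ limits the substitution to header lines; everything from "len="
# to the end of the line is replaced with the new modifiers.
sed '/^>/ s/len=.*$/[modifier=soil metatranscriptome] [moltype=mRNA] [tech=TSA]/' \
  sample.fsa > sample_renamed.fsa

# Show the result.
cat sample_renamed.fsa
```

The sequence lines are left unchanged, and the /g flag from the original attempt is not needed because the pattern matches at most once per line.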
