Question

Change Fasta File Header

0

12.8 years ago

larriba.ed • 0

Hi,

I'm working on the annotation of a fungal genome. My problem is the following: I have a FASTA file containing all the genes, and the genes are named according to the prediction program I used, in the following format:

>snap_masked-scaffold983-abinit-gene-2.78-mRNA-1
MATPSPLMMLLGALFFFSANVFAAGAVLGVDLGTEYIKAALVKPGIPLEIVLTKDSRRKETSAVAFKPSKSGPTAGQFPERSYGADAMALAARFPGEVYPNLKTLLGLPIDDASVKEYAARHPALQLQAHSSRGTPAFKTKTLTAEEDALLVEELLAMELQSVQKNAEAAAGDGSSV
>snap_masked-scaffold889-abinit-gene-3.50-mRNA-1
MSSLFDSWFGFIFWGVAYFRMRTADKKIGRERNVIGDWFSMGLNVIIILTGFFFLTAGTYASVQGIIDSFNAGEVGGVFSCKSNGV
>snap_masked-scaffold889-abinit-gene-3.43-mRNA-1
MAVYRVPFSWVHFVNLTIQL

I also have a file with the correct names; the sequence names in the two FASTA files are different. My question is whether there is a script for renaming the sequences in the first file using the second file as the reference. The idea would be to change all the headers. Thank you very much.

fasta • 5.0k views

updated 12.8 years ago by Ketil 4.2k • written 12.8 years ago by larriba.ed • 0

5

I don't understand. Can you show us the content of the file with the correct names that you mention?

12.8 years ago by Pierre Lindenbaum 166k

0

Is the order of the genes in file one identical to the order of the names in file two?
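
If the order does match, the renaming reduces to pairing headers by position. A minimal Python sketch under that assumption; the function name `rename_by_position` and the replacement names `geneA`/`geneB` are hypothetical, and the first sequence is shortened from the question:

```python
def rename_by_position(fasta_lines, new_names):
    """Replace the i-th FASTA header with the i-th entry of new_names."""
    # Refuse to guess when the two files do not line up one-to-one.
    n_headers = sum(1 for line in fasta_lines if line.startswith(">"))
    if n_headers != len(new_names):
        raise ValueError(f"{n_headers} headers but {len(new_names)} names")

    names = iter(new_names)
    out = []
    for line in fasta_lines:
        if line.startswith(">"):
            out.append(">" + next(names))  # header line: swap in the new name
        else:
            out.append(line)               # sequence line: keep as-is
    return out


fasta = [
    ">snap_masked-scaffold889-abinit-gene-3.50-mRNA-1",
    "MSSLFDSWFGFIFWGVAYF",  # shortened for the example
    ">snap_masked-scaffold889-abinit-gene-3.43-mRNA-1",
    "MAVYRVPFSWVHFVNLTIQL",
]
renamed = rename_by_position(fasta, ["geneA", "geneB"])  # hypothetical names
print("\n".join(renamed))
```

The count check is there because a positional rename silently mislabels every following record if a single entry is missing from either file.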

12.8 years ago by Whetting ★ 1.6k

0

Usually, gene prediction programs can make several predictions per sequence (like your scaffold889, I guess) or no prediction at all. As a result, you won't have the same number of sequences in your original FASTA file and your predicted peptides file. How can you deal with that? A gene predictor should mention the name of the reference sequence (and I think that's the case here). So if you want to change names, you should do it before running the prediction.

12.8 years ago by Manu Prestat 4.1k

score 2 · Answer 1 · 2012-09-28

2

12.8 years ago

Ketil 4.2k

I use sed for this. If you can specify how you want to convert the names, I can be more specific. It's just regular expressions, which you should learn anyway.
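
Since the target naming scheme was never specified, here is only a rough illustration of the regular-expression idea, written in Python rather than sed. The output format (`scaffold983_g2.78_t1`) and the names `HEADER` and `rename_header` are assumptions made up for this sketch, not something from the question:

```python
import re

# Assumed input pattern, taken from the headers shown in the question:
#   >snap_masked-scaffold983-abinit-gene-2.78-mRNA-1
# Hypothetical output pattern:
#   >scaffold983_g2.78_t1
HEADER = re.compile(
    r"^>snap_masked-(scaffold\d+)-abinit-gene-(\d+\.\d+)-mRNA-(\d+)$"
)


def rename_header(line):
    """Rewrite a matching header line; return any other line unchanged."""
    return HEADER.sub(r">\1_g\2_t\3", line)


lines = [
    ">snap_masked-scaffold983-abinit-gene-2.78-mRNA-1",
    "MATPSPLMMLLGALFFF",  # shortened for the example
    ">snap_masked-scaffold889-abinit-gene-3.43-mRNA-1",
    "MAVYRVPFSWVHFVNLTIQL",
]
converted = [rename_header(line) for line in lines]
print("\n".join(converted))
```

The pattern is anchored with `^>` and `$` so that only complete header lines are rewritten, and the mRNA number is kept so that two transcripts of the same gene do not end up with the same name.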

12.8 years ago by Ketil 4.2k