Question

Change the sequence names in a FASTA file

0

8.5 years ago

wu.zhiqiang.1020 ▴ 50

Dear all.

I have a large FASTA file with complicated sequence names, such as:

>scaffold:ChrPicBel3.0.1:JH584390.1:1:2133925:1 scaffold JH584390.1
GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT
TTTTTATTCTTTAATAATATGACAAACTAGACCTTTAAAATCATCTCTCCTTCCTAAATC

I just want to keep the last field of the header as the name, like this:

>JH584390.1
GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT
TTTTTATTCTTTAATAATATGACAAACTAGACCTTTAAAATCATCTCTCCTTCCTAAATC

Please give me some suggestions. Thanks.

ZQ

genome • 3.6k views

ADD COMMENT • link updated 8.5 years ago by Farbod ★ 3.4k • written 8.5 years ago by wu.zhiqiang.1020 ▴ 50

0

You might want to look into using Biopython. See these pages:
http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial002.html
http://biopython.org/wiki/SeqIO
http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc11
http://biopython.org/DIST/docs/api/Bio.SeqIO-module.html

It just so happens that I've posted some code examples using this package for FASTA parsing here and here, which might help you get started. You might want to modify the record.id value with .split() (example) or a regular expression of some sort (docs here and here) and write the output to a new FASTA file.
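A minimal sketch of the .split() and regular-expression ideas, written with only the Python standard library instead of Biopython so it runs without extra installs. The function names and the in-memory example are illustrative assumptions, not code from the linked posts.

```python
import re
from typing import Iterable, Iterator


def shorten_header(line: str) -> str:
    """Reduce a FASTA header line to '>' plus its last whitespace-separated field."""
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    fields = line[1:].split()
    # A bare ">" has no fields; return it as-is instead of raising IndexError.
    return ">" + fields[-1] if fields else line


def shorten_header_regex(line: str) -> str:
    """Same result via a regular expression capturing the last run of non-space characters."""
    match = re.match(r">.*?(\S+)\s*$", line)
    return ">" + match.group(1) if match else line


def rename_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines with every header shortened; accepts any iterable, including an open file."""
    for line in lines:
        yield shorten_header(line.rstrip("\n"))


# Example taken from the question above.
example = [
    ">scaffold:ChrPicBel3.0.1:JH584390.1:1:2133925:1 scaffold JH584390.1",
    "GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT",
]
for out_line in rename_records(example):
    print(out_line)
```

For a real file, the same generator could be fed an open input handle and each yielded line written to an output handle followed by a newline. Because it streams line by line, memory use should stay small even for a large genome.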

ADD REPLY • link 8.5 years ago by steve ★ 3.5k

score 2 · Answer 1 · 2016-10-14

2

8.5 years ago

GouthamAtla 12k

This has probably been asked more than 20 times on Biostars. You could check the previous posts.

ADD COMMENT • link 8.5 years ago by GouthamAtla 12k

0

Thanks! I will do a deeper search.

ADD REPLY • link 8.5 years ago by wu.zhiqiang.1020 ▴ 50

1

Anyways:

awk '{ if ( $0 ~ /^>/ ) { print ">"$NF } else { print } }'  in.fasta > out.fasta

ADD REPLY • link 8.5 years ago by GouthamAtla 12k

0

Thanks for this simple and quick command.

ADD REPLY • link 8.5 years ago by wu.zhiqiang.1020 ▴ 50

0

Hi dear Goutham Atla,

Very nice script. You are a bash scripting expert!

Thank you

ADD REPLY • link 8.5 years ago by Farbod ★ 3.4k

0

Dear Goutham Atla, Hi.

Is there a bioawk script for this purpose? (I have searched a little but did not find anything.)

ADD REPLY • link 8.5 years ago by Farbod ★ 3.4k

score 0 · Answer 2 · 2016-10-14

0

8.5 years ago

Farbod ★ 3.4k

Dear wu.zhiqiang.1020, Hi.

Maybe this code can help: awk -F" " '{print $NF}' big-header.fasta > short-header.fasta

Be sure to check the results.

~ Best

ADD COMMENT • link 8.5 years ago by Farbod ★ 3.4k

0

This does not help. You will lose the > at the start of each header, so the output is no longer valid FASTA.
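To make the problem concrete, here is a small Python sketch that approximates what `print $NF` does to each line (str.split() stands in for awk's default whitespace splitting) and counts the header lines that survive. The helper names are hypothetical and only for illustration. The repaired version strips the > before splitting, so a header made of a single token would not end up as `>>name`.

```python
def last_field(line: str) -> str:
    """Approximate awk's `print $NF`: the last whitespace-separated field, or '' for a blank line."""
    fields = line.split()
    return fields[-1] if fields else ""


def count_headers(lines) -> int:
    """Count lines that a FASTA parser would treat as record headers."""
    return sum(1 for line in lines if line.startswith(">"))


original = [
    ">scaffold:ChrPicBel3.0.1:JH584390.1:1:2133925:1 scaffold JH584390.1",
    "GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT",
]

# Taking the last field of every line drops ">" from multi-field headers.
broken = [last_field(line) for line in original]

# Re-attaching ">" on header lines only keeps the records recognizable.
fixed = [
    ">" + last_field(line[1:]) if line.startswith(">") else line
    for line in original
]

print(count_headers(original), count_headers(broken), count_headers(fixed))  # prints: 1 0 1
```

Comparing the header count before and after renaming is a cheap sanity check on any real output file: if the counts differ, the header lines were damaged.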

ADD REPLY • link 8.5 years ago by GouthamAtla 12k

0

Thanks for the tips. We can adapt the command based on this.

ADD REPLY • link 8.5 years ago by wu.zhiqiang.1020 ▴ 50