Change the sequence names in a FASTA file
8.1 years ago

Dear all.

I have a big FASTA file with complicated sequence names, such as:

>scaffold:ChrPicBel3.0.1:JH584390.1:1:2133925:1 scaffold JH584390.1
GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT
TTTTTATTCTTTAATAATATGACAAACTAGACCTTTAAAATCATCTCTCCTTCCTAAATC

I just want to keep the last field of the header as the name, like this:

>JH584390.1
GAAATGCTCTTTTCTTCATTTAACCTTATATTTAATACACCTTTTAAATGTTTCTCAATT
TTTTTATTCTTTAATAATATGACAAACTAGACCTTTAAAATCATCTCTCCTTCCTAAATC

Please give me some suggestions. Thanks.

ZQ

genome • 3.4k views

You might want to look into using Biopython. See these pages:
http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial002.html
http://biopython.org/wiki/SeqIO
http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc11
http://biopython.org/DIST/docs/api/Bio.SeqIO-module.html

It just so happens that I've posted some code examples using this package for FASTA parsing here and here, which might help you get started. You could modify the record.id value with .split() (example) or with a regular expression (docs here and here), then write the output to a new FASTA file.
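To make the .split() idea concrete, here is a minimal sketch in plain Python (standard library only, so it runs without Biopython). The function names `shorten_header` and `rename_fasta` are made up for illustration, and the sketch assumes the wanted name is always the last whitespace-separated field of the header, as in the example above. With Biopython's SeqIO, the same split would be applied to record.description (the full header line), or record.id could be split on ":" instead.

```python
import io


def shorten_header(line):
    """Reduce a FASTA header to '>' plus its last whitespace-separated field.

    Sequence lines are returned unchanged (minus the trailing newline).
    """
    if line.startswith(">"):
        fields = line[1:].split()
        # An empty header ('>' alone) is left as-is instead of raising.
        return ">" + fields[-1] if fields else line.rstrip("\n")
    return line.rstrip("\n")


def rename_fasta(in_handle, out_handle):
    """Copy a FASTA stream line by line, shortening every header."""
    for line in in_handle:
        out_handle.write(shorten_header(line) + "\n")


# Small in-memory demonstration; for real files, pass open() handles instead.
example = io.StringIO(
    ">scaffold:ChrPicBel3.0.1:JH584390.1:1:2133925:1 scaffold JH584390.1\n"
    "GAAATGCTCTTTTCTTCATT\n"
)
result = io.StringIO()
rename_fasta(example, result)
print(result.getvalue(), end="")
```

Because it reads one line at a time, this should cope with a big file without loading it into memory. Do check that the shortened names are still unique before using the output.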

8.1 years ago

This has been asked maybe more than 20 times on Biostars. You could check the previous posts.

Thanks! I will do a deeper search.

Anyways:

awk '{ if ( $0 ~ /^>/ ) { print ">"$NF } else { print } }'  in.fasta > out.fasta

Thanks for this simple and quick command.

Dear Goutham Atla,

Very nice script. You are a bash scripting expert!

Thank you

Dear Goutham Atla,

Is there a bioawk script for this purpose? (I have searched a little but did not find anything.)

8.1 years ago
Farbod ★ 3.4k

Dear wu.zhiqiang.1020, Hi.

Maybe this code can help: awk -F" " '{print $NF}' big-header.fasta > short-header.fasta

Be sure to check the results.

~ Best

This does not help. It prints only the last field of every line, so the leading > is dropped from each header.

Thanks for the tip; we can adapt the command based on this.
