Extract Multiple Genes From Multiple Fasta Files
11.6 years ago
hosseinv ▴ 20

Hi everyone,

I was wondering if somebody could tell me how to extract a number of genes from a number of FASTA files (all containing the same set of genes) and concatenate them for each FASTA file?

I have an Excel sheet (or a text file) listing the start and end nucleotide positions of every gene.

I would like to do it using awk and grep in Unix.

Any help is appreciated.

Cheers,
Hossein

fasta • 3.0k views

If I understand correctly, I think generating a BED file from your text file and then running bedtools getfasta in a for loop over the FASTA files may be worth a shot.
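For the first step, here is a rough Python sketch of turning a coordinate table into BED lines. It assumes the table holds (gene, start, end) rows with 1-based, inclusive positions, and that each FASTA file contains one sequence whose header ID is known. The function name `coords_to_bed`, the sample ID `sample1`, and the toy coordinates are all made up for illustration. BED uses a 0-based start and an exclusive end, so only the start is shifted by one.

```python
def coords_to_bed(rows, seq_id):
    """Convert (gene, start, end) rows with 1-based inclusive coordinates
    into BED lines (0-based start, end-exclusive)."""
    bed_lines = []
    for gene, start, end in rows:
        # Only the start shifts: BED end is exclusive, so it equals the 1-based end.
        bed_lines.append(f"{seq_id}\t{int(start) - 1}\t{int(end)}\t{gene}")
    return bed_lines


# Hypothetical coordinates, as they might be exported from the spreadsheet.
rows = [("geneA", 1, 4), ("geneB", 7, 10)]
bed = coords_to_bed(rows, "sample1")
print("\n".join(bed))
```

The printed lines could be saved to a file and passed to bedtools getfasta through its -fi (input FASTA) and -bed options. The first BED column has to match the sequence ID in each FASTA header, so the BED file may need regenerating per FASTA file if the IDs differ.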

11.6 years ago
Asaf 10k

Hi Hossein, I think your best shot is Galaxy: you can do all sorts of FASTA file manipulations without programming or scripting. Good luck.


Thanks Asaf for your reply. The thing is I am currently working on a small dataset and I want to do this as an example for my next bigger dataset. So I don't really want to go through uploading big data files to Galaxy.


So Perl or Python can give you the solution. In my opinion, if you haven't mastered awk, it's easier to learn Perl or Python and write these small scripts than to do it in awk.
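To give an idea of how small such a script can be, here is a minimal Python sketch, not a tested pipeline. It assumes each FASTA file holds one sequence per record and that the gene positions are 1-based and inclusive. The function names `read_fasta` and `concat_genes` and the toy data are invented for this example.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict of {sequence_id: sequence}."""
    records = {}
    seq_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence ID.
            seq_id = line[1:].split()[0]
            records[seq_id] = []
        elif seq_id is not None:
            records[seq_id].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def concat_genes(sequence, coords):
    """Cut each (start, end) region (1-based, inclusive) and join them in order."""
    return "".join(sequence[start - 1:end] for start, end in coords)


# Hypothetical toy data standing in for one FASTA file and the coordinate table.
fasta_lines = [">sample1 toy record", "ACGTAC", "GTTTGA"]
coords = [(1, 4), (7, 10)]

records = read_fasta(fasta_lines)
for seq_id, seq in records.items():
    print(f">{seq_id}_concatenated")
    print(concat_genes(seq, coords))
```

For real data you would open each FASTA file in a loop and pass the file handle to `read_fasta` instead of the toy list, reusing the same coordinate list for every file.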


Glad to hear that. I'll have to start learning Perl or Python. Could you recommend a webpage or textbook for learning Perl, please? I appreciate your help. Regards, Hossein
