Extract sequences from a BLAST database based on the protein name
16 months ago
Giffredo ▴ 10

Hi,

I would like to create a new sub-database from the nr BLAST DB containing all the sequences related to biogenic amines. So, I need a script to extract sequences from the nr BLAST database based on partial protein names.

Example: for the search term "phosph", I would like to get FASTA output like this:

    >VFG037176(gb|YP_001844723) (plc) phospholipase C [Phospholipase C (VF0470)] [Acinetobacter baumannii ACICU]
    MNRREFLLNSTKTMFGTAALASFPLSIQKALAIDAKVESGTIQDVKHIV...
    >VFG037177(gb|YP_001846906) (plc) phospholipase C [Phospholipase C (VF0470)] [Acinetobacter baumannii ACICU]
    MITRRKFLNYSLNMGFGAAALAAFPSSIQKALAIPANNKTGTIQDVEHV...
    >VFG037203(gb|YP_001847849) (plcD) phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthase [Phospholipase D (VF0469)] [Acinetobacter baumannii ACICU]
    MAQSFHSKQLQTHQLANGFLIKASIVVCSSFAVALTGCSTLPKHSPEPI...
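
To make the request concrete, here is a minimal sketch of the kind of filter I have in mind. It assumes the database is first dumped to FASTA with `blastdbcmd -db nr -entry all -outfmt %f` and piped in on standard input. The script name `filter_fasta.py` and the function names `read_fasta` and `filter_by_name` are placeholders of my own, and the match is a plain case-insensitive substring test on the whole header line, so it may not be the best approach for something as large as nr.

```python
import sys
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_name(
    records: Iterable[Tuple[str, str]], keyword: str
) -> Iterator[Tuple[str, str]]:
    """Keep records whose header contains keyword (case-insensitive)."""
    needle = keyword.lower()
    for header, seq in records:
        if needle in header.lower():
            yield header, seq


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage (hypothetical script name):
    #   blastdbcmd -db nr -entry all -outfmt %f | python filter_fasta.py phosph > phosph.fasta
    for header, seq in filter_by_name(read_fasta(sys.stdin), sys.argv[1]):
        print(f">{header}\n{seq}")
```

The filtered FASTA could then be turned into the new sub-database with `makeblastdb -in phosph.fasta -dbtype prot`. Streaming the records one at a time keeps memory use low, but the full dump of nr still has to be read once.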
BLAST • 331 views