Hi all,
My question may sound simple. I'm trying to download the plant RefSeq proteins from NCBI to make a BLAST database and run blastx on contigs from a de novo assembly of a non-model plant. As there are several taxonomy IDs for plants, such as flowering plants (3398) and green plants (33090), please let me know how I can get all plant RefSeq protein sequences so that the database is as rich as possible. Please don't refer me to ftp://ftp.ncbi.nlm.nih.gov/refseq/release/plant/ as it contains mixed RefSeq sequences, not just protein RefSeq. Thanks in advance.
Thanks a lot, friend. Is there a similar command to get the plant protein sequences from UniProt?
The following query will retrieve all sequences with keyword "Complete Proteome" from the taxonomy group "Viridiplantae".
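The exact query is not reproduced in this thread, so here is only a rough sketch of how such a request could be put together in Python. It assumes the current UniProt REST streaming endpoint and the `taxonomy_id` and `keyword` query fields, none of which come from the original post. The helper name `build_uniprot_fasta_url` and the keyword ID `KW-1185` are hypothetical illustrations; check the keyword you actually want on the UniProt site, since the "Complete Proteome" keyword mentioned above may be named differently there.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: UniProt REST streaming endpoint (not taken from the thread).
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"


def build_uniprot_fasta_url(taxon_id, keyword=None):
    """Build a FASTA download URL for UniProtKB entries under a taxon.

    taxon_id: NCBI taxonomy ID, e.g. 33090 for green plants (Viridiplantae).
    keyword:  optional UniProt keyword ID used to narrow the result set.
    """
    query = f"(taxonomy_id:{taxon_id})"
    if keyword:
        query += f" AND (keyword:{keyword})"
    return UNIPROT_STREAM + "?" + urlencode({"query": query, "format": "fasta"})


# Hypothetical example: green plants, restricted by an assumed keyword ID.
url = build_uniprot_fasta_url(33090, keyword="KW-1185")
print(url)
```

The printed URL can then be fetched with any download tool and the resulting FASTA file passed to makeblastdb. Leaving out the keyword argument gives every UniProtKB entry under the taxon, which is a much larger download.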