I am trying to use blastx on Linux to compare my Trinity.fasta file against the Swiss-Prot (sp) and UniRef90 databases. To do so, I need to download Swiss-Prot and UniRef90 as FASTA files from http://www.uniprot.org/downloads
After downloading, I got a file named uniref90.fasta.gz.part, and I cannot extract it. How can I extract this file?
If you have any guidance on functional annotation starting from a Trinity FASTA file, please let me know.
You cannot extract it because it has not been downloaded completely. The file name itself (uniref90.fasta.gz.part) shows that it is only part of the file, not the complete archive.
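One way to confirm that an archive is complete before trying to extract it is to test it with gzip -t, which reads the whole file and fails on a truncated download. The sketch below assumes a POSIX shell with gzip installed. The helper name check_gz is made up for illustration, and the wget line in the comments is only a reminder that wget -c can resume an interrupted transfer (the placeholder must be replaced with the actual link from the UniProt downloads page).

```shell
# check_gz FILE: succeed only if FILE is a complete, valid gzip archive.
# (check_gz is a hypothetical helper name, not part of any tool.)
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "$1: complete, safe to extract with 'gunzip $1'"
    else
        echo "$1: truncated or corrupt, download it again" >&2
        return 1
    fi
}

# To resume an interrupted download instead of starting over:
#   wget -c <URL of uniref90.fasta.gz>
# Then verify before extracting:
#   check_gz uniref90.fasta.gz && gunzip uniref90.fasta.gz
```

If the check fails, downloading the file again (or resuming it) is the only fix; a partial .gz file cannot be repaired locally.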
Once you have downloaded the complete file and extracted it, you first have to format the database with makeblastdb:
makeblastdb -in uniref90.fasta -dbtype 'prot'
Then you can run blastx, supplying at least the query file (-query) and the formatted database (-db).
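As an illustration, a minimal blastx call might look like the sketch below. The input file names are taken from this thread; the wrapper function run_blastx and the output file name are assumptions made for the example. -query, -db, -evalue, -outfmt, -num_threads and -out are standard BLAST+ options, but the values shown are only placeholders to adapt.

```shell
# run_blastx QUERY DB OUT: translate the nucleotide QUERY and search it
# against the protein database DB (already formatted with makeblastdb),
# writing tabular results to OUT.
# (run_blastx is a hypothetical wrapper; adjust the option values as needed.)
run_blastx() {
    blastx -query "$1" \
           -db "$2" \
           -evalue 1e-5 \
           -outfmt 6 \
           -num_threads 4 \
           -out "$3"
}

# Example:
#   run_blastx Trinity.fasta uniref90.fasta Trinity_vs_uniref90.blastx.tsv
```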
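Dear toralmanvar,
May I ask another question, this time about uniprot_sprot.fasta (sp)? I downloaded this FASTA file from the same website as above, but when I run the following commands:
QUERY=Trinity01052018_Echota-UP_fasta_iso_ID.fasta
DB=/run/media/hscience/DATA_CentOS/DATABASE/db/uniPROT/uniprot_sprot.fasta
FORMAT="6 qseqid sseqid evalue stitle"
EVALUE=1.0e-5
QUERY_CODE=1
MAX_TARGET_SEQ=1
NCPU=4
Home_blastx=/usr/local/bin/blastx
OUTF=$(basename $QUERY)_$(basename $DB)_blastx_fmt6.txt
blastx -query $QUERY \
    -db $DB \
    -evalue $EVALUE \
    -query_gencode $QUERY_CODE \
    -max_target_seqs $MAX_TARGET_SEQ \
    -num_threads $NCPU \
    -outfmt "$FORMAT" \
    -out $OUTF
I get this error:
BLAST Database error: No alias or index file found for protein database [/run/media/hscience/DATA_CentOS/DATABASE/db/uniPROT/uniprot_sprot.fasta] in search path [/run/media/hscience/DATA_CentOS/Bluberry/RNAseqblueberry/4_RSEM_edgeR/edgeR.genes.dir/P1e-10_C8/PickIsoform_Echota_UP/Swissprot::]
Do you think the problem is with uniprot_sprot.fasta, or is something else wrong?
Thank you
Kan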
Have you formatted your database as I described in my previous answer?
You are getting this error because BLAST cannot find a formatted database. Please format the database with the makeblastdb program:
makeblastdb -in uniref90.fasta -dbtype 'prot'
This generates three files: uniref90.fasta.phr, uniref90.fasta.pin and uniref90.fasta.psq.
Once they have been generated, you can use the formatted database with BLAST.
Remember to pass the database name that you used when formatting. In the example above it is uniref90.fasta (i.e. the name before the .phr, .pin and .psq extensions).
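A quick check along these lines can catch a missing index before a long run. It is a sketch that assumes a protein database, for which makeblastdb writes a .pin index file (or a .pal alias file when a large database is split into volumes). The function name db_is_formatted is made up for the example, and DB is set to the Swiss-Prot file name from the question.

```shell
# db_is_formatted DB: succeed if a protein BLAST index (.pin) or alias (.pal)
# file exists for the database name DB, i.e. the value to pass to -db.
# (db_is_formatted is a hypothetical helper name.)
db_is_formatted() {
    [ -f "$1.pin" ] || [ -f "$1.pal" ]
}

DB=uniprot_sprot.fasta
if db_is_formatted "$DB"; then
    echo "$DB is formatted; use: -db $DB"
else
    echo "$DB is not formatted; run: makeblastdb -in $DB -dbtype prot"
fi
```

The same makeblastdb step applies to every FASTA file used as a database, so uniprot_sprot.fasta needs its own formatting run even if uniref90.fasta has already been formatted.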
Thank you so much for your information