4.0 years ago
doinelpierrot
Hello all,
I would like to run blastx on a FASTA file in order to perform a quick functional annotation against Swiss-Prot.
So I successfully ran:
blastx -query "${QUERY}" -db "${BANK}" -out ${OUT_FILE} -max_target_seqs 1 ${BLAST_PARAM} -num_threads $NCPUS -outfmt "6 qseqid sseqid sacc pident length evalue bitscore"
However, the first line of the output looks like this:
TRINITY_DN80826_c0_g1_i1 PIAS1_HUMAN PIAS1_HUMAN 58.879 107 2.52e-36 132
First, I don't understand why there is no hit accession in column 3: it is supposed to be given by the sacc option, and the web version of BLAST does show one!
Second, when I run the same search on the web version of BLAST, the description column contains:
RecName: Full=E3 SUMO-protein ligase PIAS1; AltName: Full=DEAD/H box-binding protein 1; AltName: Full=E3 SUMO-protein transferase PIAS1; AltName: Full=Gu-binding protein; Short=GBP; AltName: Full=Protein inhibitor of activated STAT protein 1; AltName: Full=RNA helicase II-binding protein [Homo sapiens]
How can I get this information on the command line, rather than only a gene name as I do now?
Thank you for your help!
Take a look at the options for
-outfmt 6
and adjust the fields you are requesting. Full list of options is here.
Things like
PIAS1_HUMAN
are the Swiss-Prot accession IDs, so you actually got what you asked for. For more informative output, do what @genomax said and change the output you request from the BLAST run (e.g. use sallacc or the default output).
I have tried sallacc and it also gives me the Swiss-Prot accession. Is there any way to also get the NCBI accession (for instance O75925 for PIAS1_HUMAN)?
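To make the suggestion about changing the requested fields concrete, here is a minimal sketch that adds the stitle specifier (subject title) to the tabular format. Whether stitle returns the full "RecName: ..." description depends on what the database deflines contain, so treat this as something to try rather than a guaranteed fix. The wrapper function name is hypothetical; the variables are the ones from the original command.

```shell
# Tabular fields; stitle adds the subject's title/description as an extra column.
OUTFMT="6 qseqid sseqid sacc stitle pident length evalue bitscore"

# Hypothetical wrapper around the original command, with the extra field.
# QUERY, BANK, OUT_FILE, BLAST_PARAM and NCPUS are assumed to be set as before.
run_blastx_with_titles() {
    # BLAST_PARAM is left unquoted on purpose so it can expand to several options.
    blastx -query "${QUERY}" -db "${BANK}" -out "${OUT_FILE}" \
        -max_target_seqs 1 ${BLAST_PARAM} -num_threads "${NCPUS}" \
        -outfmt "${OUTFMT}"
}
```

Columns in format 6 are tab-separated, so spaces inside the title do not break downstream parsing as long as you split on tabs.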
Is that a DB you created yourself locally?
If so (and seeing that adding sallacc only returns the Swiss-Prot ID), it may be that you did not format it correctly.
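If the database was indeed built locally, one thing to try is rebuilding it with sequence-ID parsing enabled. This is only a sketch under assumptions: the input file name uniprot_sprot.fasta and the output name swissprot_parsed are placeholders, and it assumes the FASTA headers still carry the accession in the usual sp|ACCESSION|ENTRY_NAME form. If the headers were rewritten before formatting, this may not change anything.

```shell
# Hypothetical helper: rebuild a protein BLAST database with -parse_seqids,
# so that identifiers in the FASTA deflines are parsed and indexed.
# $1 = input FASTA (assumed name, e.g. uniprot_sprot.fasta)
# $2 = output database name (assumed name, e.g. swissprot_parsed)
rebuild_db_with_seqids() {
    makeblastdb -in "$1" -dbtype prot -parse_seqids -out "$2"
}
```

After rebuilding, point -db at the new database name and re-run the search to see whether the sacc column changes.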
Does your db actually include such info? Maybe there's a map file you can later join with the blast output?
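As an illustration of the map-file idea, the sketch below appends an accession column to the tabular BLAST output by looking up column 2 (sseqid) in a two-column, tab-separated mapping file. The mapping file and its layout (entry name, then accession) are assumptions, not something stated in the thread; if your sseqid values look different, the lookup key would need adjusting.

```shell
# Hypothetical helper: append the accession from a map file to each BLAST hit.
# $1 = map file (tab-separated: entry name, accession), e.g. PIAS1_HUMAN<TAB>O75925
# $2 = BLAST tabular output (-outfmt 6) with sseqid in column 2
add_accession() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR { acc[$1] = $2; next }                 # first file: load the map
        { print $0, (($2 in acc) ? acc[$2] : "NA") }     # second file: append lookup
    ' "$1" "$2"
}
```

It could be called as add_accession id_map.tsv "${OUT_FILE}" > annotated.tsv, where id_map.tsv and annotated.tsv are placeholder names. Hits without an entry in the map get NA in the new last column.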