Hello all, I cannot figure out what makes a "correct" GI list!
At the moment I have a list that I produced using the query below, plus an awk line to retrieve just the second column values:
blastn -word_size 11 -reward 2 -penalty -3 -gapopen 5 -gapextend 2 -query hiv_reference.fa -db test -outfmt 6 > results.txt
awk '{print $2}' results.txt > giResults.txt
G3ME69S02DME82 G3ME69S02EPWTE G3ME69S02DE0PT G3ME69S02C1ABI G3ME69S02EQ3LK G3ME69S02ELVHG G3ME69S02CY21D G3ME69S02D9A53
That is what I get from running those two lines, and from what I understand those are GI values. So why can I not run this?
blastdb_aliastool -gilist giResults.txt -dbtype nucl -db test -out test_subset
I get an error saying that it is not a valid GI list.
Any ideas?
Any input is much appreciated
Maybe this topic will help you (especially my comments ;) ).
http://biostar.stackexchange.com/questions/1196/extracting-sequence-from-a-3gb-fasta-file
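
One more thing that may be worth checking: as far as I know, GI numbers are purely numeric NCBI identifiers, and a text GI list is expected to hold one of them per line. The values in the question (G3ME69S02DME82 and so on) look like sequence or read IDs rather than GIs, which would explain the error. Below is a minimal sketch of a sanity check you could run on the file first. The name is_gi_list is a made-up helper for illustration, not part of BLAST+.

```shell
# Hypothetical helper (not a BLAST+ tool): succeeds only if the file is
# readable and every line consists solely of digits, i.e. looks like a GI.
is_gi_list() {
    # grep -v selects lines that are NOT all digits; -q keeps it silent.
    # The leading ! inverts the status, so "no offending lines" means success.
    [ -r "$1" ] && ! grep -qvE '^[0-9]+$' "$1"
}
```

You could then run something like `is_gi_list giResults.txt && echo "numeric" || echo "not a GI list"`. Note that this only checks the format; it does not tell you whether the numbers actually exist in your database.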