3.8 years ago
mthm
I have ncbi-blast-2.11.0+ installed on Ubuntu. I have an input TE FASTA file and a manually curated TE database in FASTA format, and I want to BLAST my input data against the manual database with the 80:80 rule (80% match on 80% of the length of the sequence). How should I do that?
Based on what I understood from the documentation, I tried:
In my case I have merged these two FASTA files plus the RepBase curated TE library to build one database. However, the latter consists of TE sequences with their headers, while the first two are totally different! I don't know if it could work like this, or whether I have to run two types of BLAST searches on the two types of databases.
How did you merge those two files? Did you open them in Windows/DOS? If so, try running dos2unix on the merged file before making the BLAST DB.
I am using the command line in Ubuntu. I simply used cat, but that is probably not the correct way, given that the two databases differ in content.
cat should be OK for merging the two FASTA files. Can you check that both files are in the correct FASTA format (header lines starting with >) and do not have weird characters in them? Or post a small extract of those two files so we can have a look?
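If it helps, the check suggested above can be scripted. This is only a minimal sketch, not an official validator: the function name check_fasta and the set of allowed sequence characters (IUPAC nucleotide codes plus gap and stop symbols) are assumptions made for illustration. It flags text that appears before the first > header, Windows line endings, non-ASCII bytes, and unexpected characters in sequence lines.

```python
# Characters accepted in nucleotide sequence lines.
# Assumption for this sketch: IUPAC nucleotide codes, gap and stop symbols.
ALLOWED = set("ACGTUNRYKMSWBDHVacgtunrykmswbdhv-*")


def check_fasta(path):
    """Return a list of (line_number, problem) tuples; an empty list means nothing was flagged."""
    problems = []
    seen_header = False
    # Read as bytes so that Windows line endings and non-ASCII bytes stay visible.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            if raw.endswith(b"\r\n"):
                problems.append((number, "Windows line ending (run dos2unix)"))
            try:
                line = raw.decode("ascii").strip()
            except UnicodeDecodeError:
                problems.append((number, "non-ASCII character"))
                continue
            if not line:
                continue  # blank lines are ignored
            if line.startswith(">"):
                seen_header = True
            elif not seen_header:
                problems.append((number, "sequence or text before first '>' header"))
            elif not set(line) <= ALLOWED:
                problems.append((number, "unexpected character in sequence line"))
    if not seen_header:
        problems.append((0, "no '>' header line found"))
    return problems
```

Calling check_fasta on each file before merging them with cat should show quickly whether one of them is not really FASTA, for example when a file starts with a script instead of a header line.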
They are both nucleotide files, right?
No. In the first database, the beginning of the file is not sequences but a script to fetch some data from the links. Then, after a long script, the sequences look like this:
This, however, is the second dataset:
Well, that's clearly NOT FASTA file format.
I noticed that you can get two FASTA files from the link you posted. Formatting a BLAST DB works with FASTA file input (default options).
See to get the FASTA format of those files; I would suggest repeating the process with them.
Yeah, right. I don't know how I managed to download a different file than the actual FASTA!
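For the 80:80 rule from the original question, one option is to filter tabular BLAST output after the search. The sketch below rests on several assumptions: that blastn was run with -outfmt "6 qseqid sseqid pident length qlen", that "80% of the length" means 80% of the query length, and that the function names and the file and database names in the comment are hypothetical. blastn also has -perc_identity and -qcov_hsp_perc options that can apply similar thresholds at search time.

```python
# Column order assumed to come from a search such as (names are hypothetical):
#   blastn -query input_TEs.fa -db curated_TEs \
#          -outfmt "6 qseqid sseqid pident length qlen" -out hits.tsv


def passes_80_80(pident, aln_length, qlen, min_identity=80.0, min_coverage=80.0):
    """True if a hit has enough identity and covers enough of the query."""
    # Alignment length includes gaps, so this only approximates query coverage.
    coverage = 100.0 * aln_length / qlen
    return pident >= min_identity and coverage >= min_coverage


def filter_hits(lines, min_identity=80.0, min_coverage=80.0):
    """Keep the tabular hits that satisfy both thresholds."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid, pident, length, qlen = line.split("\t")[:5]
        hit = (qseqid, sseqid, float(pident), int(length), int(qlen))
        if passes_80_80(hit[2], hit[3], hit[4], min_identity, min_coverage):
            kept.append(hit)
    return kept
```

With an output file written as above, something like filter_hits(open("hits.tsv")) would return only the hits that meet both thresholds. If coverage should be measured against the subject instead, slen could be added to the output columns and used in place of qlen.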