Hello.
I want to find a specific gene inside SRA files, and I am wondering whether I am aligning those sequences correctly, because it's the first time I am "playing" with something like this.
So, here is what I've done till now.
I downloaded .sra files from NCBI and converted them into .fastq files with the fastq-dump program, using:
fastq-dump --split-spot myfile.sra
Then I converted those .fastq files into .fasta using:
awk 'NR % 4 == 1 || NR % 4 == 2' myfile.fastq | sed -e 's/@/>/' > myfile.fasta
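To show what that conversion does, here is a small sketch on a toy file. The file names toy.fastq and toy.fasta and the two reads are made up for illustration; they are not my real data. As far as I understand it, awk keeps the header and sequence line of each 4-line record, and sed turns the first @ on a line into >.

```shell
# Toy FASTQ with two made-up reads (4 lines per record).
# The quality line of the second read deliberately starts with '@'.
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n@III\n' > toy.fastq

# Keep line 1 (header) and line 2 (sequence) of every record,
# then replace the first '@' on each remaining line with '>'.
awk 'NR % 4 == 1 || NR % 4 == 2' toy.fastq | sed -e 's/@/>/' > toy.fasta

cat toy.fasta
# >read1
# ACGT
# >read2
# GGCC
```

The quality line that starts with @ is dropped by awk before sed runs, so it does not end up as a bogus header. This does assume every read takes exactly four lines in the .fastq file.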
Finally, I converted the .fasta file into a BLAST-readable database using:
makeblastdb -in myfile.fasta -dbtype nucl -out myNewdb
Now that the database is ready, I can run BLAST using:
blastn -query query.fa -db myNewdb -task blastn -dust no -outfmt 7 -num_alignments 2 -num_descriptions 2
So, is this approach correct, or should I try something different?
Thanks in advance.