Blast many read sequences
2
1
6.5 years ago
samuel ▴ 260

Hi, I have a FASTQ file that I think contains sequences from different organisms. Is there a way I can BLAST all of the sequences in the file to find out what these organisms are?

Thanks in advance.

alignment contamination blast reads • 4.2k views
1

I don't fully understand what you are trying to do because the question is not explained clearly, but here are the steps you could follow.

Convert the FASTQ file to FASTA (the 1~4 address syntax below requires GNU sed):

sed -n '1~4s/^@/>/p;2~4p' file.fq > file.fa

Run the BLAST command-line applications on the resulting FASTA file (see the BLAST manual).
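As a rough sketch of the two steps above: the conversion can also be done with POSIX awk, which avoids the GNU-only sed syntax, and the result passed to blastn. The function names fastq_to_fasta and blast_reads are made up for illustration, the database argument is assumed to be a nucleotide BLAST database you have already built or downloaded, and the thresholds are arbitrary starting points rather than recommendations. Like the sed one-liner, this assumes every FASTQ record is exactly four lines.

```shell
#!/bin/sh
# Convert FASTQ to FASTA with POSIX awk.
# Assumes exactly four lines per record (no wrapped sequences).
fastq_to_fasta() {
    # $1 = input FASTQ file; FASTA is written to stdout
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: swap leading @ for >
         NR % 4 == 2 { print }                     # sequence line as-is
        ' "$1"
}

# Hypothetical wrapper around blastn (BLAST+ must be installed).
blast_reads() {
    # $1 = query FASTA
    # $2 = BLAST database name or path (assumed to exist already)
    # $3 = output file (tabular format)
    blastn -query "$1" -db "$2" \
        -outfmt 6 -evalue 1e-5 -max_target_seqs 5 \
        -num_threads 4 -out "$3"
}

# Example usage (file and database names are placeholders):
#   fastq_to_fasta file.fq > file.fa
#   blast_reads file.fa /path/to/blastdb/nt hits.tsv
```

With -outfmt 6 each hit is one tab-separated line, so the subject IDs in the second column can be tallied afterwards with cut, sort and uniq -c.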

0

I have added/removed tags to keep the post relevant

3
6.5 years ago

BLASTing reads sounds like a bad idea given their short length and sheer number. Maybe you can randomly subsample a few thousand reads (seqkit?) and try it on those. However, I would suggest using fastq-screen to map the reads against the genomes of the organisms that you suspect are present in your raw data.
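A minimal sketch of the subsampling idea, in case seqkit is not at hand: a small awk reservoir sampler that keeps a fixed number of randomly chosen records. The function name subsample_fastq and the default seed are assumptions for illustration, and the input is assumed to be uncompressed FASTQ with four lines per record. The seqkit and fastq_screen commands in the comments show how the tools mentioned above are typically invoked; the file names and config path are placeholders.

```shell
#!/bin/sh
# Reservoir-sample N reads from a four-line-per-record FASTQ file.
subsample_fastq() {
    # $1 = input FASTQ, $2 = number of reads to keep, $3 = random seed (optional)
    awk -v n="$2" -v seed="${3:-11}" '
        BEGIN { srand(seed) }
        # Collect the four lines of the current record into rec
        { rec = (NR % 4 == 1) ? $0 : rec "\n" $0 }
        NR % 4 == 0 {
            count++
            if (count <= n) {
                keep[count] = rec              # fill the reservoir first
            } else {
                j = int(rand() * count) + 1    # uniform integer in 1..count
                if (j <= n) keep[j] = rec      # replace with probability n/count
            }
        }
        END {
            m = (count < n) ? count : n
            for (i = 1; i <= m; i++) print keep[i]
        }
    ' "$1"
}

# Example usage (placeholders):
#   subsample_fastq reads.fq 2000 > sub.fq
#
# Equivalent with seqkit, if installed:
#   seqkit sample -n 2000 -s 11 reads.fq -o sub.fq
#
# Screening against suspected genomes with FastQ Screen, if installed
# (the config file lists the aligner indexes to screen against):
#   fastq_screen --conf fastq_screen.conf --aligner bowtie2 reads.fq
```

The subsample can then be converted to FASTA and BLASTed at a manageable cost, while the full file goes to the screening tool.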

0

+1 for "fastq-screen"

3
6.5 years ago

You need taxonomic profiling software, like Kraken or Kaiju. BLAST is the slowest option for this kind of task.
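For illustration, here is roughly what a Kraken 2 run and a quick look at its report could look like. The function names run_kraken2 and top_species are hypothetical, the database directory is assumed to have been built or downloaded beforehand, and the thread count is arbitrary. The summary step relies on the standard Kraken 2 report layout of six tab-separated columns (percentage, clade reads, direct reads, rank code, taxid, indented name).

```shell
#!/bin/sh
# Hypothetical wrapper: classify reads with Kraken 2 (must be installed).
run_kraken2() {
    # $1 = reads FASTQ
    # $2 = Kraken 2 database directory (assumed to exist already)
    # $3 = output prefix
    kraken2 --db "$2" --threads 4 \
        --report "$3.report.txt" --output "$3.kraken.txt" "$1"
}

# List the most abundant species-level entries from a Kraken 2 report.
top_species() {
    # $1 = report file, $2 = number of rows to show (default 10)
    awk -F '\t' '$4 == "S" {
        pct = $1; name = $6
        gsub(/^ +| +$/, "", pct)   # percentages are space-padded
        sub(/^ +/, "", name)       # names are indented by taxonomic depth
        print pct "\t" name
    }' "$1" | sort -t "$(printf '\t')" -k1,1nr | head -n "${2:-10}"
}

# Example usage (paths are placeholders):
#   run_kraken2 reads.fq /path/to/kraken2_db sample
#   top_species sample.report.txt 5
```

The first column of the summary is the percentage of reads assigned to each species clade, which gives a quick overview of which organisms dominate the sample.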

0

+1 for that. That'll surely help the OP.

