Sequence extraction from a fastq file
10.0 years ago
bpz ▴ 60

Hello everyone

I have some Illumina reads from a metagenomic project in a fastq file, and I am trying to "fish out" some particular sequences from the rest. I ran a blast search and got the sequences I am interested in, in fasta format. The thing is, I need these sequences in fastq format for assembly. How can I extract them from the original fastq file using the blast fasta file as a reference? Or should I just convert my blast output from fasta to fastq format?

Thanks in advance.

sequencing Assembly sequence blast alignment • 6.2k views

Great, I will try it.

Thanks


Are you sure you want the whole sequence from the Fastq file? If you have quality trimmed your reads and/or are really just interested in the parts matching the reference sequences then the grep approach may not be what you want.


Yes, you are right. My solution will only work if the "Read ID" or "Header info" has been preserved between the fasta and fastq files, which I assume is the case.


Right, the IDs would need to be the same, though that is not what I was referring to. I meant that if you quality trim a file and want to extract those reads from another file (or just keep the blast query string), then pulling reads from a file with the IDs alone won't work (in that case, the trimming and match information would be lost). Hopefully that is clear.
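To make the point about keeping only the matching part concrete, here is a minimal Python sketch. It assumes you already know, for each read, the 1-based inclusive start and end of the matched region on that read (for example from a tabular blast report); the function name trim_record is just made up for illustration. The important detail is that the sequence and the quality string are cut with the same coordinates so they stay in sync.

```python
def trim_record(seq, qual, start, end):
    """Cut a read and its quality string to the matched region.

    start and end are assumed to be 1-based and inclusive,
    with start <= end, and to refer to positions on the read itself.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the read")
    # Convert the 1-based inclusive range to a Python slice.
    return seq[start - 1:end], qual[start - 1:end]
```

If the hit is on the reverse strand you would also need to reverse-complement the sequence and reverse the quality string, which this sketch does not do.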

10.0 years ago

So you are saying that you have the header lines and the sequences themselves (fasta format) but you want to add the quality scores from the original fastq file. You can simply use grep -A3 "Header_info" Original.fastq, which prints the matching header line plus the 3 lines after it, i.e. the complete fastq record for that header (assuming the usual 4 lines per record).
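If you have many headers, running grep once per read gets slow. A rough Python sketch of the same idea is below: collect the IDs from the fasta headers, then walk the fastq file four lines at a time and keep the records whose ID is in that set. It assumes 4-line fastq records (no wrapped sequence lines) and that the first whitespace-delimited token of the header is identical in both files. The function names and the blast_hits.fasta and subset.fastq file names are placeholders of mine, not anything from your data.

```python
from itertools import islice


def read_fasta_ids(handle):
    """Return the set of IDs (first token after '>') found in fasta lines."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def extract_fastq(handle, wanted):
    """Yield whole 4-line fastq records whose ID (first token after '@') is wanted."""
    handle = iter(handle)
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            break  # end of file (or a truncated last record)
        fields = record[0][1:].split()
        if fields and fields[0] in wanted:
            yield "".join(record)


def main(fasta_path, fastq_path, out_path):
    """Write the fastq records named in fasta_path from fastq_path to out_path."""
    with open(fasta_path) as fasta:
        wanted = read_fasta_ids(fasta)
    with open(fastq_path) as fastq, open(out_path, "w") as out:
        for record in extract_fastq(fastq, wanted):
            out.write(record)
```

You would call it as something like main("blast_hits.fasta", "Original.fastq", "subset.fastq"). Note that IDs carrying a /1 or /2 suffix in one file but not the other will not match, so check a few headers by eye first.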

EDIT: Just found this post: Quickest way to extract subset of reads from huge fastq file
