Question

Reformatting Fastq to Fasta for Metagenomic Analysis

5 months ago

echolley ▴ 20

Hi there, I'm trying to use VirFinder with shotgun metagenomic sequencing reads. My reads came in FASTQ format, and VirFinder requires a Fasta input format. I've read you can use tools like BBMap reformat.sh, but these don't map the reads. If I'm using metagenomic data, is this an appropriate way to reformat into a fasta? Do I need the reads to be mapped?

Additionally, what is the difference between unmapped and mapped reads? Thanks!

metagenomics fasta fastq • 480 views

updated 5 months ago by colindaven 7.0k • written 5 months ago by echolley ▴ 20

5 months ago

colindaven 7.0k

Another option for converting FASTQ to FASTA is seqtk (in.fq and out.fa below are placeholder file names):

seqtk seq -A in.fq > out.fa

Usage:   seqtk seq [options] <in.fq>|<in.fa>

         -A        force FASTA output (discard quality)

score 2 · Accepted Answer · 2024-06-12

FASTA and FASTQ are distinct formats: both store a name and a sequence for each read, but FASTQ additionally stores a base-call quality value (Q-score) for every base.
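To make the extra information in FASTQ concrete, here is a minimal Python sketch that decodes the quality line of one record into Q-scores. It assumes the common Phred+33 encoding, and the record itself is made up for illustration; `phred_scores` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into per-base Phred Q-scores.

    Assumes Phred+33 encoding (each character's ASCII code minus 33).
    """
    return [ord(ch) - offset for ch in quality]


# A made-up four-line FASTQ record, for illustration only.
record = ["@read1", "ACGT", "+", "II5!"]
header, sequence, _separator, quality = record

scores = phred_scores(quality)
print(scores)  # one Q-score per base in `sequence`
```

A FASTA version of the same read would keep only the name and `ACGT`; the quality line has no equivalent there.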

My reads came in FASTQ format, and VirFinder requires a Fasta input format.

Then you will have to convert the reads into fasta format.

If I'm using metagenomic data, is this an appropriate way to reformat into a fasta?

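Converting a FASTQ read to FASTA is unambiguous: the read name and sequence are kept and the quality line is dropped, so every tool gives the same result. reformat.sh will do that, as you already note, and this is fine for metagenomic data.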
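As an illustration of what any FASTQ-to-FASTA converter does, here is a minimal Python sketch. It assumes strict four-line FASTQ records (no wrapped sequence lines), and `fastq_to_fasta` is a hypothetical helper written for this example, not how reformat.sh or seqtk are implemented.

```python
def fastq_to_fasta(lines):
    """Yield FASTA lines from FASTQ lines, assuming strict 4-line records."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if not line and not record:
            continue  # ignore blank lines between records
        record.append(line)
        if len(record) == 4:
            header, sequence, plus, _quality = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError("malformed FASTQ record: %r" % header)
            yield ">" + header[1:]  # '@name' becomes '>name'
            yield sequence          # the quality line is discarded
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


# Made-up reads, for illustration only.
example = ["@read1 sample=A", "ACGT", "+", "IIII",
           "@read2", "GGCC", "+", "!!!!"]
print("\n".join(fastq_to_fasta(example)))
```

For real data, a dedicated tool such as reformat.sh or seqtk is the better choice, since it also handles compressed input and large files efficiently.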

but these don't map the reads

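VirFinder does not need the reads to be mapped. It is a k-mer based classifier that scores each input sequence directly, so no alignment step is involved.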

Do I need the reads to be mapped?

That depends on what you are trying to do. If a tool needs mapped reads, you will have to map them yourself before running it.

what is the difference between unmapped and mapped reads

Exactly what those designations say. Mapped reads are the ones an aligner was able to place on the reference sequences used for the alignment; unmapped reads are the ones it could not place on that particular reference.
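If you do align reads at some point, the distinction shows up in the aligner's SAM output: the FLAG field (second column) has bit 0x4 set for unmapped reads. The sketch below separates read names on that bit. The SAM lines are made up for illustration, and `split_by_mapping` is a hypothetical helper; in practice a tool such as samtools would do this.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def split_by_mapping(sam_lines):
    """Return (mapped, unmapped) lists of read names from SAM text lines."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            unmapped.append(name)
        else:
            mapped.append(name)
    return mapped, unmapped


# Made-up SAM records, for illustration only.
example_sam = [
    "@SQ\tSN:ref1\tLN:1000",
    "read1\t0\tref1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
    "read3\t16\tref1\t200\t60\t4M\t*\t0\t0\tTTAA\tIIII",
]
mapped, unmapped = split_by_mapping(example_sam)
print("mapped:", mapped)
print("unmapped:", unmapped)
```

Whether a read ends up mapped depends entirely on the reference you align against: the same read can be unmapped against one genome and mapped against another.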