Can I call variants directly from two or more FASTQ read files without a reference genome? I need to compare SNPs between the two FASTQ files.
No. Variant calls are based on a reference genome sequence. Technically you could assemble the raw reads into a draft reference and variant call from that, but you'll still have to obtain some reference before you can perform variant calling.
EDIT: I may be incorrect in the case that you're working with well-characterized bacterial species and are looking for specific features. Check out this paper describing the KVarQ program. I imagine the same would be true if you're working with something like mitochondrial data.
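To illustrate the general idea of looking for known features straight in the reads, here is a minimal Python sketch. It is not KVarQ's implementation or interface. It assumes you already have short sequences that distinguish the alleles you care about (the marker names and sequences in the usage note are made up), and it only counts exact matches on either strand, with no tolerance for sequencing errors.

```python
# Sketch only: count reads that contain known marker sequences.
# Marker names and sequences passed in are hypothetical inputs.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of every record is the sequence
            yield line.strip().upper()


def count_marker_hits(handle, markers):
    """Count reads containing each marker exactly, on either strand."""
    targets = {
        name: (seq.upper(), reverse_complement(seq.upper()))
        for name, seq in markers.items()
    }
    hits = {name: 0 for name in markers}
    for read in read_sequences(handle):
        for name, (fwd, rev) in targets.items():
            if fwd in read or rev in read:
                hits[name] += 1
    return hits
```

Usage would look something like `count_marker_hits(open("sampleA.fastq"), {"geneX_ref": "AAACCCGG", "geneX_alt": "AAACTCGG"})`, run once per FASTQ file so the per-sample counts can be compared. Real markers would need to be long enough to be unique in the genome.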
You can certainly de-novo assemble them both, map the reads of each to the other's assembly, and get variant calls. Without a reference for coordinates, I'm not sure how useful that would be.
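A rough sketch of one direction of that workflow, written as a Python helper that only builds the command lines and does not run anything. The tool choice (SPAdes, BWA, samtools, bcftools) is an assumption and not something stated in this thread; it also assumes single-end reads and a haploid genome, and the file and directory names are placeholders.

```python
from pathlib import PurePosixPath


def cross_mapping_commands(reads_for_assembly, reads_to_map, workdir, threads=4):
    """Build shell commands to assemble one sample, map the other sample's
    reads to that draft assembly, and call variants. Nothing is executed."""
    workdir = PurePosixPath(workdir)
    assembly_dir = workdir / "assembly"
    contigs = assembly_dir / "contigs.fasta"  # SPAdes writes its contigs here
    bam = workdir / "mapped.sorted.bam"
    vcf = workdir / "calls.vcf"

    return [
        # 1. De novo assembly of single-end reads into a draft reference.
        f"spades.py -s {reads_for_assembly} -o {assembly_dir} -t {threads}",
        # 2. Index the draft assembly for read mapping.
        f"bwa index {contigs}",
        # 3. Map the other sample's reads and sort the alignments.
        f"bwa mem -t {threads} {contigs} {reads_to_map} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # 4. Call variants, treating the genome as haploid.
        f"bcftools mpileup -f {contigs} {bam} | bcftools call -mv --ploidy 1 -o {vcf}",
    ]


if __name__ == "__main__":
    # Run both directions: B's reads against A's assembly, and the reverse.
    for cmd in cross_mapping_commands("sampleA.fastq", "sampleB.fastq", "A_as_ref"):
        print(cmd)
    for cmd in cross_mapping_commands("sampleB.fastq", "sampleA.fastq", "B_as_ref"):
        print(cmd)
```

The coordinates in each resulting VCF refer to contigs of that particular draft assembly, so the two VCFs are not directly comparable to each other or to any published annotation.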
Thanks for answering my question. I was just hoping that a VCF comparing the two test genomes could be obtained without a reference genome.
You can obtain VCF files using the method that I describe. However, as Dan notes, they will not necessarily be useful.
I think it would be helpful if you explained what organisms you are working with, what kind of data you have (the complete experimental setup), and what you are trying to accomplish. Blinded questions rarely yield useful results.
Yes, thanks for the suggestion. The goal of the variant calling is to find recombinants between two bacterial genomes, which were sequenced and delivered as raw FASTQ files. Since a large number of genomes will be compared for recombinants, it would be convenient to get recombinant information directly from variant calling instead of assembling whole genomes.
So, I'm confused as to why you have only 2 fastq files if you have a large number of recombinants... is this one species? 2 species? Are you combining lots of samples in a single library?
The two fastq files contain multiple reads from sequencing, each covering the whole genome of a sample. Thanks!