2.7 years ago
ManuelDB
If reads have already been aligned during the alignment step, why do we still need a reference genome when calling variants (for example, with FreeBayes)? This is something I did not expect to see when I started exploring NGS bioinformatics pipelines that call variants from FASTQ files.
Here is the sequence of a read:
Can you tell me whether there is a mutation, and where it is, without a reference?
Thanks, Pierre.
I stupidly thought that this information was already in the BAM file once the reads were aligned.
The information is in the BAM file, but only per individual read (not as an identified SNP). It takes the form of how the nucleotides in that read, if any, differ from the reference region the read aligned to.
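To make the per-read point concrete, here is a minimal Python sketch that reads the mismatch positions out of a SAM/BAM record's `MD` tag. It assumes the aligner actually wrote an `MD` tag (it is optional; `samtools calmd` can add it) and that you already have the record's 1-based `POS` value. The function name `md_mismatches` is made up for this illustration and is not part of any library.

```python
import re

def md_mismatches(pos, md):
    """List (reference_position, reference_base) for each mismatch in an MD tag.

    pos: 1-based leftmost reference position of the alignment (SAM POS field).
    md:  value of the MD:Z tag, e.g. "10A5^AC6".
    """
    mismatches = []
    ref_pos = pos
    # MD tokens: a run of matches (number), a deletion (^BASES), or a mismatch (one base)
    for token in re.findall(r"\d+|\^[A-Z]+|[A-Z]", md):
        if token.isdigit():
            ref_pos += int(token)        # matching bases: just advance
        elif token.startswith("^"):
            ref_pos += len(token) - 1    # bases deleted from the read: advance on the reference
        else:
            mismatches.append((ref_pos, token))  # token is the reference base at this position
            ref_pos += 1
    return mismatches

print(md_mismatches(100, "10A5^AC6"))
```

Note that this only says where one read disagrees with the reference. Deciding whether that disagreement is a real variant or a sequencing error means looking at all the reads covering the position together with the reference base, which is the job of the variant caller.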
To get this BAM file, one needs to start with a reference, index it, and then align the sequence data against that index. Isn't that the case?