Why is a reference genome needed when calling variants?
2.7 years ago
ManuelDB ▴ 110

If the reads have already been aligned during the alignment step, why do we still need a reference genome when calling variants (for example with FreeBayes)? This is one of the things I did not expect to see when I started exploring NGS bioinformatics pipelines that call variants from FASTQ files.

NGS • 1000 views

Here is the sequence of a read:

CTTCAACAACGTCCACTCTTTCTGGAAAATCAATTGGTAGGAGAGAACAGTACATTTCACCATATGCAGA

Can you tell me whether there is a mutation, and where it is, without a reference?
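To make the point concrete, here is a minimal Python sketch of what "finding a mutation" amounts to: comparing the read against the stretch of reference it aligned to. The function name `find_mismatches` and the `reference_window` string are made up for illustration; the reference here is simply the read above with one base changed, not a real genomic sequence, and the sketch assumes a gap-free alignment.

```python
def find_mismatches(read, reference_window):
    """Return (offset, reference_base, read_base) for each differing position."""
    if len(read) != len(reference_window):
        raise ValueError("this sketch only handles gap-free, equal-length alignments")
    return [
        (offset, ref_base, read_base)
        for offset, (ref_base, read_base) in enumerate(zip(reference_window, read))
        if ref_base != read_base
    ]


read = "CTTCAACAACGTCCACTCTTTCTGGAAAATCAATTGGTAGGAGAGAACAGTACATTTCACCATATGCAGA"

# Hypothetical reference window: the read with the base at offset 10 swapped to "A".
# It is invented for this example and is not a real genomic sequence.
reference_window = read[:10] + "A" + read[11:]

print(find_mismatches(read, reference_window))  # -> [(10, 'A', 'G')]
```

Without `reference_window` there is nothing to compare against, so the question "is there a mutation?" has no answer for the read alone.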


Thanks Pierre,

I naively thought that this information was already in the BAM file once the read had been aligned.


"I thought that that information is in the BAM file when the read is aligned."

The BAM file does contain that information, but only for each read on its own, not as an identified SNP. It is recorded as a description of how the nucleotides in the read (if any) differ from the reference region the read aligned to.

To get this BAM file, one needs to start with a reference, index it and then align sequence data to that index, isn't that the case?
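The gap between per-read differences and a called variant can be sketched with a toy pileup in Python. This is only an illustration under strong assumptions (gap-free alignments, a simple depth and allele-fraction threshold); it is not how FreeBayes works internally, and `call_variants`, the thresholds, the reference string, and the reads are all hypothetical.

```python
from collections import Counter, defaultdict


def call_variants(reference, aligned_reads, min_depth=3, min_alt_fraction=0.5):
    """Toy pileup caller: report positions where most reads disagree with the reference."""
    # Stack the bases of every read on the reference position they cover.
    pileup = defaultdict(Counter)
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            pileup[start + offset][base] += 1

    calls = []
    for position in sorted(pileup):
        counts = pileup[position]
        depth = sum(counts.values())
        ref_base = reference[position]  # the reference is needed here
        alt_counts = {b: n for b, n in counts.items() if b != ref_base}
        if depth < min_depth or not alt_counts:
            continue
        alt_base, alt_count = max(alt_counts.items(), key=lambda item: item[1])
        if alt_count / depth >= min_alt_fraction:
            calls.append((position, ref_base, alt_base, alt_count, depth))
    return calls


reference = "ACGTACGTACGT"  # hypothetical reference, 0-based positions
aligned_reads = [           # (start position, read sequence), all made up
    (0, "ACGTAGGT"),        # G at position 5, where the reference has C
    (2, "GTAGGTAC"),        # G at position 5
    (4, "AGGTACGT"),        # G at position 5
    (3, "TACGTACT"),        # matches at position 5; lone T at position 10
]

print(call_variants(reference, aligned_reads))  # -> [(5, 'C', 'G', 3, 4)]
```

Position 5 is reported because three of the four reads covering it carry G where the reference has C. The single mismatch at position 10 is covered by only two reads and is skipped, which is the sense in which one read's difference is not yet an identified SNP.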
