Question

Detect heterozygous variant from reads

2.2 years ago

pablo ▴ 350

Hi,

I am looking at gene cassette insertions in diploid yeast samples. One gene cassette is known to be heterozygous in a specific sample (called s0), as confirmed by PCR. I built phased assemblies (hifiasm and flye+hapdup), but I found this cassette in both haplotypes of both assemblies, which suggests it is homozygous. The assemblies are pretty good and contiguous.

So I looked at the reads directly, since they carry the real heterozygosity information:

     - I `BLAST` the gene cassette sequence against the s0 reads and extract the IDs of the reads with >99% identity
     - I align the s0 reads against the phased assembly of s0 and identify the region where the reads found by BLAST align
     - Then I extract all the reads from this region and compute a proportion: reads from BLAST / all reads in the region
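
To make the three steps above concrete, here is a minimal Python sketch of the proportion calculation. It is only an illustration, not the exact commands I ran. It assumes the BLAST output is in tabular format (`-outfmt 6`) with the cassette as query and the reads as subject (so the read ID is column 2 and percent identity is column 3), and that the reads of the region were exported as SAM text, for example with `samtools view` on that region. The function names and the toy records are hypothetical.

```python
from typing import Iterable, Set


def cassette_read_ids(blast_lines: Iterable[str], min_identity: float = 99.0) -> Set[str]:
    """Read IDs from BLAST tabular output with identity above the cutoff.

    Assumption: cassette = query, reads = subject, so column 2 (sseqid)
    is the read ID and column 3 (pident) is the percent identity.
    """
    ids = set()
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 3:
            continue  # skip comment lines and malformed rows
        if float(fields[2]) > min_identity:
            ids.add(fields[1])
    return ids


def region_read_ids(sam_lines: Iterable[str]) -> Set[str]:
    """Unique read names (QNAME, column 1) from SAM records of one region."""
    ids = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        ids.add(line.split("\t", 1)[0])
    return ids


def cassette_fraction(cassette_ids: Set[str], region_ids: Set[str]) -> float:
    """Fraction of the reads in the region that also had a cassette hit."""
    if not region_ids:
        raise ValueError("no reads found in the region")
    return len(cassette_ids & region_ids) / len(region_ids)


# Toy records (made up) to show the expected input shapes.
blast = [
    "cassette\tread1\t99.9\t1500",
    "cassette\tread2\t98.0\t1500",
    "cassette\tread3\t99.5\t1500",
]
sam = [
    "@HD\tVN:1.6",
    "read1\t0\thap1\t100",
    "read3\t0\thap1\t120",
    "read4\t16\thap1\t130",
    "read5\t0\thap1\t140",
]
fraction = cassette_fraction(cassette_read_ids(blast), region_read_ids(sam))
print(f"cassette-containing reads in region: {fraction:.0%}")
```

Counting unique read names means a read with supplementary or secondary alignments in the region is counted once. Note that this sketch filters on identity only, so a short partial hit to the cassette counts the same as a full-length one.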

I get 27% for haplotype 1 and 29% for haplotype 2. These proportions are probably too low for the assemblers to separate the two phases, so they treat the region as homozygous.

Is this method reliable? I ask because I have the same problem with every gene cassette in all of my samples: some cassettes are expected to be heterozygous, but I only ever find homozygous ones.

Best

alignement blast bam igv • 518 views
