RNA-Seq vs DNA-Seq variant calling
6 months ago
Esraa ▴ 10

Hello, I have previously posted about finding and testing benchmarking datasets for different RNA-Seq variant calling pipelines. When I used the GIAB HG002 and HG005 datasets, I ran into a problem: the hap.py F1 scores showed almost no concordance with the DNA truth sets, even though many people benchmark RNA variants against DNA truth sets. I have been searching for the cause ever since and came across a benchmarking strategy used by several studies, in which scoring is restricted to regions above a certain coverage or to a certain feature (e.g. CDS).

My question is: should I rely on this benchmarking strategy? It has certainly improved my hap.py scores, but I am afraid it could introduce biases that I am not aware of. If the strategy is sound, on what basis should I choose the coverage threshold for filtering my regions? I will be focusing on CDS/exon variants.
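To make the strategy concrete, here is a minimal sketch of what such a restriction could look like. It is not taken from any of the studies. It assumes per-base depth records of the form (chrom, 1-based position, depth), sorted by position within each chromosome, and CDS intervals already loaded as sorted, non-overlapping, 0-based half-open (BED-style) ranges. The function names and the `min_depth` parameter are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open, BED style


def covered_intervals(
    depths: Iterable[Tuple[str, int, int]], min_depth: int
) -> Dict[str, List[Interval]]:
    """Merge 1-based (chrom, pos, depth) records with depth >= min_depth
    into BED-style intervals. Assumes records are sorted by position
    within each chromosome and each position appears once."""
    out: Dict[str, List[Interval]] = {}
    for chrom, pos, depth in depths:
        if depth < min_depth:
            continue
        start = pos - 1  # convert 1-based position to 0-based start
        ivs = out.setdefault(chrom, [])
        if ivs and ivs[-1][1] == start:
            # Adjacent to the previous covered base: extend that interval.
            ivs[-1] = (ivs[-1][0], pos)
        else:
            ivs.append((start, pos))
    return out


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, non-overlapping interval lists."""
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def restricted_regions(
    depths: Iterable[Tuple[str, int, int]],
    cds: Dict[str, List[Interval]],
    min_depth: int,
) -> Dict[str, List[Interval]]:
    """Regions that are both inside a CDS and covered at >= min_depth."""
    covered = covered_intervals(depths, min_depth)
    return {
        chrom: intersect(ivs, cds.get(chrom, []))
        for chrom, ivs in covered.items()
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    depths = [("chr1", 101, 12), ("chr1", 102, 15), ("chr1", 103, 3),
              ("chr1", 104, 20), ("chr1", 105, 22)]
    cds = {"chr1": [(100, 103), (104, 110)]}
    for chrom, ivs in restricted_regions(depths, cds, min_depth=10).items():
        for start, end in ivs:
            print(f"{chrom}\t{start}\t{end}")
```

The resulting intervals could be written to a BED file and passed to hap.py to limit where calls are scored (I believe through its target-regions option, but please check the hap.py documentation). Note that the choice of `min_depth` directly determines which truth variants are counted, which is exactly the source of bias I am worried about.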

Thank you all so much in advance.

benchmark dna-seq rna-seq giab vcf • 234 views