7.0 years ago
Sharon
Hello Everyone
In the GATK pipeline for RNA-seq variant calling, why do we get the error "Unsupported CIGAR operator N in read ..." in both the indel realignment and base recalibration steps?
And is it safe to use --filter_reads_with_N_cigar in the indel realignment and base recalibration steps if we already used -U ALLOW_N_CIGAR_READS in the SplitNCigarReads step?
Thanks
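For context on the error itself: in the SAM format, the CIGAR operator N marks a region skipped from the reference, which spliced RNA-seq aligners typically use to represent introns. The following is a minimal Python sketch, not part of GATK, for checking which CIGAR strings carry that operator. The function names parse_cigar and has_skipped_region are hypothetical, chosen only for illustration.

```python
import re

# One CIGAR operation: a length followed by an operator defined in the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '50M1000N50M' into (length, operator) pairs."""
    if cigar == "*":  # SAM uses '*' when no alignment information is available
        return []
    ops = CIGAR_OP.findall(cigar)
    # Rebuild the string from the parsed pieces to catch stray characters.
    if "".join(length + op for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(length), op) for length, op in ops]


def has_skipped_region(cigar):
    """Return True if the alignment contains an N (skipped reference region) operation."""
    return any(op == "N" for _, op in parse_cigar(cigar))


if __name__ == "__main__":
    # Illustrative CIGAR strings, not taken from any real dataset.
    for cigar in ["100M", "50M1000N50M", "10S40M2I48M"]:
        print(cigar, has_skipped_region(cigar))
```

A read such as 50M1000N50M spans a splice junction, so it is the kind of read the error message refers to, while 100M is not.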
Why did you not do DNA-seq in order to call variants?
You already posted about this general topic, here: Suggested pipelines for finding somatic mutations using RNAseq in normal and tumor cells
I'm not sure that you'll find much support for this topic here, as most would not choose to call variants from RNA-seq data. I'm aware that there are published methods on this, but there are published methods on virtually all topics, and publication alone says little about their utility, given how commercialised publishing itself has become.
We have whole-genome and exome sequencing, and we did variant calling on it. Then my professor asked me to do RNA-seq and get differential expression, which I did too. Then he asked me to find somatic mutations in tumors using the RNA-seq data we have. I think he plans to complement what we have from whole genome and exome. What I understand from published papers is that RNA-seq can find things that DNA-seq can't, but DNA-seq can also find things that can't be detected by RNA-seq. I don't have much experience. I appreciate your opinions, although I think it's hard to convince professors at some points, so I need to understand for the sake of understanding, while I might end up doing it anyway :)
Yes, certainly, but even they must realise that they still can learn a lot, both from other published studies and also from their own juniors, who may have more experience with novel methodologies and programming languages.
Sure, and hopefully :) Can you suggest any reading on the problems of calling variants from RNA-seq, so I can learn more myself before I argue?
Well, it has just taken me a few seconds to find the primary problem with calling variants from RNA-seq:
[source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3791257/]
The lower number in WES is explained by the fact that most WES library prep methods don't actually target all exons. That RNA-seq finds fewer than half of the variants found by WGS is to be expected, as RNA-seq targets mRNA whilst WGS targets DNA.
Great, thanks Kevin, I will read this. Thanks a lot for your explanations and usual help. Much appreciated.