I want to compare SNP variation that arises from transcription. I called SNPs from RNA-Seq of 321 samples from two different tissues. Which software and pipeline should I use to identify variation due to transcription by comparing these against DNA SNPs? (I have DNA SNP data from the same 321 samples.)
How can I explain why some GWAS-SNP target traits are common to the two tissues? For example, I used two tissues and found that some target traits are commonly targeted by GWAS (SNP + expression + phenotype) in both.
One of the major challenges to this analysis is that different SNP finding algorithms will yield drastically different results. I'd recommend using the GATK best practices for your DNA-seq (they have different protocols for WGS and WES), and for your RNA-seq. Using their pipelines to call SNPs will reduce some potential variance.
They also have pipelines to compare SNPs from two samples, such as Mutect2. Essentially, I would treat your RNA-seq as a "tumor" sample and your DNA-seq as a "normal" sample, run the analysis to identify SNP calls that are unique to the RNA-seq, then run the analysis again flipping your "tumor" and "normal" samples, to find SNPs that are only identified in the DNA-seq.
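As a lightweight cross-check alongside the Mutect2 approach above (not a replacement for it), the RNA-only and DNA-only calls can also be found by a plain set comparison of sites from the two VCFs. This is a minimal sketch: the function names and the file names in the usage note are hypothetical, and it compares only CHROM, POS, REF, and ALT, ignoring genotypes, filters, and coverage.

```python
import gzip


def parse_sites(lines):
    """Collect (chrom, pos, ref, alt) tuples from an iterable of VCF lines."""
    sites = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):  # split multi-allelic records
            if alt != ".":
                sites.add((chrom, int(pos), ref, alt))
    return sites


def load_sites(vcf_path):
    """Read a plain or gzip/bgzip-compressed VCF file into a set of sites."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return parse_sites(handle)


def compare_calls(rna_sites, dna_sites):
    """Split two call sets into RNA-only, DNA-only, and shared sites."""
    return {
        "rna_only": rna_sites - dna_sites,
        "dna_only": dna_sites - rna_sites,
        "shared": rna_sites & dna_sites,
    }
```

For one sample this could be used as `compare_calls(load_sites("sample1.rna.vcf.gz"), load_sites("sample1.dna.vcf.gz"))`. A site that is DNA-only here may simply sit in a gene that is not expressed in that tissue, so coverage should be checked before interpreting the difference.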
Without knowing what your end goal is, I'd give you a word of caution about reading too much into SNP data from RNA-seq. RNA processing can involve inaccurate reverse transcription (RT) and misincorporation of nucleotides, both of which lead to erroneous SNP calls (but if that's your goal, then good luck).
As for your second question, I'm not sure I understand what you're asking. Could you try rephrasing it or providing an example?
I agree with the message of caution.
For example, I was initially concerned by Figure 1A in the RNA-MuTect paper until I realized those were unfiltered variant counts (and that is the justification for needing to filter out the "vast majority" of raw variants from MuTect2).
I haven't tested the code (although I just added that to my "To-Do" list, at least for RNA-Seq without paired DNA-Seq), but the RNA-MuTect code is available here:
https://zenodo.org/record/2620062
and my thoughts from reading the paper (but without testing the code) are here:
http://cdwscience.blogspot.com/2019/06/considerations-for-somatic-mutations.html
Dear Charles Warden,
Many thanks for your kind contribution and suggestions!
Dear Shawn,
Many thanks for your kind suggestions and guidance. My purpose is to run a GWAS using SNPs as genotypes and plant phenotypic traits (5 years) as phenotypes, so you could call it a genotype-by-phenotype (GBP) GWAS analysis. I called these SNPs from RNA-Seq data using the GATK pipeline. I used two tissues of Brassica napus plants: shoot apical meristem (SAM) and silique. Since both tissues are developing, I believe I can find some really useful single-, double-, and triple-nucleotide polymorphisms due to transcriptional changes and relate them to phenotypic traits. The DNA SNPs were also called using the GATK approach.
My second question was that some phenotypic traits (targets) were common to both tissues. Since we have mRNA expression-based GWAS, it is possible that some genes are expressed in both tissues and are ultimately targeted by SNPs. How can I explain the common traits? My idea is to find differential SNPs that change the codon, and therefore the amino acid and the protein, for the target traits.
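The codon-to-amino-acid idea in the last sentence can be illustrated with a small sketch. It assumes the reference codon is already in frame and given on the coding strand, and that the standard genetic code applies; the function name `classify_snp` is hypothetical. For real data, a dedicated annotator such as SnpEff or Ensembl VEP would handle strand, frame, and transcript models.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, offset, alt_base):
    """Classify a single-base change at position `offset` (0-2) of a codon.

    Returns (effect, ref_aa, alt_aa), where effect is one of
    "synonymous", "missense", "nonsense", or "stop_lost".
    """
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if len(ref_codon) != 3 or offset not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and an offset of 0, 1, or 2")
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"   # amino acid replaced by a stop codon
    elif ref_aa == "*":
        effect = "stop_lost"  # stop codon replaced by an amino acid
    else:
        effect = "missense"
    return effect, ref_aa, alt_aa
```

For example, `classify_snp("GAA", 1, "T")` reports a missense change from E to V, while `classify_snp("CTT", 2, "C")` is synonymous. Only the missense, nonsense, and stop-lost classes alter the protein, so those are the SNPs that could support this explanation for a shared trait.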