Hi, is it possible to detect eQTLs from RNA-seq data without a reference genome and without sequencing every genome? I have a lot of samples, so sequencing them all would be too expensive and time-consuming.
It's technically possible, but it's a difficult question to answer. Firstly, your approach will depend on whether you also have paired DNA sequencing or want to infer genotypes from the RNA-seq data.
Secondly, you should consider if this is a worthwhile endeavour based on the number of samples you have. To reliably call eQTLs, you generally want a relatively large number of samples. Then again, it depends on how genetically diverse your organism is.
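To get a rough feel for how sample size limits eQTL detection, a small simulation can help. The sketch below is only illustrative: it assumes an additive effect of genotype dosage (0/1/2) on a normalised expression value, genotypes drawn under Hardy-Weinberg proportions (which will not hold for inbred strains), and a Fisher z-test on the correlation. The function name `simulate_power` and all default parameter values are made up for this example.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation, or None if either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def simulate_power(n_samples, effect, maf=0.3, alpha=1e-3, n_sims=500, seed=1):
    """Fraction of simulated datasets in which a single eQTL is detected.

    Assumptions (hypothetical, for illustration only): additive effect per
    allele, unit-variance Gaussian noise, Hardy-Weinberg genotype frequencies.
    """
    if n_samples < 4:
        raise ValueError("need at least 4 samples for the Fisher z-test")
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    hits = 0
    for _ in range(n_sims):
        # Dosage = number of minor alleles carried by each sample (0, 1 or 2).
        dosage = [(rng.random() < maf) + (rng.random() < maf)
                  for _ in range(n_samples)]
        expr = [effect * d + rng.gauss(0, 1) for d in dosage]
        r = pearson(dosage, expr)
        if r is None:
            continue  # monomorphic in this draw: nothing to test
        r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
        z = math.atanh(r) * math.sqrt(n_samples - 3)
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sims


power_small = simulate_power(20, effect=0.5)
power_large = simulate_power(200, effect=0.5)
print(f"power with 20 samples:  {power_small:.2f}")
print(f"power with 200 samples: {power_large:.2f}")
```

Under these assumptions the detection rate with 20 samples should come out far lower than with 200. The real threshold would also need to account for the number of SNP-gene pairs tested.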
Genotypes: If you're inferring genotypes from RNA-seq data in a non-model organism, you're going to have trouble. The GATK Best Practices include an approach for this, but splicing makes it very tricky, even in human samples. If you're working from DNA sequencing, you have a reasonable shot at getting it working using Samtools and an assembled reference.
Reference: As you don't have a reference for your organism, you'll have to do a de novo assembly using something like Trinity. It will be by no means perfect, but it's at least something.
RNA-seq: If you're inferring genotypes from this alone, you'll need quite a bit of depth to get away with it. This is a lot of work to take on, especially if all you have is a limited number of RNA-seq samples. You'll need to de novo assemble a reference, then genotype using a mix of Samtools and GATK (I'm not 100% convinced that the Split'N'Trim stage, or GATK in general, would work properly without a good model reference).
Conclusion: The more I think about it, the more roadblocks I expect you to hit. I doubt this will work with RNA-seq data alone.
Hi Andrew,
Nice explanation, thanks for that.
I am thinking of doing the same kind of analysis and don't know whether it is possible. The difference in my case is that I have a raw genome sequence available. Since I am working on a non-model organism, the annotation is poor or missing.
Can you please explain what the scenario would be in this case? The reason for this analysis is that I am seeing a lot of gene variation between three different strains of the specimen, at the strain level rather than the treatment level. So I am trying to explain this using sequence variation between the three strains. I have called SNPs with GATK, pooled across all three strains, using the raw genome sequence as the reference, but I still cannot find strain-wise differences. This leads me to look at eQTL detection, if that is possible.
Do you have any other suggestions for this question? Any help is much appreciated! Thanks, AMoL
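One thing worth checking in a setup like this is whether the SNP calls keep a separate genotype column for each strain: if the strains were merged into a single sample before calling, the output cannot show strain-wise differences. A minimal sketch, assuming a multi-sample VCF with one column per strain and a GT field in FORMAT (the strain names, contigs and records below are made up for illustration):

```python
def strain_specific_sites(vcf_lines):
    """Return (chrom, pos, strain, genotype) for sites where exactly one
    strain's genotype differs from a genotype shared by all other strains.

    Sites with any missing genotype are skipped. Phasing is ignored.
    """
    samples = []
    hits = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow FORMAT
            continue
        gt_idx = fields[8].split(":").index("GT")
        genotypes = []
        for column in fields[9:]:
            alleles = column.split(":")[gt_idx].replace("|", "/").split("/")
            genotypes.append(None if "." in alleles else "/".join(sorted(alleles)))
        if None in genotypes:
            continue
        for i, gt in enumerate(genotypes):
            others = genotypes[:i] + genotypes[i + 1:]
            if len(set(others)) == 1 and gt != others[0]:
                hits.append((fields[0], int(fields[1]), samples[i], gt))
    return hits


# Hypothetical three-strain example; real input would be read from a VCF file.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tstrainA\tstrainB\tstrainC",
    "contig1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:12\t0/0:15\t1/1:9",
    "contig1\t250\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/1:10\t0/1:11\t0/1:8",
    "contig2\t40\t.\tG\tA\t50\tPASS\t.\tGT:DP\t1/1:7\t./.:0\t0/0:5",
]

for hit in strain_specific_sites(EXAMPLE_VCF):
    print(hit)
```

Sites flagged this way are only candidates. Whether any of them is associated with expression still depends on the number of samples per strain.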
Is the DNA sequencing paired with the RNA sequencing? Detecting strain-specific eQTLs is not easy, but I've done something similar using a multinomial log-linear model. The issue then comes down to two things: how good your draft assembly is, and what your power is. Sample size is by far the most crucial factor in this case. Here's what I'd do if I were in your situation (and be sure you want to do this, because it'll be quite a bit of work!):