Hello, we're getting RNAseq data from an increasing number of "emerging" model species about which little is known. The data often weren't generated to provide population genetic estimates, but could perhaps be used to provide some.
Is it possible to estimate Ne from the SNP patterns that can be identified from:
- a single sequence pooled from many individuals from a single population?
- a single diploid genome?
- RNAseq data where multiple individuals from a single population (siblings or not) were independently sequenced?
Cheers, yannick
A somewhat related question, but specific to pools, is here.
I am not an expert, but I think the biggest issue is that absolute Ne estimates require SNP data from the neutrally evolving part of the genome. With whole-genome sequencing followed by SNP calling, most of the SNPs will be neutral or nearly neutral. With RNAseq, many of the SNPs you recover will be under purifying selection, so the Ne estimates, which are based mostly on allele frequencies, will carry an unavoidable ascertainment bias. Someone from the related question you pointed to might be able to give more details.
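To make the dependence on neutral sites a bit more concrete, here is a rough sketch of the kind of calculation I have in mind. It is my own illustration, not the method of any particular tool. Under the standard neutral model for a diploid population, expected nucleotide diversity is pi = 4 * Ne * mu, so Ne is roughly pi / (4 * mu). The function names, the toy allele counts, and the mutation rate below are all assumptions for illustration; in practice mu is usually unknown for an emerging model species, and the sites would need to be restricted to ones that are plausibly neutral.

```python
def nucleotide_diversity(alt_counts, n_chromosomes, n_sites):
    """Average pairwise differences per site (pi) from biallelic SNPs.

    alt_counts    -- alternate-allele count at each SNP (hypothetical input)
    n_chromosomes -- number of sampled chromosomes (2 x diploid individuals)
    n_sites       -- total callable sites, variable and invariant
    """
    n_pairs = n_chromosomes * (n_chromosomes - 1) / 2
    total = 0.0
    for k in alt_counts:
        # Number of chromosome pairs that differ at this site, over all pairs.
        total += k * (n_chromosomes - k) / n_pairs
    return total / n_sites


def ne_from_pi(pi, mu):
    """Ne for a diploid population, assuming neutrality and pi = 4 * Ne * mu."""
    return pi / (4 * mu)


if __name__ == "__main__":
    # Toy data: 3 SNPs in 1000 callable sites, 5 diploid individuals.
    pi = nucleotide_diversity([1, 5, 3], n_chromosomes=10, n_sites=1000)
    # ASSUMPTION: per-site, per-generation mutation rate; not known in general.
    mu = 1e-8
    print("pi =", pi, "Ne estimate =", ne_from_pi(pi, mu))
```

Because the estimate scales directly with pi, anything that distorts diversity at the sampled sites (such as purifying selection on coding SNPs from RNAseq) carries straight through to Ne.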
Could you please elaborate on the reasoning? Why do you think it is possible, in principle, to estimate the effective population size from RNAseq data?