Question

Estimating Effective Population Size (Ne) From RNAseq Data

1

13.8 years ago

Yannick Wurm ★ 2.5k

Hello, we're getting RNAseq data from increasing numbers of "emerging" model species for which little is known. The data often wasn't meant to provide population genetic estimates, but could perhaps be used to provide some.

Is it possible to estimate Ne from the SNP patterns that can be identified from:

a single sequencing pool of many individuals from a single population?
a single diploid genome?
RNAseq data where multiple individuals from a single population (siblings or not) were independently sequenced?

Cheers, yannick

A somewhat related question, but specific to pools, is here.

population next-gen sequencing • 3.7k views

updated 6.7 years ago by Ben Sutherland ▴ 20 • written 13.8 years ago by Yannick Wurm ★ 2.5k

1

I am not an expert, but I think the biggest issue is that absolute Ne estimates require SNP data from the neutrally evolving part of the genome. With whole-genome sequencing followed by SNP calling, most of the SNPs will be neutral or nearly neutral. With RNAseq, many of the SNPs you recover will be under purifying selection, so the Ne estimates, which are based mostly on allele frequencies, will carry an unavoidable ascertainment bias. Someone from the related question you pointed to might be able to give more details.

11.7 years ago by 14134125465346445 ★ 3.6k

0

Could you please elaborate on the reasoning? Why do you think it is possible, in principle, to estimate the effective population size from RNAseq data?

13.8 years ago by User 1940 ▴ 80

score 1 · Accepted Answer · 2018-08-12

It seems that people do use RNA-seq or exome-capture data for demographic inference by restricting the analysis to synonymous mutations, which gets around the problem of non-synonymous mutations being non-neutral. So it should similarly be possible to estimate Ne from your RNA-seq data (your third point above) if you use only the synonymous mutations?

For example, see:

Fraïsse C, Roux C, Gagnaire P, Romiguier J, Faivre N, Welch JJ, Bierne N. (2018) The divergence history of European blue mussel species reconstructed from Approximate Bayesian Computation: the effects of sequencing techniques and sampling strategies. PeerJ 6:e5198 https://doi.org/10.7717/peerj.5198
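
To make the synonymous-sites idea concrete, here is a minimal Python sketch. It is not the method of the paper above, which uses Approximate Bayesian Computation. It is only the textbook relation θ = 4·Ne·μ for a diploid population, with θ estimated by per-site nucleotide diversity (π) at synonymous SNPs. The function names, the example allele counts, the number of callable synonymous sites and the mutation rate are all hypothetical placeholders. The estimate also assumes neutrality, random mating and a population at equilibrium, so sequencing siblings would violate it.

```python
from typing import Sequence


def nucleotide_diversity(alt_counts: Sequence[int],
                         n_chromosomes: int,
                         n_sites: int) -> float:
    """Per-site nucleotide diversity (pi) from biallelic SNPs.

    alt_counts    -- alternate-allele count at each synonymous SNP
    n_chromosomes -- sampled chromosomes (2 x diploid individuals)
    n_sites       -- all callable synonymous sites, variable or not
                     (assumed to be known from the annotation)
    """
    n = n_chromosomes
    # Expected pairwise differences at one site: 2k(n-k) / (n(n-1))
    total = sum(2 * k * (n - k) / (n * (n - 1)) for k in alt_counts)
    return total / n_sites


def ne_from_pi(pi: float, mu: float) -> float:
    """Long-term Ne for a diploid population from theta = 4 * Ne * mu.

    mu is the per-site, per-generation mutation rate. It is an
    assumption that has to come from outside the RNA-seq data.
    """
    return pi / (4 * mu)


if __name__ == "__main__":
    # Hypothetical data: 4 diploid individuals (8 chromosomes),
    # 3 synonymous SNPs among 1000 callable synonymous sites.
    pi = nucleotide_diversity([1, 4, 2], n_chromosomes=8, n_sites=1000)
    print("pi =", pi)
    print("Ne =", ne_from_pi(pi, mu=1e-9))  # mu is an assumed value
```

Note that this gives a long-term Ne, and the result scales directly with whatever mutation rate you assume, so for a poorly known species it is probably safer to report π (or θ) itself and treat the absolute Ne as a rough guide.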