Hi Group,
I would like to calculate theta watterson using genome-wide SNPs. I am using PopGenome software. I find this software is very useful for my study, but very difficult to work with.
I have 10291 SNPs distributed on 17 chromosomes. How do I upload?
I could upload my data in the VCF format, but unable to calculate Theta Watterson. I would like to use sliding window approach to perform nucleotide diversity.
Could you please suggest me suitable code? In addition, could you please send me a link to download Arabidopsis sub-directory and SNP input format?
Here are the codes I used:
GENOME.class<-readVCF("Bc_test.vcf.gz",tid="1",from=1,to=10291,approx=FALSE,gffpath=FALSE,numcols=10291) # for tid=" 1:17", I want to use SNP from all the 17 chromosomes
get.neutrality((GENOME.class,theta=TRUE,stats=TRUE))
get.neutrality(GENOME.class)[[1]]
Tajima.D n.segregating.sites Rozas.R_2 Fu.Li.F Fu.Li.D Fu.F_S Fay.Wu.H Zeng.E Strobeck.S
1 - 10291 0.7972621 1 NA 0.6281656 0.3860124 NaN NaN NaN NaN
#Sliding widow
GENOME.class.slide<-sliding.window.transform(GENOME.class,width=10000,jump=10000)
| : | : | 100 %
|
Error in SLIDE.POS[[zz]] <- window + snp.begin :
attempt to select less than one element
Thanks for you help. How can I get theta Watterson per site?
I'm not sure I understand your question. If you are asking how to calculate a different Watterson's Estimator for each polymorphic site, that is not really reccomended because watterson's estimator is an estimate of the mutation rate for an average site in a genome/gene, the estimate does not differ site by site. The calculation requires multiple polymorphic sites.
If you are asking how to just return the Watterson's theta values for your genes/sliding windows, then just use the
get.neutrality()
function to return the stats.Hope that helps!