Hi everyone,
I have HapMap data plus another data set (9 populations in total), and I want to run PCA on the combined data. I merged the data sets using PLINK with --merge-list. Now I have the files mergeddata.bim, mergeddata.bed, and mergeddata.fam.
How can I list the SNPs that overlap across the nine files in R?
And what should I do after I identify the overlapping SNPs?
Note: I am really new to this area and I am using Linux.
Thank you
Thank you for your answer.
I used read.delim to read each of my nine bim files into a data frame, so I now have 9 data frames.
Then I took the intersection of their SNP ID columns.
common.snps=Reduce(intersect,list(df9[,2],df8[,2], df7[,2],df6[,2],df5[,2],df4[,2],df3[,2],df2[,2],df1[,2]))
Then I used the following command.
write.table(common.snps, file="list.snps", sep="\t", col.names=F, row.names=F, quote=F )
I got the right number of SNPs, but the format of the file is ASCII text. Now I should check for duplicates, but I don't know how.
After I find the duplicated SNPs and remove them, I will do LD pruning. How can I prepare the set of overlapping SNPs for that?
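A minimal sketch of the duplicate check on Linux, assuming list.snps holds one SNP ID per line (which is what the write.table call above produces). The rs IDs and the temporary demo file are made up for illustration, and mergeddata_common is a hypothetical output prefix. A plain ASCII text file with one ID per line is the format PLINK's --extract option reads, so the file type should not be a problem.

```shell
#!/bin/sh
# Demo input: a tiny stand-in for list.snps (one SNP ID per line).
# The rs IDs below are invented for illustration only.
workdir=$(mktemp -d)
printf 'rs1001\nrs1002\nrs1003\nrs1002\n' > "$workdir/list.snps"

# IDs that occur more than once (empty output means no duplicates).
dups=$(sort "$workdir/list.snps" | uniq -d)
echo "duplicated IDs: $dups"

# De-duplicated list, one ID per line, ready to hand to PLINK.
sort -u "$workdir/list.snps" > "$workdir/list.unique.snps"
n_unique=$(wc -l < "$workdir/list.unique.snps" | tr -d ' ')
echo "unique IDs: $n_unique"

# With real data, the unique list could then be used to subset the
# merged files (mergeddata_common is a hypothetical output prefix):
#   plink --bfile mergeddata --extract list.unique.snps \
#         --make-bed --out mergeddata_common
```

On real data, replace the demo file with your own list.snps; the resulting bed/bim/fam set would then be the input for LD pruning.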
You should add this as a comment and not as a separate answer to your question. As I said in my answer, your merged file mergeddata.bim contains the intersection of all the SNPs. There shouldn't be any duplicated SNPs in that merged data set. Maybe check the PLINK website and read up on the merge command you used to see what it does.
Thank you very much. I did as you said, and I now have a snps.txt file. Next, I should do LD pruning. I will try to prune out SNPs based on their r2. I guess I need some parameters. How do I choose them?
I'm glad it worked. How you prune SNPs depends on what you want to do with the pruned data. Is it for a PCA, for instance? In that case, these parameters are common:
--indep-pairwise 50 10 0.2
However, this question is not related to your original post, so you should either post a new one or try googling for the answer :)
Actually, the LD pruning is done. I used --indep-pairwise 50 5 0.2. Thank you very much for your suggestions.
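For anyone following along, the pruning step discussed above can be sketched roughly as below. This is only an illustration: mergeddata is the prefix from the thread, while pruned and mergeddata_pruned are hypothetical output prefixes, and the run helper with its DRY_RUN switch is an assumption added so the script can be tried without PLINK installed. Note that pruning removes SNPs in high LD: SNPs are dropped until no pair in a window has r2 above the threshold.

```shell
#!/bin/sh
# "mergeddata" comes from the thread; "pruned" and "mergeddata_pruned"
# are hypothetical output prefixes chosen for this sketch.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = run PLINK

# Helper (an assumption of this sketch): print or execute a command.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# --indep-pairwise <window size in SNPs> <step in SNPs> <r^2 threshold>
# In each 50-SNP window, SNPs are removed until no pair has r^2 > 0.2;
# the window then shifts by 10 SNPs and the check is repeated.
run plink --bfile mergeddata --indep-pairwise 50 10 0.2 --out pruned

# PLINK writes the kept SNP IDs to pruned.prune.in (removed ones go to
# pruned.prune.out); build a pruned data set from the kept list.
run plink --bfile mergeddata --extract pruned.prune.in --make-bed --out mergeddata_pruned
```

Run it with DRY_RUN=0 to actually call PLINK. The second command is what turns the list of retained SNPs into a new bed/bim/fam set that a PCA can be run on.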
I asked a follow-up question on this topic in another post. Maybe you can help out there :)
cool, accept the answer then :P