Creating a SnpMatrix object from a flat file

9.5 years ago

Paula Sanchez • 0

Dear all,

I am fairly new to genomics and have just received a genotype file. The R functions I was using were too slow, so I decided to switch to the snpStats package, but I have not been able to read my file into it.

My file is a data frame with 10,000 rows (animals) and 600,000 columns (SNPs), with genotypes coded as 0, 1 and 2. I found several functions that create a snpStats object, but none of them fits my case; for example, read.snps.long expects one call per row.

Any help getting started would be appreciated.

Thanks in advance.

SNP R r • 3.0k views

updated 2.0 years ago by Ram 44k • written 9.5 years ago by Paula Sanchez • 0

What is the objective of your analysis?

updated 2.0 years ago by Ram 44k • written 9.5 years ago by alesssia ▴ 580

I want to compute a genomic relationship matrix, run a PCA and make genomic predictions. Thanks.

updated 2.0 years ago by Ram 44k • written 9.5 years ago by Paula Sanchez • 0

If you want to use snpStats, you should format the data as a pedigree file or as a PLINK file, which is something of a standard for genomic analysis. Transforming the file requires a bit of scripting (in any language: R, bash, Python...). However, to the best of my knowledge, snpStats only deals with diallelic data.
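The reformatting step described above could be sketched roughly as follows in Python. This is only an illustration under several assumptions that are not in the original question: the flat file is comma-separated, its first row holds SNP names, its first column holds animal IDs, and missing calls appear as anything other than 0, 1 or 2. The function name `matrix_to_ped_map` is hypothetical. Because a 0/1/2 coding does not record the actual bases, the allele letters "A" and "B" are placeholders, and the MAP file is filled with chromosome 0 and position 0 (PLINK's code for unplaced variants), which should be replaced with real coordinates when a SNP map is available.

```python
import csv
import io

# Placeholder alleles: the 0/1/2 coding does not say which bases are
# involved, so "A" and "B" are assumptions used only to satisfy the format.
GENOTYPE_TO_ALLELES = {"0": ("A", "A"), "1": ("A", "B"), "2": ("B", "B")}
MISSING = ("0", "0")  # PLINK's default missing-allele code


def matrix_to_ped_map(src, ped_out, map_out):
    """Stream a genotype table into PED and MAP text.

    Assumed layout: header row of SNP names, then one animal per row,
    with the animal ID in the first column. Returns the number of SNPs.
    """
    reader = csv.reader(src)
    snp_names = next(reader)[1:]

    # MAP columns: chromosome, SNP ID, genetic distance, base-pair position.
    # All coordinates are placeholders (0 = unknown / unplaced).
    for name in snp_names:
        map_out.write(f"0\t{name}\t0\t0\n")

    # Rows are processed one at a time, so the full matrix never sits in memory.
    for row in reader:
        animal_id, calls = row[0], row[1:]
        # PED columns: family ID, individual ID, sire, dam, sex, phenotype
        fields = [animal_id, animal_id, "0", "0", "0", "-9"]
        for call in calls:
            fields.extend(GENOTYPE_TO_ALLELES.get(call.strip(), MISSING))
        ped_out.write(" ".join(fields) + "\n")
    return len(snp_names)


# Tiny in-memory example; real use would pass open file handles instead.
demo = io.StringIO("id,snp1,snp2,snp3\nanimal1,0,1,2\nanimal2,2,NA,0\n")
ped, snp_map = io.StringIO(), io.StringIO()
matrix_to_ped_map(demo, ped, snp_map)
print(ped.getvalue().splitlines()[0])
# prints "animal1 animal1 0 0 0 -9 A A A B B B"
```

Processing one row at a time matters at this scale: 10,000 animals by 600,000 SNPs is 6 billion genotype calls, so loading the whole table before writing is best avoided.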

Other software can generate a GRM (e.g., GCTA, PLINK, LDAK), and some can compute a PCA (e.g., PLINK). However, I think all of them require diallelic data (but it is worth checking).
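To make concrete what a GRM is, here is a minimal pure-Python sketch of one common definition (VanRaden's first method): genotypes are centred by twice the allele frequency and the cross-products are scaled by 2 Σ p(1 − p). This is an assumption about which definition is wanted; the tools named above each have their own variants and defaults, so their output may differ. The function name `vanraden_grm` is hypothetical, and the sketch assumes 0/1/2 allele counts with no missing values.

```python
def vanraden_grm(genotypes):
    """Return the genomic relationship matrix as a list of lists.

    genotypes: list of rows (one per animal), each a list of 0/1/2
    allele counts (one per SNP). Missing values are not handled.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])

    # Frequency of the counted allele at each SNP.
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_snps)
    ]

    # Centre each genotype by its expected value, 2p.
    centred = [
        [row[j] - 2 * freqs[j] for j in range(n_snps)]
        for row in genotypes
    ]

    # Scaling factor: 2 * sum of p(1 - p) over SNPs.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; the GRM is undefined")

    # G = Z Z' / scale
    return [
        [sum(a * b for a, b in zip(row_i, row_k)) / scale for row_k in centred]
        for row_i in centred
    ]


# Three animals, three SNPs.
grm = vanraden_grm([[0, 1, 2], [2, 1, 0], [1, 1, 1]])
for row in grm:
    print([round(value, 3) for value in row])
```

Nested Python loops like this are only meant to show the arithmetic. For 10,000 animals and 600,000 SNPs, a dedicated tool such as those listed above is the practical choice.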

updated 2.0 years ago by Ram 44k • written 9.5 years ago by alesssia ▴ 580
