Question

How to combine multiple SNP array datasets genotyped on different versions of the same array design (GSA)?

2


4.7 years ago

lax ▴ 30

Hi all, I plan to run a genetic association analysis for my case-control study. I have three datasets (PLINK ped/map format) from different genotyping centers. The datasets contain different individuals, and the cases all have the same disease.

Data1 = X individuals in total (A cases, B controls), genotyped on Illumina GSA-V1
Data2 = Y individuals in total (C cases, D controls), genotyped on Illumina GSA-V3
Data3 = Z individuals in total (E cases, F controls), genotyped on Illumina GSA-MD-V3

I need to combine these datasets and run association testing for the combined cases against the combined controls (this is not a meta-analysis).

My main concern is merging three datasets that were genotyped on different versions of the GSA array.

Q1: Can I treat the three versions of the GSA array as the same array design, or should I treat them as separate designs?

Q2: Can I start with a merge (I realize it will not be a simple one)? That is, can I first merge all three datasets on the assumption that the array designs are the same or similar, and then proceed with QC and downstream analysis?

I have been reading a lot about this, but it is only making me more confused.
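To make Q2 concrete, a pre-merge overlap check across the three .map files could look like the sketch below. It is a minimal Python illustration, not a PLINK command. The function names `read_map` and `compare_maps` and the file names in the usage note are made up for this example. It assumes standard 4-column .map files (chromosome, variant ID, genetic distance, base-pair position) and matches variants by ID only, so it says nothing about strand, allele coding, or genome build.

```python
from pathlib import Path


def read_map(path):
    """Read a 4-column PLINK .map file into {variant_id: (chrom, bp)}.

    Assumes whitespace-separated columns: chrom, variant ID, cM, bp.
    Lines with fewer than 4 fields are skipped.
    """
    variants = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        chrom, variant_id, _cm, bp = fields[:4]
        variants[variant_id] = (chrom, int(bp))
    return variants


def compare_maps(maps):
    """Compare variant tables from several datasets.

    maps: {label: {variant_id: (chrom, bp)}}
    Returns (shared, conflicting):
      shared      - variant IDs present in every dataset
      conflicting - shared IDs whose (chrom, bp) differs between datasets
    """
    tables = list(maps.values())
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    conflicting = {
        variant_id
        for variant_id in shared
        if len({table[variant_id] for table in tables}) > 1
    }
    return shared, conflicting
```

With hypothetical file names, usage would be along the lines of `compare_maps({"GSA-V1": read_map("data1.map"), "GSA-V3": read_map("data2.map"), "GSA-MD-V3": read_map("data3.map")})`. The size of `shared` shows how much content the three arrays have in common by ID, and a non-empty `conflicting` set would suggest the datasets disagree on positions for some variants.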

Extremely sorry if my questions sound naïve, but I am quite new to this field and still have a lot to learn. Any suggestions would be greatly appreciated. I plan to do my analysis in PLINK/R. Thank you all in advance.

Lax

array different GSA snp version • 2.1k views


score 0 · Answer 1 · 2022-11-01



3.1 years ago

Jelle • 0

Hi Lax,

I would love to hear what your solution was to this issue, since I ran into the exact same problem.

Best, Jelle
