How to combine multiple SNP array datasets genotyped on different versions of the same array design (GSA)?
3.6 years ago
lax ▴ 30

Hi all, greetings! I plan to do a genetic association analysis for my case-control study. I have three different datasets (plink ped/map format) from different genotyping centers. They contain separate individuals, and the cases all suffer from a common disease.

Data1 = X samples in total (A cases, B controls), genotyped on Illumina GSA-V1
Data2 = Y samples in total (C cases, D controls), genotyped on Illumina GSA-V3
Data3 = Z samples in total (E cases, F controls), genotyped on Illumina GSA-MD-V3

I need to combine the above datasets and run association testing on the combined cases against the controls. (It is not a meta-analysis.)

My main concern is merging the three datasets, which were genotyped on different versions of the GSA array.

Q1: Can I treat the three versions of the GSA array as the same array design, or do I have to treat them as separate designs?

Q2: Can I start with a simple merge? I know it definitely won't be simple, but what I mean is: can I first merge all three datasets, presuming the same or a similar array design, and then go ahead with the QC steps and downstream analysis?

I have been doing lots of reading about this, but it's making me even more confused.
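One way to see how far the three array versions actually overlap before attempting any merge is to compare their variant lists. The following is a minimal sketch, not a tested pipeline: it assumes 4-column PLINK .map files (chromosome, variant ID, genetic distance, base-pair position) and that all three datasets are on the same genome build. The helper names `read_map` and `shared_positions` and the toy variants are hypothetical, made up for illustration only.

```python
import io


def read_map(handle):
    """Parse a 4-column PLINK .map file into {(chrom, bp): variant_id}."""
    variants = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, snp_id, _cm, bp = fields[:4]
        variants[(chrom, int(bp))] = snp_id
    return variants


def shared_positions(*maps):
    """Return the sorted (chrom, bp) positions present in every dataset."""
    shared = set(maps[0])
    for other in maps[1:]:
        shared &= set(other)
    return sorted(shared)


# Toy stand-ins for the three .map files (made-up variants, not real GSA content).
v1 = read_map(io.StringIO("1 rs1 0 1000\n1 rs2 0 2000\n2 rs3 0 500\n"))
v3 = read_map(io.StringIO("1 rs1 0 1000\n1 GSA-rs2 0 2000\n2 rs4 0 900\n"))
md = read_map(io.StringIO("1 rs1 0 1000\n1 rs2 0 2000\n"))

common = shared_positions(v1, v3, md)

# Positions shared by all three arrays but carrying different variant IDs.
renamed = [pos for pos in common if len({m[pos] for m in (v1, v3, md)}) > 1]

print(common)   # [('1', 1000), ('1', 2000)]
print(renamed)  # [('1', 2000)]
```

Matching on position rather than on variant ID is a design choice here, since IDs may differ between array versions. The shared list only describes overlap; strand and allele coding would still need to be checked per dataset before any merge.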

Extremely sorry if my questions sound naïve, but I am quite new to this field and still have a lot to learn. Any suggestions would be greatly appreciated. I plan to do my analysis in plink/R. Thank you all in advance.

Lax

array different GSA snp version • 1.7k views
2.1 years ago
Jelle • 0

Hi Lax,

I would love to hear what your solution was to this issue, since I ran into the exact same problem.

Best, Jelle
