QC of genetic data
14 months ago
kl ▴ 10

Hi,

I have some genetic data in a bim file.

The chromosomes range from 0 to 23, plus 26, which I have not come across before. Would the SNPs on chromosome 0 and 26 be removed from the genetic file or left in? Also, some SNPs have a GSA prefix (GSA being the genotyping array) before the rsID, while others appear normal without the prefix; see below:

19  GSA-rs117797881 88.57582    51991438    G   A

vs

19  rs7259137   88.47165    51972314    G   A

The SNPs on chr 26 look like this

26  MTReverseDLOOP_61   100 16399   G   A

I am tempted to remake the file, including only chromosomes 1-23 and removing the GSA prefix, since the position and chromosome align with the correct SNP identifier (hg38). Would leaving chromosome 26 in have any impact on imputation?

Would really appreciate advice/reassurance.

Thanks


Would the SNPs on chromosome 0 and 26 be removed from the genetic file or left in?

You can remove the SNPs on chr0 and chr26.
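For context, in PLINK's numeric chromosome coding 0 marks unplaced variants and 26 the mitochondrial genome, which matches the MTReverseDLOOP_61 name in the question. A minimal sketch of how the variants to drop could be listed is shown here, assuming a whitespace-delimited .bim file with six columns. The function name `ids_outside_chromosomes` and the `KEEP_CHROMS` set are made up for illustration. The idea is to write the returned IDs to a text file, one per line, and pass it to PLINK's --exclude, rather than deleting rows from the .bim by hand, which would leave it out of step with the .bed file.

```python
# Chromosome codes to keep: 1-22 plus 23 (X in PLINK's numeric coding).
KEEP_CHROMS = {str(c) for c in range(1, 24)}


def ids_outside_chromosomes(bim_lines, keep=KEEP_CHROMS):
    """Return the variant IDs whose chromosome code is not in `keep`."""
    dropped = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, variant_id = fields[0], fields[1]
        if chrom not in keep:
            dropped.append(variant_id)
    return dropped


# The three example rows from the question.
example = [
    "19  GSA-rs117797881 88.57582    51991438    G   A",
    "19  rs7259137   88.47165    51972314    G   A",
    "26  MTReverseDLOOP_61   100 16399   G   A",
]
print(ids_outside_chromosomes(example))  # ['MTReverseDLOOP_61']
```

In practice `bim_lines` would be an open file handle on the real .bim file, since iterating over a file yields its lines.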

I have some SNPs which have a GSA (which is the genotyping array) before the rsid but some appear normal without the GSA prefix

You can replace both the rsID and GSA-rsID names with chrom:position IDs using PLINK. They can later be re-annotated with rsIDs, also using PLINK.
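To illustrate what the renaming amounts to, here is a rough sketch that works directly on .bim lines. It is not the PLINK route itself, and the helper names (`strip_prefix`, `to_chrom_pos`, `duplicated_ids`) are hypothetical. Changing only the ID column keeps the row order intact, so the .bed file stays in sync. Either renaming scheme can create duplicate IDs (for example if both GSA-rsX and rsX are on the array, or two variants share a position), so a duplicate check is included.

```python
from collections import Counter


def strip_prefix(variant_id, prefix="GSA-"):
    """Drop the array prefix if present, otherwise return the ID unchanged."""
    return variant_id[len(prefix):] if variant_id.startswith(prefix) else variant_id


def to_chrom_pos(bim_line):
    """Rewrite column 2 of a .bim line as chrom:position (tab-separated output)."""
    fields = bim_line.split()
    fields[1] = f"{fields[0]}:{fields[3]}"
    return "\t".join(fields)


def duplicated_ids(variant_ids):
    """Return the IDs that occur more than once, sorted."""
    counts = Counter(variant_ids)
    return sorted(v for v, n in counts.items() if n > 1)


line = "19  GSA-rs117797881 88.57582    51991438    G   A"
print(strip_prefix("GSA-rs117797881"))  # rs117797881
print(to_chrom_pos(line).split("\t")[1])  # 19:51991438
```

If `duplicated_ids` returns anything after renaming, those variants need to be resolved (or given allele-aware IDs) before imputation.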

Would leaving the Chromosome 26 in have any impact on imputation?

If you use a public resource such as the Michigan or TOPMed Imputation Server, it will not accept your chr26 input or impute it.


Thanks for the advice. What about the third column (GS in PLINK)? I am used to it being 0. And one last question: what does it mean when there is a "." in place of an allele? For example, one allele is named but the other allele is a dot.


What about the third column (GS in PLINK)? I am used to it being 0.

The third column in a PLINK .bim file is the genetic position in morgans or centimorgans (it is safe to use a dummy value of '0').
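As a small illustration of the column layout, the sketch below names the six .bim fields and resets the third one to the dummy value. `BimRecord`, `parse_bim_line` and `zero_genetic_distance` are illustrative names only, and the code assumes every line has exactly six whitespace-separated fields.

```python
from typing import NamedTuple


class BimRecord(NamedTuple):
    chrom: str             # chromosome code
    variant_id: str        # e.g. an rsID
    genetic_distance: str  # morgans or centimorgans; '0' as a dummy value
    position: int          # base-pair coordinate
    allele1: str
    allele2: str


def parse_bim_line(line):
    """Split one .bim line into its six named fields."""
    chrom, variant_id, cm, pos, a1, a2 = line.split()
    return BimRecord(chrom, variant_id, cm, int(pos), a1, a2)


def zero_genetic_distance(line):
    """Return the line with the third column replaced by the dummy value '0'."""
    rec = parse_bim_line(line)._replace(genetic_distance="0")
    return "\t".join(str(value) for value in rec)


rec = parse_bim_line("19  rs7259137   88.47165    51972314    G   A")
print(rec.genetic_distance)  # 88.47165
print(zero_genetic_distance("19  rs7259137   88.47165    51972314    G   A"))
```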

And one last question: what does it mean when there is a "." in place of an allele? For example, one allele is named but the other allele is a dot.

I would remove that SNP from the data entirely using PLINK's --exclude option.
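A "." (or "0" in some PLINK files) usually marks an allele code that is missing, often because only one allele was observed in the data, though that is an assumption about how this particular file was produced. A minimal sketch for collecting such variants into an exclusion list follows. The function name and the `MISSING_ALLELE_CODES` set are hypothetical, and --exclude expects a text file with one variant ID per line.

```python
# Allele codes treated as missing here; adjust if the file uses something else.
MISSING_ALLELE_CODES = {".", "0"}


def ids_with_missing_allele(bim_lines, missing=MISSING_ALLELE_CODES):
    """Return IDs of variants where either allele column holds a missing code."""
    flagged = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        variant_id, allele1, allele2 = fields[1], fields[4], fields[5]
        if allele1 in missing or allele2 in missing:
            flagged.append(variant_id)
    return flagged


# Made-up rows for illustration; only the second has a missing allele.
rows = [
    "19  rs7259137   0   51972314    G   A",
    "19  rs_example  0   51980000    .   A",
]
print("\n".join(ids_with_missing_allele(rows)))  # rs_example
```

The printed list can be saved to a file and supplied to --exclude.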
