Question

No Overlap of Chromosome and Chromosome End Position in rsID Update

2.5 years ago

hi.there • 0

So I've moved this question to a new post. I am new to genetic data preprocessing so forgive me if this is a novice mistake.

I've been trying to update the rsIDs in an ADNI bim file based on chromosome and chromosome end position.

I've extracted the rsIDs, chromosomes, and chromosome end positions from the ADNI bim file and put them in a separate file.

I have also downloaded every rsID with its chromosome and chromosome positions from UCSC with these commands:

curl -O https://hgdownload.gi.ucsc.edu/goldenPath/hg38/database/snp151Common.txt.gz

gunzip -c snp151Common.txt.gz | cut -f 2,4,5,12,17 | grep single.exact | cut -f 1-3 > onlySNPs.tsv

sort -k3 -u onlySNPs.tsv | sort -k1,2 -u > onlySNPs.uniqLocAndId.tsv

I then tried to join the two files on chromosome and chromosome position by concatenating those two fields into a single key column in both files:

awk 'FNR==NR{a[$1]=$2 FS $3;next}{ print $0, a[$1]}' updatedonlySNPS.uniqLocAndId.txt fromOriginalBim.txt > combined.txt
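For reference, here is a minimal sketch of the same two-file awk lookup on toy data, building the key inside awk instead of relying on a pre-concatenated column. The file names `ucsc.tsv` and `bim.txt`, their contents, and the `chr`-prefix stripping are assumptions made for illustration only. UCSC tables name chromosomes like `chr1`, while PLINK .bim files usually use `1`, so keys built from the raw fields may never compare equal; whether that applies to the files in this post is not confirmed here. A difference in genome build between the two sources would also prevent matches and is not handled by this sketch.

```shell
# Toy UCSC-style lookup (hypothetical data): chrom, chromEnd, rsID, tab-separated.
printf 'chr1\t1000\trs111\nchr2\t2000\trs222\n' > ucsc.tsv

# Toy .bim-style file (hypothetical data): chrom, ID, cM, bp position, allele1, allele2.
printf '1\toldA\t0\t1000\tA\tG\n2\toldB\t0\t2500\tC\tT\n' > bim.txt

# First file: store the rsID under "chrom:pos", stripping a leading "chr".
# Second file: build the same key from columns 1 and 4; print "NA" when absent.
awk -F'\t' -v OFS='\t' '
    FNR == NR { chrom = $1; sub(/^chr/, "", chrom); rs[chrom ":" $2] = $3; next }
    { key = $1 ":" $4; print $0, (key in rs ? rs[key] : "NA") }
' ucsc.tsv bim.txt > combined.txt

cat combined.txt
```

Printing an explicit `NA` for unmatched rows makes it easy to count how many positions failed to match, rather than inferring it from a missing column.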

Unfortunately, no third column is generated, which suggests that the chromosome and chromosome end position keys in the two files do not overlap. As a sanity check, I searched for matches for the first 30 rows, and indeed there are none. Does anybody know what I may be doing wrong? I have also noticed that, in addition to rsIDs, the bim ID column contains IDs with common variant numbers prefixed with 'CNVI' and IDs prefixed with 'MITO'. Any help/education would be deeply appreciated.

plink chrom update rsid • 664 views

updated 2.5 years ago by Pierre Lindenbaum 165k • written 2.5 years ago by hi.there • 0

I have then tried to join the two files

Use join: https://linux.die.net/man/1/join
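A minimal sketch of how join could be applied here, assuming both files already carry a `chrom:pos` key in their first column and use the same chromosome naming. The file names and contents below are made up for illustration. Note that join requires both inputs to be sorted on the join field.

```shell
# Hypothetical key files: column 1 is "chrom:pos", column 2 is the ID.
printf '1:1000 rs111\n2:2000 rs222\n' > ucsc_keys.txt
printf '2:2500 oldB\n1:1000 oldA\n' > bim_keys.txt

# join expects both inputs sorted on the join field, in the same collation.
LC_ALL=C sort -k1,1 ucsc_keys.txt > ucsc_keys.sorted.txt
LC_ALL=C sort -k1,1 bim_keys.txt > bim_keys.sorted.txt

# -a 1 keeps unmatched lines from the first file (the bim keys);
# -e NA together with -o fills in the missing rsID so every row has three columns.
LC_ALL=C join -a 1 -e NA -o 0,1.2,2.2 bim_keys.sorted.txt ucsc_keys.sorted.txt > joined.txt

cat joined.txt
```

Rows ending in `NA` are bim positions with no counterpart in the lookup file, which gives a quick way to see how much of the bim file actually matched.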

2.5 years ago by Pierre Lindenbaum 165k