Explanation of Fields in Watson SNP GFF File
13.9 years ago
Andrea_Bio ★ 2.8k

Hello

Here are a few lines from the Watson SNP data in GFF format:

#chr1    JW    genotype    42101    42101    .    +    .    SNP rs2691277.1;alleles T/G;ref_allele T;ref_counts 0;oth_counts 1
#chr1    JW    genotype    45408    45408    .    +    .    SNP rs28396308;alleles C/T;ref_allele C;ref_counts 0;oth_counts 3
#chr1    JW    gt_novel    41921    41921    .    +    .    SNP BJW-1117373;alleles G/C;ref_allele G;ref_counts 2;oth_counts 2

What do the following mean?

a) A dbSNP ID with a period in it: rs2691277.1. There is no entry in dbSNP for rs2691277.1, but there is one for rs2691277.
b) The identifier BJW-1117373.

Many thanks

13.9 years ago

Looks to me like the BJW SNP is novel. Watson had a huge number of (then) novel SNPs in his genome. I also do not understand the ".1" suffix on that SNP. I would check whether rs2691277 maps to the same position as given in your example. If so, you can probably ignore the .1 suffix. Or is there an rs26912771 SNP on chr1? If so, maybe there is a typo in the data, which would be a headache!
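
A minimal sketch of that check: split off whatever follows the period so the base rs number can be looked up in dbSNP on its own. The function names `split_snp_id` and `looks_like_rs_id` are made up for illustration, and treating the text after the period as a "suffix" is an assumption, since nothing in this thread establishes what ".1" actually means.

```python
def split_snp_id(snp_id):
    """Split an ID such as 'rs2691277.1' into ('rs2691277', '1').

    Assumption: anything after the first period is a suffix of unknown
    meaning; IDs without a period get an empty suffix.
    """
    base, _, suffix = snp_id.partition(".")
    return base, suffix


def looks_like_rs_id(snp_id):
    """Return True if the base ID has the dbSNP form 'rs' + digits."""
    base, _ = split_snp_id(snp_id)
    return base.startswith("rs") and base[2:].isdigit()


# The three identifiers quoted in the question.
for snp_id in ["rs2691277.1", "rs28396308", "BJW-1117373"]:
    print(snp_id, split_snp_id(snp_id), looks_like_rs_id(snp_id))
```

Identifiers that fail the `rs` test (such as the BJW one) are the candidates for novel SNPs; the rest can be queried in dbSNP by their base ID.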

Thanks. How do I know which strand the SNPs are on? Can I assume they are on the forward strand, as a strand isn't given? At present I'm about to look at the SNPs manually to see if I can work out the strand.
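
For reference, GFF is a nine-column format, and the seventh column is the strand field; in the three lines quoted in the question that column holds "+". The sketch below pulls the columns apart so the strand value can be inspected per record. It assumes the columns are whitespace-separated as in the quoted lines and that the leading "#" is not part of the sequence name; `parse_gff_line` is a hypothetical helper, not part of any library. Whether "+" here reflects a real strand call or just a default written by the pipeline is not something the file alone can tell you.

```python
GFF_COLUMNS = [
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
]


def parse_gff_line(line):
    """Parse one GFF record into a dict keyed by column name.

    Assumptions: columns are separated by runs of whitespace, a leading
    '#' is dropped, and attributes look like 'key value;key value'.
    """
    fields = line.strip().lstrip("#").split(None, 8)
    if len(fields) != 9:
        raise ValueError("expected 9 GFF columns, got %d" % len(fields))
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    for item in record["attributes"].split(";"):
        key, _, value = item.strip().partition(" ")
        if key:
            attributes[key] = value
    record["attributes"] = attributes
    return record


line = ("#chr1    JW    genotype    42101    42101    .    +    .    "
        "SNP rs2691277.1;alleles T/G;ref_allele T;ref_counts 0;oth_counts 1")
record = parse_gff_line(line)
print(record["strand"], record["attributes"]["alleles"])
```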

I did check the mapping of the SNP before I posted; I probably should have told you. dbSNP says rs2691277 is only unambiguously mapped to a non-reference genome. I have just checked, and there is no SNP with the ID rs26912771.

I just figured out a possible meaning of "BJW." B is for Baylor College of Medicine where the sequencing was done. JW is Jim Watson.

How do I know which assembly the SNPs are mapped to?

Also, why is the number of SNPs so different between dbSNP and the GFF download? The GFF download has 2,060,544 mapped SNPs, whereas dbSNP has 3.2 million mapped SNPs. When you query dbSNP for mapped SNPs, you can get results for SNPs that map to an assembly other than the reference. This could account for some of the additional SNPs, but I don't think it can account for a million SNPs, can it?
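
One way to start narrowing this down is to tally the GFF records by their type column, since the quoted lines show at least two types (`genotype` and `gt_novel`). This is only a sketch: `count_by_type` is a hypothetical helper, it assumes the whitespace-separated layout of the lines quoted in the question, and it does not explain the gap by itself; it just gives per-type totals to compare against dbSNP query results.

```python
from collections import Counter


def count_by_type(lines):
    """Count GFF records per value of the third (type) column.

    Assumption: a leading '#' is dropped rather than treated as a
    comment marker, because the quoted records start with '#chr1'.
    """
    counts = Counter()
    for line in lines:
        fields = line.strip().lstrip("#").split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[2]] += 1
    return counts


sample = [
    "#chr1    JW    genotype    42101    42101    .    +    .    SNP rs2691277.1;alleles T/G;ref_allele T;ref_counts 0;oth_counts 1",
    "#chr1    JW    genotype    45408    45408    .    +    .    SNP rs28396308;alleles C/T;ref_allele C;ref_counts 0;oth_counts 3",
    "#chr1    JW    gt_novel    41921    41921    .    +    .    SNP BJW-1117373;alleles G/C;ref_allele G;ref_counts 2;oth_counts 2",
]
print(count_by_type(sample))
```

On the full download, the same function can be fed an open file handle instead of the sample list.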
