Indel Notation In Variant Calling
13.6 years ago
Andrea_Bio ★ 2.8k

Hello

I am sorry for the basic question, but I am struggling to find any details of the nomenclature used for indels by variant calling software. Unfortunately, I am unable to access the details of the software used for these variants at the moment, but I imagine the nomenclature is fairly standard. I tried looking in a Bioscope guide, but it wasn't explained there, so I assume it is so obvious to those in the field that it does not need explaining. However, I'm not from the field and haven't worked with indels before, only SNPs.

Ref     Genotype
C       */-G
T       +C/*

What does the star mean here, and why is it sometimes before the / and sometimes after it?

Also, what does it mean when the reference allele is a * or an N?

Thanks a lot

indel • 5.0k views

Which package is emitting these calls? That doesn't conform to the VCF 4.0 format as I understand it (http://www.1000genomes.org/node/101)


I think it's Bioscope, but I can't get hold of the data provider at present, hence my problem :(

13.6 years ago
Drio ▴ 920

It indicates you have a deletion (first row) or an insertion (second row) with respect to your reference genome. The '*' indicates that one of the two alleles in the genotype matches your reference genome. A good way to understand and confirm all this is to look at the alignments by eye (check the Broad's IGV or my favorite, samtools tview).
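If it helps, here is a minimal Python sketch of that reading of the notation. It assumes `*` is the reference allele, `+XYZ` is an insertion of XYZ and `-XYZ` is a deletion of XYZ; that is an interpretation of the two rows in the question, not a documented spec for the tool that produced them. The names `Allele`, `parse_allele` and `parse_genotype` are made up for illustration.

```python
from typing import NamedTuple, Tuple


class Allele(NamedTuple):
    kind: str   # "reference", "insertion" or "deletion"
    bases: str  # inserted or deleted bases; empty for a reference allele


def parse_allele(token: str) -> Allele:
    """Decode one side of a genotype, e.g. '*', '+C' or '-G'."""
    if token == "*":
        # Assumption: '*' means "same as the reference".
        return Allele("reference", "")
    if token.startswith("+") and len(token) > 1:
        return Allele("insertion", token[1:])
    if token.startswith("-") and len(token) > 1:
        return Allele("deletion", token[1:])
    raise ValueError(f"unrecognised allele token: {token!r}")


def parse_genotype(genotype: str) -> Tuple[Allele, Allele]:
    """Split 'a/b' into two alleles; the order is kept but not interpreted."""
    first, second = genotype.split("/")
    return parse_allele(first), parse_allele(second)


# The two rows from the question (reference base, genotype string).
for ref, genotype in [("C", "*/-G"), ("T", "+C/*")]:
    print(ref, genotype, parse_genotype(genotype))
```

Under these assumptions the first row decodes to one reference allele plus a deletion of G, and the second to an insertion of C plus one reference allele.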


Thanks for your answer. I don't have any alignments to check by eye, just this data. Why is the * sometimes before and sometimes after the /? If * means one of the genotypes matches the reference, why don't they include the reference allele in the genotype instead? What does it mean when the reference allele is * or N?


Thanks for the answer. I don't have any alignments to check, otherwise I could have worked it out from them :) Sadly, I just have this data. Why is the * sometimes before and sometimes after the slash? Does the order mean anything? What does it mean when the reference is a * or an N?


I don't think the order is relevant (check other fields). If there is an N, the reference genome has no determined nucleotide at that position (N is the IUPAC code for an unknown base). A * in the reference would indicate there is a homozygous insertion with respect to the reference.




OK, so something like this (ref genotype) * /-TT means both alleles had a TT deletion, and this * +C/ means both alleles had a C inserted (order doesn't matter). What notation is this, and what package is it created by?


Sorry, one more quick thing: what does it mean when a SNP has * for the reference, e.g. (ref genotype) * T or * Y?


If * for the reference means a homozygous insertion, then I don't understand this ref/genotype, which is a deletion: * -CCCC/-CCCCC. Also, what would this ref/genotype mean: * */-C?


Take this genotype: T -ACTC/*, where T is the reference and -ACTC/* is the genotype. My best guess is that ref = T and two alleles were observed: one the same as the ref (*) and one a deletion. Now take this genotype: * -ACTC/*, where * is the ref and -ACTC/* is the genotype. Based on your information, this genotype should be a homozygous deletion with respect to the reference and also have one allele the same as the reference. Those two conditions are mutually exclusive.
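One way to check the "a * reference means a homozygous indel" reading against the data is to derive zygosity from the genotype column alone and see whether the two agree. A rough Python sketch, assuming only that `*` inside the genotype stands for the reference allele; `zygosity` is a hypothetical helper, not part of any package:

```python
def zygosity(genotype: str) -> str:
    """Classify an 'a/b' genotype string by comparing its two alleles."""
    first, second = genotype.split("/")
    if first == second:
        # Both alleles identical: either both reference or the same indel.
        return "homozygous reference" if first == "*" else "homozygous indel"
    if "*" in (first, second):
        # Exactly one allele is the reference (assumed meaning of '*').
        return "heterozygous (reference + indel)"
    return "heterozygous (two different indels)"


# Genotype strings quoted earlier in this thread.
for gt in ["*/-G", "+C/*", "-CCCC/-CCCCC", "*/-C"]:
    print(gt, "->", zygosity(gt))
```

On this reading neither -CCCC/-CCCCC nor */-C comes out as homozygous, even though both were reported with * as the reference, so a * in the reference column cannot simply mean "homozygous indel".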
