Hi guys, I am using mutation data from ICGC. One of the mutation types is named "multiple base substitution". For an example, please see https://dcc.icgc.org/mutations/MU63993950
In a nutshell, this record says
at chr21:g.10848413,
AATCAAAAGGAATGGAATGGAATTTAATTGAATGGAATCTAAAGGAATG
mutates to
ACTCGAAAGGAGTGGAATGGAATCTAAAGGAAAT
The length of the sequence differs before and after the mutation. I am wondering why a substitution would change the length of the sequence. Also, why is the mutated site given as a single position, i.e. chr21:g.10848413?
So, can anybody help explain what "multiple base substitution" is? Thanks a lot.
Arvin
Thanks, Devon. Then it's confusing why ICGC uses such a name. I searched Google and Google Scholar but could not find any details about this.
I agree that their naming is a bit odd. What they're trying to convey is that there are stretches similar to the reference with bases occasionally missing or different. I suppose just calling this an indel would lose that distinction.
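To see this for the example in the question, here is a minimal sketch that compares the two sequences with Python's standard difflib module. Note that difflib is a generic string matcher, not a biological aligner, and this is certainly not how ICGC derives the record; the helper name describe_edits is made up for illustration. It only shows that the two strings break down into matching stretches interrupted by replaced and deleted bases.

```python
from difflib import SequenceMatcher

# Reference and mutated sequences copied from the question (MU63993950).
ref = "AATCAAAAGGAATGGAATGGAATTTAATTGAATGGAATCTAAAGGAATG"
alt = "ACTCGAAAGGAGTGGAATGGAATCTAAAGGAAAT"


def describe_edits(ref, alt):
    """Hypothetical helper: list (tag, ref_segment, alt_segment) tuples.

    tag is one of 'equal', 'replace', 'delete', 'insert', as reported by
    difflib. This is a plain string comparison, not a scored alignment.
    """
    matcher = SequenceMatcher(None, ref, alt, autojunk=False)
    return [
        (tag, ref[i1:i2], alt[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


edits = describe_edits(ref, alt)

print("ref length:", len(ref))
print("alt length:", len(alt))
for tag, ref_seg, alt_seg in edits:
    # '-' marks an empty segment, i.e. bases missing on that side.
    print(f"{tag:8s} ref={ref_seg or '-'} alt={alt_seg or '-'}")
```

The exact segmentation depends on difflib's matching heuristic, so a proper pairwise aligner may place the gaps differently. Either way, the result is a mix of matches, substitutions, and missing bases, which is what makes it behave like a complex indel rather than a pure substitution.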