Question

COSMIC and 1000 Genomes Variants


12.9 years ago

learnerforever ▴ 520

Is it possible that COSMIC and 1000 Genomes have some variants in common? If so, how should one interpret them, given that the 1000 Genomes data is supposedly from healthy people only?

Thanks, forum!

1000genomes variant • 4.2k views

updated 12.7 years ago by Jorge Amigo 14k • written 12.9 years ago by learnerforever ▴ 520

score 3 · Answer 1 · 2012-05-27

This is a conceptual question, and even though JC has already addressed it perfectly, I couldn't help writing a few lines about it. My motivation, and something I constantly do with my colleagues, is to work out a hypothesis before dealing with the data, because once you have the data you will always try to find a suitable explanation for it, and that explanation will surely be biased (Sherlock Holmes dixit). So, aside from all the errors that any of these projects may contain (in fact, ultrasequencing has a known error rate of ~1/1000 bases, very close to that of Sanger sequencing, and we still have to live with it), you have to bear in mind how these projects were conceived before handling their data, all the more so if you are planning to combine them.

1000g is about finding population variation down to very low frequencies across the whole genome. In fact, one of the main aims of the project was to help researchers coming from the common disease-common variant hypothesis (high-throughput genotyping, GWAS, ...) investigate the role of rare variants in complex disorders (in general, but of course including cancer). For that reason, the type of variation you would expect to find in it is common variation previously described in dbSNP plus rare variation, none of it necessarily pathogenic. We all carry particular mutations that may or may not be of clinical interest, but the aim of the project is not to detect those particular mutations; it is to detect the ones that have reached a sufficient frequency in the population to be studied in more depth.

COSMIC, on the other hand, aims to catalog only cancer-causing mutations, although in the process they have to ultrasequence the entire exome and then decide which of the detected variants are actually causative and which aren't. In essence, they do this by comparing variants obtained from normal cells with those obtained from tumor cells, and studying in depth what the two don't have in common. This is a very tough process, because they can't really discard all common variation, so the variant selection for this catalog still has to be thoroughly studied. For that reason this catalog will never be a "yes, everything causative is in, and everything that is not has been discarded" resource, but rather a very interesting source of strong candidates for the variants one may encounter in daily routine.
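The tumor-versus-normal comparison described above can be sketched as a simple set difference. This is only a toy illustration, not COSMIC's actual pipeline: the `Variant` tuple, the `somatic_candidates` helper, and all coordinates are made up for the example, and real somatic calling also has to deal with coverage, sequencing errors, and tumor purity.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """A variant identified by position and alleles (hypothetical toy type)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumor: Iterable[Variant], normal: Iterable[Variant]) -> Set[Variant]:
    """Return variants called in the tumor sample but absent from the matched normal."""
    return set(tumor) - set(normal)


# Invented calls for one patient: the normal sample carries germline variants only.
normal_calls = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr3", 7000, "G", "A"),
]
# The tumor carries the same germline variants plus one extra change.
tumor_calls = normal_calls + [Variant("chr1", 2000, "G", "A")]

candidates = somatic_candidates(tumor_calls, normal_calls)
print(sorted(candidates))
```

Anything left after the subtraction is only a candidate: it still has to be studied before it can be called causative, which is exactly the hard part mentioned above.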

Finally, a note on the effect of variants, especially when talking about complex diseases. It is commonly thought that a single variant can be the cause of a particular disease. Even people who know that a disease may be caused by a combination of variants tend to forget that this doesn't make each variant in that combination a pathogenic mutation per se. A variant may, for instance, modify the expression level of a certain gene, and this, combined with other expression changes that wouldn't necessarily be considered pathogenic on their own, could lead to an unhealthy phenotype. The problem is that we bioinformaticians are used to searching variant databases expecting binary results (pathogenic or not), but unfortunately, with complex diseases we always face a great deal of ambiguity, which is very difficult to design and program for. This would be, in my opinion, the main source of coincidences between a project that tries to discover rare variation and another that tries to investigate as far as possible the clinical meaning of all the variation found in a complex disease.

score 2 · Answer 2 · 2012-05-21


12.9 years ago

JC 13k

Yes, I think it is possible. (If I'm wrong, please correct me.) COSMIC reports SNPs identified in cancer samples that are highly prevalent across samples and may have an impact on cancer driver genes. But not all of those SNPs will be pathogenic; some are shared simply because they are common SNPs in the population. On the other hand, there is some chance that truly pathogenic SNPs are present in the 1000G samples but are silent or inactive (for now), because cancer is a complex disease.
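A minimal sketch of how one might check this overlap in practice, assuming both resources have already been reduced to `(chrom, pos, ref, alt)` keys. All records, the allele frequencies, the `split_shared` helper, and the 1% cutoff for "common" are invented for illustration; they are not taken from either database.

```python
# Toy 1000G-style table: variant key -> population allele frequency (invented values).
kg_af = {
    ("chr1", 1000, "A", "G"): 0.25,
    ("chr2", 5000, "C", "T"): 0.001,
    ("chr3", 7000, "G", "A"): 0.10,
}

# Toy COSMIC-style set of variant keys (invented values).
cosmic = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 5000, "C", "T"),
    ("chr4", 9000, "T", "C"),
}


def split_shared(cosmic_keys, population_af, common_threshold=0.01):
    """Split variants present in both sets into common and rare population variants.

    common_threshold is an assumed cutoff, not a value defined by either project.
    """
    shared = cosmic_keys & set(population_af)
    common = {key for key in shared if population_af[key] >= common_threshold}
    rare = shared - common
    return common, rare


common, rare = split_shared(cosmic, kg_af)
print("shared and common in the population:", sorted(common))
print("shared but rare in the population:", sorted(rare))
```

Shared variants that are common in the population are more likely to be ordinary polymorphisms, while the rare shared ones are the cases worth a closer look; neither group can be labeled pathogenic or benign from the overlap alone.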



This is correct. It is worth reading the info on the COSMIC website about how their database is produced: http://www.sanger.ac.uk/genetics/CGP/cosmic/add_info/ There may be errors in both data sets. It may also be that some somatic mutations are not drivers of cancer but were acquired during cancer development, and so happen to match normal human variation found in other individuals.

12.9 years ago by Laura ★ 1.8k