Forum:What does "integrative call set" mean to you?
8.6 years ago
Emily 24k

I am asking a question as a favour for an open variant data project, which I won't name (it's not Ensembl!). They use the phrase "integrated call set" to describe some of their flat datasets, but we're not actually sure if most people know what it means. Could people please tell me:

  • Do you already know what it means? Is there another word or phrase that you think would describe it better?
  • If you don't know what it means, what would you guess?

I will come back and give full disclosure about who this is for and what it actually means once I have a few responses. Sorry for being a bit sneaky and secretive; we find that the less information we give you, the more open your answers will be.

UPDATE: Thanks all for your help. The terminology comes from 1000 Genomes (http://www.1000genomes.org/), currently being brought into the larger IGSR project, which will accept further open population genome studies. It's used to mean the complete set of variants from a particular stage of the project, drawn from different variant calling methods; for example, all the variants from phase 3.

If you have suggestions for better terminology, we will look into this. We're trying to avoid words like "final", as that's just inviting mistakes and necessary updates.

dataset variant • 2.8k views

It sounds to me like they used multiple variant calling methods and integrated their outputs afterwards to make the final call set.


Agreed - anything else is likely to be "comparative" or "summation" but not truly "integrated"


I am not a native speaker, but given the intended meaning, I would propose the following alternatives:

  • union set (maybe the most correct in the mathematical sense; in the end, this is set theory)
  • joint variant calls
  • compound variant calls
  • combined
  • merged variant calls

There is nothing wrong with "integrated", which means the same as the terms above ('whole' by etymology), but in bioinformatics it has been used in terms such as 'data integration' and 'integrative bioinformatics', where it means bringing together very heterogeneous data types, whereas the variant calls are all very similar and differ only in origin.
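
To make the "union set" reading concrete, here is a minimal sketch in Python. It only illustrates the naive set-theoretic idea (take every variant reported by any caller and remember which callers reported it); it is not how 1000 Genomes or any real pipeline builds its call sets. The caller names, the toy variants, and the `integrate` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy call sets; each variant is keyed by (chrom, pos, ref, alt).
caller_a = {("1", 1000, "A", "G"), ("1", 2000, "C", "T")}
caller_b = {("1", 2000, "C", "T"), ("2", 500, "G", "GA")}


def integrate(call_sets):
    """Return {variant: sorted list of caller names that reported it}.

    The keys of the result are the set union of all input call sets.
    """
    support = defaultdict(set)
    for name, variants in call_sets.items():
        for variant in variants:
            support[variant].add(name)
    return {variant: sorted(names) for variant, names in support.items()}


merged = integrate({"caller_a": caller_a, "caller_b": caller_b})
for variant in sorted(merged):
    print(variant, merged[variant])
```

Keeping the list of supporting callers alongside each variant is a design choice for the sketch: it lets you see afterwards whether a call came from one method or several, which a bare union would throw away.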

8.6 years ago

A database of variants from different genome projects that visualizes them on genes, where the user can sort and filter variants and structural variants by where they come from, their scores, other properties, etc.?

8.6 years ago
pld 5.1k

I would assume that it was a variant call set made from the sum of variant calls collected from other sources. My underlying assumption is that the goal of such an undertaking would be to generate higher quality (with respect to currently available sources) call data using meta-analysis or some form of curation.

Additionally, as others have said, I would think that the calls ending up in the integrated sets would have inherited various annotations from the input sources, and that these annotations would have been processed through a similar meta-analysis/curation mechanism resulting in greater precision and confidence.

8.6 years ago
A. Domingues ★ 2.7k

No idea, but would "integrated call set" mean annotated with other sources of information? Expression data, GO, etc.

8.6 years ago
Michael 55k

I haven't heard that before. Given the context of variants (and therefore variant calls), I would guess:

Variant calls enriched with additional annotation from other sources (integrative bioinformatics), e.g. pathways, GO, functional prediction, gene expression measurements, phenotypes, etc.

8.6 years ago
ivivek_ngs ★ 5.2k

I read this term somewhere during my days of variant discovery analysis, but it was not a generic term. As I understood it, it means mapping your variants to different features and call sets to give a global picture in terms of associations, annotations, enrichment, pathways, dysregulation, disease-specific dysregulation, mapping of variants to eQTL studies, and protein dysregulation (downstream functional impacts). So, all in all, it is the number of features and attributes that can be associated with variants, giving a more comprehensive map. I could not find much just now, but I will update with more. I probably read it somewhere in the Broad GATK blogs, and also about tools that try to give a much larger view of variant annotation and functional association.

Yes, as I said, I read it more in contexts where variant studies try to minimize the false positive rate in the variant calls by associating them with features, or by checking for a higher Ti/Tv ratio among the variants for that particular genome, which indicates a lower error rate in the variant calls.
