Short answer:
Take the list of genes targeted by the capture probes (all vendors provide it) and use BioMart to get all the annotations you mention.
Long:
The definition of "clinical exome" is vague and confusing, and in my opinion the term should not be used.
First you need to decide what "clinical exome" means to you:
a) The whole exome? Then the attribute "clinical" does not mean anything. Perhaps it only states that the exome is going to be used in a diagnostic environment, and the threshold for base coverage would probably be over 40x.
b) You sequence an exome but only analyze the clinically relevant genes (400 to 7000 genes: the ones in OMIM, HGMD, ...)
- you can do a whole exome but only report mutations in those genes (and only in the core RefSeq transcripts)
- you can use an ad hoc exome capture set, like Agilent Focused Exome or Illumina TruSight One, that contains probes only for these genes, so you save on sequencing.
c) It is the whole exome, but the clinically relevant genes have improved capture and an effort has been made to achieve maximum coverage on them, as in Agilent SureSelect Clinical Research V2.
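As a rough illustration of the first variant of option (b), here is a minimal Python sketch that keeps only the variants falling in a chosen gene panel. The variant records, the `gene` key, and the gene symbols are hypothetical placeholders, not the output format of any particular annotation tool.

```python
def restrict_to_panel(variants, panel_genes):
    """Keep only variants whose gene symbol is in the panel (case-insensitive)."""
    panel = {gene.upper() for gene in panel_genes}
    return [v for v in variants if v["gene"].upper() in panel]


# Hypothetical annotated variants from a whole-exome run.
variants = [
    {"gene": "GENE_A", "change": "c.100A>G"},
    {"gene": "GENE_X", "change": "c.5del"},
    {"gene": "gene_b", "change": "c.77C>T"},
]

# Hypothetical panel of clinically relevant genes (e.g. taken from OMIM or HGMD).
reported = restrict_to_panel(variants, ["GENE_A", "GENE_B"])
for variant in reported:
    print(variant["gene"], variant["change"])
```

The sequencing itself is unchanged in this scenario; only the reporting step is narrowed to the panel.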
Now that you have chosen your capture set, how do you obtain the genes and data you want?
The list of genes is usually stored in files called manifests, or in files in BED format. You can obtain the Illumina TruSight One file from here, and the Agilent ones from SureDesign: register there, go to the "find_design" tab, select "SureSelect DNA", and click on the "Agilent Catalog" tab.
If you don't have bioinformatics knowledge, you can extract the gene column in Excel and use BioMart (Filters -> Gene -> input external references [select "gene name" from the options]), pasting the names in groups of 500.
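If you are comfortable with a little scripting, the same extraction can be sketched in Python instead of Excel. This is only a sketch under assumptions: it takes the gene name to be in the fourth tab-separated column of the BED file, which varies between vendors, so check your file first. The function names and the example rows are made up for illustration.

```python
import io


def gene_names_from_bed(handle, name_column=3):
    """Collect unique gene names, in first-seen order, from a BED-like file.

    name_column is 0-based; column 3 (the BED "name" field) is an assumption.
    """
    seen = {}
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and the header lines some BED files carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) <= name_column:
            continue
        name = fields[name_column].strip()
        if name:
            seen.setdefault(name, None)
    return list(seen)


def batches(items, size=500):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up example standing in for a vendor BED file.
example = io.StringIO(
    "track name=example\n"
    "chr1\t100\t200\tGENE_A\n"
    "chr1\t300\t400\tGENE_A\n"
    "chr2\t500\t600\tGENE_B\n"
)
genes = gene_names_from_bed(example)
for batch in batches(genes):
    # Each batch can be pasted into the BioMart filter box.
    print("\n".join(batch))
```

For a real file, replace the `io.StringIO` example with `open("your_design.bed")`, where the file name is whatever you downloaded from the vendor.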
I feel that the term 'clinical exome' is very misleading and very ambiguous.
Clinical Exome Sequencing (CES) is merely exome sequencing, i.e., the sequencing of protein-coding genes. It is very misleading to call it 'clinical' because it has been shown time and time again that even intergenic mutations can play key roles in disease, even fully explain a disease mechanism in some cases.
See UCLA's definition here: http://pathology.ucla.edu/clinical-exome-sequencing
I was wondering if CES is somehow a subset of WES. Not sure how anyone got the 3000 number. Ideally, one would go (depending on how confident one is) from a targeted gene capture -> whole exome -> whole genome. Maybe the targeted part is what's being called the CES? In which case, CES would refer to the set of all genes that could contribute to a specific phenotype, and could be used on a slightly larger scale for personalized genomics.
I think of CES as exome sequencing done for clinical research or diagnostic purposes. I may do only 20 genes and call it a clinical exome, if my interests are limited to those genes.
My thoughts exactly.
Yes, but in many cases, studies just show an association between a variant in a given exon and a disease, i.e., there's no concrete proof.
Just for the OP to expand on this. It really does not appear to be any different from standard exome sequencing, although one must be aware, in this sense, that different exome-seq capture kits target different regions and are thus sequencing different numbers of genes.
For the CES, which is CLIA-validated according to the authors in their published manuscript HERE, they state:
AKA exonic variants. SMH