Hi, I am working on the CCLE cell line mutation data. The mutation data is a summarized list, with each line representing a specific mutation in a specific cell line, along with other annotations, including:
Hugo_Symbol Entrez_Gene_Id NCBI_Build Chromosome Start_position End_position Strand Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 dbSNP_RS dbSNP_Val_Status Genome_Change Annotation_Transcript Tumor_Sample_Barcode cDNA_Change Codon_Change Protein_Change isDeleterious isTCGAhotspot TCGAhsCnt isCOSMIChotspot COSMIChsCnt ExAC_AF WES_AC WGS_AC SangerWES_AC SangerRecalibWES_AC RNAseq_AC HC_AC RD_AC
Now I want to assign a binary value or a probability score to denote whether a mutation is a driver in this specific cell line. I have checked many posts and papers, but their methods are all based on the original sequencing data rather than on such a mutation list. Any ideas on how to do this? Many thanks in advance!
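To make the question concrete, here is a minimal sketch of the kind of per-mutation score I have in mind, using only the annotation columns listed above. The weights, the rule itself, the tab-separated layout and the TRUE/FALSE encoding are all my own assumptions for illustration, not an established method, and the rows are toy placeholders rather than real CCLE entries.

```python
import csv
import io

# Hypothetical weights, chosen only for illustration (they sum to 1.0).
WEIGHTS = {"isTCGAhotspot": 0.4, "isCOSMIChotspot": 0.4, "isDeleterious": 0.2}


def is_true(value):
    """Interpret common truthy encodings; the real file's encoding is an assumption."""
    return str(value).strip().lower() in {"true", "t", "1", "yes"}


def driver_score(row):
    """Sum the weights of the annotation flags that are set for one mutation."""
    total = sum(w for col, w in WEIGHTS.items() if is_true(row.get(col, "")))
    return round(total, 2)


def score_mutations(handle):
    """Return (cell line, gene, protein change, score) for each row of the list."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (
            row["Tumor_Sample_Barcode"],
            row["Hugo_Symbol"],
            row["Protein_Change"],
            driver_score(row),
        )
        for row in reader
    ]


# Toy rows with placeholder names, not real data.
toy_table = (
    "Hugo_Symbol\tTumor_Sample_Barcode\tProtein_Change\t"
    "isDeleterious\tisTCGAhotspot\tisCOSMIChotspot\n"
    "GENE_A\tCELL_1\tp.X1Y\tFALSE\tTRUE\tTRUE\n"
    "GENE_B\tCELL_1\tp.Q5R\tTRUE\tFALSE\tFALSE\n"
    "GENE_C\tCELL_2\tp.Z9W\tFALSE\tFALSE\tFALSE\n"
)

scores = score_mutations(io.StringIO(toy_table))
for record in scores:
    print(record)
```

A binary call would then just be a threshold on this score, but I do not know whether such a rule is defensible, which is really what I am asking.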
Hi, currently I am interested in 42 cell lines, each harbouring several hundred to over one thousand mutations. All of the mutations are located in coding regions, and germline mutations have been filtered out. I understand that most germline mutations are not driver mutations, but for the mutations that remain in my list I am wondering how to decide whether they are drivers or not.
I am new to this area, so please point out if I am wrong. My understanding is that a gene harboring driver mutations is a driver gene, but a driver gene can also harbor passenger mutations. Also, a mutation being a "driver" in one cell line doesn't mean it is also a driver in another cell line. When you say "how frequently the gene is mutated", do you mean "how many cancer patients carry the mutation"? So technically, if a mutation is carried by all the cancer patients but by none of the individuals in a healthy cohort, can we call this mutation a driver?
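To check that I understand the two possible readings of "how frequently the gene is mutated", here is a small sketch that counts recurrence both per specific mutation and per gene across cell lines. The function name and the toy records are hypothetical placeholders; the tuples stand in for the Tumor_Sample_Barcode, Hugo_Symbol and Protein_Change columns of my list.

```python
from collections import defaultdict


def recurrence(records):
    """Count distinct cell lines per (gene, protein change) and per gene.

    records: iterable of (cell_line, gene, protein_change) tuples.
    """
    lines_per_mutation = defaultdict(set)
    lines_per_gene = defaultdict(set)
    for cell_line, gene, change in records:
        lines_per_mutation[(gene, change)].add(cell_line)
        lines_per_gene[gene].add(cell_line)
    mutation_counts = {key: len(lines) for key, lines in lines_per_mutation.items()}
    gene_counts = {key: len(lines) for key, lines in lines_per_gene.items()}
    return mutation_counts, gene_counts


# Toy records with placeholder names, not real data.
toy_records = [
    ("CELL_1", "GENE_A", "p.X1Y"),
    ("CELL_2", "GENE_A", "p.X1Y"),
    ("CELL_3", "GENE_A", "p.Z9W"),
    ("CELL_3", "GENE_B", "p.Q5R"),
]

mutation_counts, gene_counts = recurrence(toy_records)
print(mutation_counts)
print(gene_counts)
```

In this toy case GENE_A is mutated in three cell lines, but the specific change p.X1Y is seen in only two of them, which is the distinction I am trying to get at.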