Hi,
I'm working with exome sequencing data from very early tumours and would like to identify driver genes. I've come across the SomInaClust package, which seems well suited to my needs, but I'm having problems getting it to recognise my MAF files. So far, I've used maftools to read MAF files into R and that has always worked without any issues. However, when I pass these MAF files to SomInaClust, it stops with an error message saying that it doesn't recognise the columns. I've included an example below, created by reading in a breast cancer MAF file that comes with the SomInaClust package. This MAF works well with any maftools function but fails with SomInaClust.
I'm very new to R and I'm convinced that only a tiny tweak is needed, but I cannot seem to make it work. How do I make SomInaClust recognise the columns of the input MAF? Any help is highly appreciated. Thanks a lot!
library(maftools)
library(SomInaClust)
brca <- read.maf(maf_brca) #example MAF from the SomInaClust package
#run an example command using maftools
brca_oncodriveClust <- oncodrive(brca, AACol = "amino_acid_change_WU", minMut = 5, pvalMethod = "zscore")
Estimating background scores from synonymous variants..
Not enough genes to build background. Using predefined values. (Mean = 0.279; SD = 0.13)
Estimating cluster scores from non-syn variants..
|=====================================================================================================================================================| 100%
Comparing with background model and estimating p-values..
Done !
getFields(brca)
[1] "Hugo_Symbol" "Entrez_Gene_Id" "Center" "NCBI_Build"
[5] "Chromosome" "Start_Position" "End_Position" "Strand"
[9] "Variant_Classification" "Variant_Type" "Reference_Allele" "Tumor_Seq_Allele1"
[13] "Tumor_Seq_Allele2" "dbSNP_RS" "dbSNP_Val_Status" "Tumor_Sample_Barcode"
[17] "Matched_Norm_Sample_Barcode" "Match_Norm_Seq_Allele1" "Match_Norm_Seq_Allele2" "Tumor_Validation_Allele1"
[21] "Tumor_Validation_Allele2" "Match_Norm_Validation_Allele1" "Match_Norm_Validation_Allele2" "Verification_Status"
[25] "Validation_Status" "Mutation_Status" "Sequencing_Phase" "Sequence_Source"
[29] "Validation_Method" "Score" "BAM_File" "Sequencer"
[33] "Tumor_Sample_UUID" "Matched_Norm_Sample_UUID" "Chromosome" "start_WU"
[37] "stop_WU" "reference_WU" "variant_WU" "type_WU"
[41] "gene_name_WU" "transcript_name_WU" "transcript_species_WU" "transcript_source_WU"
[45] "transcript_version_WU" "strand_WU" "transcript_status_WU" "trv_type_WU"
[49] "c_position_WU" "amino_acid_change_WU" "ucsc_cons_WU" "domain_WU"
[53] "all_domains_WU" "deletion_substructures_WU" "transcript_error"
brca_Sominaclust <- SomInaClust_det(brca, calculate_CDS = TRUE, convert_genenames_to_HGNC=FALSE) #trying the file as input for SomInaClust
Error in SomInaClust(maf = maf, database = database, define_clustersize = FALSE, : Make sure the following columns are present in the maf file (column names need to be exact): Hugo_Symbol, Tumor_Sample_Barcode, Variant_Classification, Start_Position
sessionInfo()
R version 3.4.0 (2017-04-21)
attached base packages: [1] parallel stats graphics grDevices utils datasets methods base
other attached packages: [1] SomInaClust_1.0.0 maftools_1.2.30 Biobase_2.36.2 BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] nlme_3.1-131 bitops_1.0-6 matrixStats_0.52.2 bit64_0.9-7 doParallel_1.0.11
[6] RColorBrewer_1.1-2 prabclus_2.2-6 GenomeInfoDb_1.12.3 tools_3.4.0 R6_2.2.2
[11] DBI_0.7 lazyeval_0.2.1 colorspace_1.3-2 trimcluster_0.1-2 nnet_7.3-12
[16] GetoptLong_0.1.6 gridExtra_2.3 bit_1.1-12 compiler_3.4.0 DelayedArray_0.2.7
[21] pkgmaker_0.22 labeling_0.3 slam_0.1-40 rtracklayer_1.36.6 diptest_0.75-7
[26] scales_0.5.0 DEoptimR_1.0-8 mvtnorm_1.0-6 robustbase_0.92-8 NMF_0.20.6
[31] stringr_1.2.0 digest_0.6.12 Rsamtools_1.28.0 cometExactTest_0.1.3 XVector_0.16.0
[36] pkgconfig_2.0.1 changepoint_2.2.2 BSgenome_1.44.2 rlang_0.1.4 GlobalOptions_0.0.12
[41] RSQLite_2.0 shape_1.4.3 bindr_0.1 zoo_1.8-0 mclust_5.4
[46] BiocParallel_1.10.1 DPpackage_1.1-7.1 dendextend_1.6.0 dplyr_0.7.4 VariantAnnotation_1.22.3
[51] RCurl_1.95-4.8 magrittr_1.5 modeltools_0.2-21 GenomeInfoDbData_0.99.0 wordcloud_2.5
[56] Matrix_1.2-11 Rcpp_0.12.14 munsell_0.4.3 S4Vectors_0.14.7 viridis_0.4.0
[61] stringi_1.1.6 whisker_0.3-2 MASS_7.3-47 SummarizedExperiment_1.6.5 zlibbioc_1.22.0
[66] flexmix_2.3-14 plyr_1.8.4 grid_3.4.0 blob_1.1.0 ggrepel_0.7.0
[71] lattice_0.20-35 cowplot_0.9.1 Biostrings_2.44.2 splines_3.4.0 GenomicFeatures_1.28.5
[76] circlize_0.4.2 ComplexHeatmap_1.14.0 GenomicRanges_1.28.6 rjson_0.2.15 fpc_2.1-10
[81] rngtools_1.2.4 biomaRt_2.32.1 reshape2_1.4.2 codetools_0.2-15 stats4_3.4.0
[86] XML_3.98-1.9 glue_1.2.0 data.table_1.10.4-3 foreach_1.4.3 gtable_0.2.0
[91] kernlab_0.9-25 assertthat_0.2.0 ggplot2_2.2.1 gridBase_0.4-7 xtable_1.8-2
[96] class_7.3-14 survival_2.41-3 viridisLite_0.2.0 tibble_1.3.4 iterators_1.0.8
[101] GenomicAlignments_1.12.2 AnnotationDbi_1.38.2 registry_0.5 memoise_1.1.0 IRanges_2.10.5
[106] bindrcpp_0.2 cluster_2.0.6
Hi, You're passing a MAF object as input to the SomInaClust_det function. My guess is that it requires a MAF file as input. Maybe try this and see...
This gives the following error message.
Hi, Were you able to solve your problems with the SomInaClust package? I'm trying to use SomInaClust but I get the same errors. I could not resolve the "argument "parameter_default" is missing, with no default" error. Could you please share the commands that worked for you?
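For anyone landing here with the column error from the original post: that message is consistent with SomInaClust_det receiving a maftools MAF object instead of a plain table, because an S4 object has no column names for the function to check. Below is a minimal sketch in base R of that situation. ToyMAF and missing_required_cols are hypothetical stand-ins written for this illustration; they are not part of maftools or SomInaClust, and the toy rows are made-up values. The required column names are taken from the error message above.

```r
library(methods)

# Columns named in the SomInaClust error message in the original post.
required_cols <- c("Hugo_Symbol", "Tumor_Sample_Barcode",
                   "Variant_Classification", "Start_Position")

# Hypothetical helper: which required columns are missing from x?
missing_required_cols <- function(x) {
  setdiff(required_cols, colnames(x))
}

# A toy MAF-like table (illustrative values only).
toy_maf <- data.frame(
  Hugo_Symbol            = c("TP53", "PIK3CA"),
  Tumor_Sample_Barcode   = c("S1", "S2"),
  Variant_Classification = c("Missense_Mutation", "Missense_Mutation"),
  Start_Position         = c(100L, 200L),
  stringsAsFactors       = FALSE
)

# Hypothetical stand-in for an S4 container that keeps its table in a slot.
setClass("ToyMAF", representation(data = "data.frame"))
toy_obj <- new("ToyMAF", data = toy_maf)

missing_required_cols(toy_maf)       # plain table: nothing missing
missing_required_cols(toy_obj)       # S4 wrapper: no colnames, so all four are reported
missing_required_cols(toy_obj@data)  # table pulled out of the slot: nothing missing
```

If the same reasoning applies to the real objects, passing the original table (maf_brca) or the table stored inside the MAF object (in maftools this is, as far as I know, the data slot, i.e. brca@data) would be the thing to try instead of brca itself. I have not verified this against SomInaClust 1.0.0, and it does not address the "parameter_default" error mentioned above, whose cause is not shown in this thread.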