Hello All,
I am trying to use Genome MuSiC play to get a list of significantly mutated somatic variants. I have normal, primary tumor, and metastasized tumor samples from each patient. I have VCFs from VarScan (somatic), and I have converted them to MAF using the tool vcf2maf: https://github.com/ckandoth/vcf2maf
I get the following error when I run MuSiC on one of my samples:
Running genome music proximity...
Could not find required additional columns in um7b_mpileup_varscan_somatic.snpeff.maf
ERROR: Command genome music proximity did not return a true value.
I have checked the MAF file specifications here: "Are There Any Caveats When Creating A Maf File For Music?". I have the first 34 columns in the proper order and have also provided the additional columns required by the tools cosmic and proximity after column 34. Here are my column headers:
Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome Start_Position End_Position Strand Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS dbSNP_Val_Status Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Match_Norm_Seq_Allele1 Match_Norm_Seq_Allele2 Tumor_Validation_Allele1 Tumor_Validation_Allele2 Match_Norm_Validation_Allele1 Match_Norm_Validation_Allele2 Verification_Status Validation_Status Mutation_Status Sequencing_Phase Sequence_Source Validation_Method Score BAM_File Sequencer Tumor_Sample_UUID Matched_Norm_Sample_UUID HGVSc HGVSp Transcript_ID transcript_name Exon_Number t_depth t_ref_count t_alt_count n_depth n_ref_count n_alt_count all_effects Effect Effect_Impact Functional_Class Codon_Change amino_acid_change Amino_Acid_Length strand Gene_Name Transcript_BioType Gene_Coding Transcript_ID Exon_Rank Genotype_Number ERRORS WARNINGS
In order:
- Hugo_Symbol
- Entrez_Gene_Id
- Center
- NCBI_Build
- Chromosome
- Start_Position
- End_Position
- Strand
- Variant_Classification
- Variant_Type
- Reference_Allele
- Tumor_Seq_Allele1
- Tumor_Seq_Allele2
- dbSNP_RS
- dbSNP_Val_Status
- Tumor_Sample_Barcode
- Matched_Norm_Sample_Barcode
- Match_Norm_Seq_Allele1
- Match_Norm_Seq_Allele2
- Tumor_Validation_Allele1
- Tumor_Validation_Allele2
- Match_Norm_Validation_Allele1
- Match_Norm_Validation_Allele2
- Verification_Status
- Validation_Status
- Mutation_Status
- Sequencing_Phase
- Sequence_Source
- Validation_Method
- Score
- BAM_File
- Sequencer
- Tumor_Sample_UUID
- Matched_Norm_Sample_UUID
- HGVSc
- HGVSp
- Transcript_ID
- transcript_name
- Exon_Number
- t_depth
- t_ref_count
- t_alt_count
- n_depth
- n_ref_count
- n_alt_count
- all_effects
- Effect
- Effect_Impact
- Functional_Class
- Codon_Change
- amino_acid_change
- Amino_Acid_Length
- strand
- Gene_Name
- Transcript_BioType
- Gene_Coding
- Transcript_ID
- Exon_Rank
- Genotype_Number
- ERRORS
- WARNINGS
I am not exactly sure what I am missing here. Can someone please help?
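A quick way to inspect a header like the one above is to check it mechanically. The sketch below is a minimal, hedged example: it assumes the MAF header is tab-delimited, and the list of required names passed in is an assumption for illustration (in particular, c_position is not mentioned anywhere in this thread), so verify the real list against the genome music proximity documentation. It also reports repeated names and names that differ only by case, since the header posted above contains Transcript_ID twice and both Strand and strand.

```python
from collections import Counter


def check_maf_header(header_line, required):
    """Report missing, duplicated, and case-colliding column names
    in a tab-separated MAF header line (matching is case-sensitive)."""
    cols = header_line.rstrip("\n").split("\t")
    counts = Counter(cols)

    # Group names that are identical once lowercased.
    lower_groups = {}
    for col in cols:
        lower_groups.setdefault(col.lower(), set()).add(col)

    return {
        "missing": [name for name in required if name not in counts],
        "duplicated": sorted(c for c, n in counts.items() if n > 1),
        "case_collisions": sorted(
            sorted(group) for group in lower_groups.values() if len(group) > 1
        ),
    }


# Shortened header built from names that appear in the post above.
demo_header = "\t".join([
    "Hugo_Symbol", "Strand", "Transcript_ID", "transcript_name",
    "amino_acid_change", "strand", "Transcript_ID",
])

# ASSUMPTION: illustrative required list; check the tool's own help text.
assumed_required = ["transcript_name", "amino_acid_change", "c_position"]

print(check_maf_header(demo_header, assumed_required))
```

To run it on a real file, pass the first non-comment line of the MAF as header_line.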
I took your advice and ran the individual modules separately. After some minor hiccups I could finally run bmr calc-covg, bmr calc-bmr, and smg. I now have a list of SMGs for my samples. Is there a way to correlate the SMGs with the sample data? For example, this is a gene record from the smg file:
Is there a way to know which samples had variants in ERCC5? Of course I can look into my MAF file and figure that out, but I was wondering whether there is some other way, such as a summary table.
Thanks a lot for your help.
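For the "look into my MAF file" route, the lookup can be scripted. This is a minimal sketch, not part of MuSiC: it assumes a tab-delimited MAF with the standard Hugo_Symbol and Tumor_Sample_Barcode columns, skips comment lines starting with #, and uses made-up sample names and a placeholder second gene (GENE2) for the demo.

```python
import csv
import io
from collections import defaultdict


def samples_by_gene(maf_handle):
    """Map each gene symbol to the sorted list of tumor samples
    that have at least one variant in it."""
    reader = csv.DictReader(
        (line for line in maf_handle if not line.startswith("#")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # MAF fields are not quoted
    )
    table = defaultdict(set)
    for row in reader:
        table[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])
    return {gene: sorted(samples) for gene, samples in table.items()}


# Hypothetical in-memory MAF; sample names and GENE2 are placeholders.
demo_maf = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "ERCC5\tsample_A\n"
    "ERCC5\tsample_B\n"
    "GENE2\tsample_A\n"
    "ERCC5\tsample_A\n"
)

summary = samples_by_gene(demo_maf)
print(summary["ERCC5"])
```

For a real file, open the MAF with open(path, newline="") and pass the handle in. Restricting the result to the genes listed in the smg output gives a small per-gene summary table.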
The MuSiC suite includes a tool called mutation-relation that looks for significant concurrence or mutual exclusivity of mutated genes. This tool takes a MAF as input and has an option to save a mutation-matrix-file, a sample-vs-gene matrix. More info down here.
Hi mkaushal,
Could you please post the commands you used to run the individual modules bmr calc-covg, bmr calc-bmr, and smg?
How did you prepare the data? I made the ROI, MAF, and list files to run calc-covg, but I keep getting an error, which I have described here: who can give me the data of genome music bmr calc-covg
Hi,
Commands I used:
Things to remember:
After the final MAF files are created, change the MAF files for running MuSiC:
Note that the column names mentioned above should all be in lower case.
Prepare a BAM list with all BAM files.
Make sure all BAM index files are also present in the respective directories. Make sure there are no extra newlines in the BAM list file.
The ROI file is one-based.
Hope this helps.
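Two of the reminders above can be checked mechanically. The sketch below is hedged and not part of MuSiC. It assumes the BAM list has three tab-separated fields per line (sample name, normal BAM, tumor BAM) and that an index sits next to each BAM as either file.bam.bai or file.bai. It also assumes the ROI file uses the columns chromosome, start, stop, gene name, and that the source regions are in 0-based, half-open BED coordinates. Check all of these against the MuSiC documentation before relying on the output.

```python
import os


def check_bam_list(path):
    """Return a list of problems found in a BAM list file:
    blank lines, wrong field counts, missing BAMs or indexes."""
    problems = []
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            stripped = line.rstrip("\n")
            if not stripped.strip():
                problems.append(f"line {lineno}: blank line")
                continue
            fields = stripped.split("\t")
            if len(fields) != 3:  # ASSUMPTION: sample, normal BAM, tumor BAM
                problems.append(
                    f"line {lineno}: expected 3 tab-separated fields, "
                    f"got {len(fields)}"
                )
                continue
            for bam in fields[1:]:
                # Two common index naming conventions.
                candidates = (bam + ".bai", os.path.splitext(bam)[0] + ".bai")
                if not os.path.exists(bam):
                    problems.append(f"line {lineno}: missing BAM {bam}")
                elif not any(os.path.exists(c) for c in candidates):
                    problems.append(f"line {lineno}: no index found for {bam}")
    return problems


def bed_to_roi_line(bed_line):
    """Convert one 0-based, half-open BED line (chrom, start, end, name)
    to a 1-based, inclusive line by adding 1 to the start only."""
    chrom, start, end, name = bed_line.rstrip("\n").split("\t")[:4]
    return f"{chrom}\t{int(start) + 1}\t{end}\t{name}"


# Example (hypothetical path): print(check_bam_list("bam_list.txt"))
```

An empty list from check_bam_list means none of these particular problems were found; it does not guarantee that calc-covg will accept the file.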
Thanks mkaushal !!
I am currently running "genome music smg" and hope it runs without further problems.
Thanks also for the "things to remember" info.
I also plan to upload a detailed tutorial on it.
Thanks!