Problems running Genome MuSiC Play
10.1 years ago
mkaushal ▴ 120

Hello All,

I am trying to use Genome MuSiC play to get a list of significantly mutated somatic variants. I have normal, primary tumor and metastasized tumor samples from each patient. I have VCFs from VarScan (somatic), and I have converted them to MAF using the tool vcf2maf: https://github.com/ckandoth/vcf2maf

I get the following error when I run MuSiC with one of my samples:

Running genome music proximity...
Could not find required additional columns in um7b_mpileup_varscan_somatic.snpeff.maf
ERROR: Command genome music proximity did not return a true value.

I have checked the MAF file specifications in the post "Are There Any Caveats When Creating A Maf File For Music?". I have the first 34 columns in the proper order, and I have also provided the additional columns required by the cosmic and proximity tools after column 34. Here are my column headers:

Hugo_Symbol     Entrez_Gene_Id  Center  NCBI_Build      Chromosome      Start_Position  End_Position    Strand  Variant_Classification  Variant_Type    Reference_Allele      Tumor_Seq_Allele1       Tumor_Seq_Allele2       dbSNP_RS        dbSNP_Val_Status        Tumor_Sample_Barcode    Matched_Norm_Sample_Barcode  Match_Norm_Seq_Allele1   Match_Norm_Seq_Allele2  Tumor_Validation_Allele1        Tumor_Validation_Allele2        Match_Norm_Validation_Allele1   Match_Norm_Validation_Allele2 Verification_Status     Validation_Status       Mutation_Status Sequencing_Phase        Sequence_Source Validation_Method       Score   BAM_File      Sequencer       Tumor_Sample_UUID       Matched_Norm_Sample_UUID        HGVSc   HGVSp   Transcript_ID   transcript_name Exon_Number     t_depth t_ref_count   t_alt_count     n_depth n_ref_count     n_alt_count     all_effects     Effect  Effect_Impact   Functional_Class        Codon_Change    amino_acid_change     Amino_Acid_Length       strand  Gene_Name       Transcript_BioType      Gene_Coding     Transcript_ID   Exon_Rank       Genotype_Number ERRORS  WARNINGS

In order:

  1. Hugo_Symbol
  2. Entrez_Gene_Id
  3. Center
  4. NCBI_Build
  5. Chromosome
  6. Start_Position
  7. End_Position
  8. Strand
  9. Variant_Classification
  10. Variant_Type
  11. Reference_Allele
  12. Tumor_Seq_Allele1
  13. Tumor_Seq_Allele2
  14. dbSNP_RS
  15. dbSNP_Val_Status
  16. Tumor_Sample_Barcode
  17. Matched_Norm_Sample_Barcode
  18. Match_Norm_Seq_Allele1
  19. Match_Norm_Seq_Allele2
  20. Tumor_Validation_Allele1
  21. Tumor_Validation_Allele2
  22. Match_Norm_Validation_Allele1
  23. Match_Norm_Validation_Allele2
  24. Verification_Status
  25. Validation_Status
  26. Mutation_Status
  27. Sequencing_Phase
  28. Sequence_Source
  29. Validation_Method
  30. Score
  31. BAM_File
  32. Sequencer
  33. Tumor_Sample_UUID
  34. Matched_Norm_Sample_UUID
  35. HGVSc
  36. HGVSp
  37. Transcript_ID
  38. transcript_name
  39. Exon_Number
  40. t_depth
  41. t_ref_count
  42. t_alt_count
  43. n_depth
  44. n_ref_count
  45. n_alt_count
  46. all_effects
  47. Effect
  48. Effect_Impact
  49. Functional_Class
  50. Codon_Change
  51. amino_acid_change
  52. Amino_Acid_Length
  53. strand
  54. Gene_Name
  55. Transcript_BioType
  56. Gene_Coding
  57. Transcript_ID
  58. Exon_Rank
  59. Genotype_Number
  60. ERRORS
  61. WARNINGS

I am not exactly sure what I am missing here. Can someone please help?

music software-error • 3.4k views
10.1 years ago

I made a mistake in the post "Are There Any Caveats When Creating A Maf File For Music?". Rather than strand, you actually need c_position. So just rename Codon_Change in your MAF to c_position.

Also note that "genome music play" tries to run each tool in the MuSiC suite one-by-one, but it breaks if any of the tools fail. So try running the tools separately starting with bmr calc-covg, bmr calc-bmr, smg, pathscan, etc.
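For anyone hitting the same "Could not find required additional columns" error, here is a minimal sketch of a header check. It assumes, based only on this thread and not on the MuSiC documentation, that the extra columns needed after column 34 are transcript_name, amino_acid_change and c_position. The function name `fix_maf_header` and the constants are made up for illustration.

```python
# Assumption drawn from this thread, not from the MuSiC documentation:
# these columns must appear somewhere after the 34 standard MAF columns.
REQUIRED_EXTRA = ("transcript_name", "amino_acid_change", "c_position")

# Renames suggested in this thread (old header name -> new header name).
RENAMES = {
    "Codon_Change": "c_position",
    "Amino_Acid_Change": "amino_acid_change",
}


def fix_maf_header(header):
    """Apply the renames and report problems in a list of MAF column names.

    Returns (new_header, missing, duplicates), where `missing` lists the
    required extra columns not found after column 34 and `duplicates`
    lists column names that occur more than once (case-sensitive).
    """
    new_header = [RENAMES.get(name, name) for name in header]
    extra = new_header[34:]
    missing = [name for name in REQUIRED_EXTRA if name not in extra]

    seen = set()
    duplicates = []
    for name in new_header:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return new_header, missing, duplicates
```

Run on the header posted in the question, this renames Codon_Change to c_position and flags Transcript_ID as a duplicated column name, since it appears at positions 37 and 57. Whether MuSiC objects to the duplicate is not something this thread establishes.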


I took your advice and ran the individual modules separately. After some minor hiccups I was finally able to run bmr calc-covg, bmr calc-bmr and smg, and I now have a list of SMGs for my samples. Is there a way to correlate the SMGs with the sample data? For example, this is a gene record from the smg file:

#Gene     Indels     SNVs     Tot Muts     Covd Bps     Muts pMbp     P-value FCPT     P-value LRT     P-value CT     FDR FCPT     FDR LRT     FDR CT
ERCC5     0          21       21           2058         10204.08      1.04E-11         2.94E-11        9.24E-15       1.53E-08     3.75E-08    1.18E-11

Is there a way to know which samples had variants in ERCC5? Of course I can look into my MAF file and figure that out, but I was wondering whether there is some other way, such as a summary table?

Thanks a lot for your help.


The MuSiC suite includes a tool called mutation-relation that looks for significant concurrence or mutual exclusivity of mutated genes. This tool takes a MAF as input, and has an option to save a mutation-matrix-file, a sample-vs-gene matrix. More info down here.
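If you only need a quick gene-to-samples lookup straight from the MAF, a few lines of Python will do. This is a rough sketch, not what mutation-relation does internally. It relies only on the standard MAF columns Hugo_Symbol and Tumor_Sample_Barcode, and the function name `samples_by_gene` is made up for illustration.

```python
import csv
from collections import defaultdict


def samples_by_gene(maf_lines):
    """Map each Hugo_Symbol to the sorted list of tumor samples mutated in it.

    `maf_lines` is any iterable of text lines from a tab-delimited MAF,
    for example an open file handle. Lines starting with '#' are skipped.
    """
    rows = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    gene_to_samples = defaultdict(set)
    for row in reader:
        gene_to_samples[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])
    return {gene: sorted(samples) for gene, samples in gene_to_samples.items()}
```

To use it, open your MAF, pass the file handle to `samples_by_gene`, and look up "ERCC5" in the returned dictionary.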


Hi mkaushal,

Could you please post the commands you used to run the individual modules bmr calc-covg, bmr calc-bmr and smg? How did you prepare the data?

I prepared the ROI file, the MAF and the BAM list to run calc-covg, but I keep getting an error, which I have described in this post: "who can give me the data of genome music bmr calc-covg"


Hi,

Commands I used:

#genome music calculate bmr covg step
genome music bmr calc-covg --bam-list tumor_bam_list.txt --output-dir results --reference-sequence hg19.fa --roi-file hg19_coding_exons_plus10_exclude.bed --cmd-prefix qsub --gene-covg-dir results

#genome music calculate bmr step
genome music bmr calc-bmr --bam-list tumor_bam_list.txt --maf-file varscan_snp_tumor_edited.maf --output-dir results --reference-sequence hg19.fa --roi-file hg19_coding_exons_plus10_exclude.bed --bmr-output results --gene-mr-file results

#smg
genome music smg --gene-mr-file results/gene_mrs --output-file results/smgs_varscan_tumor

Things to remember:

  1. After the final MAF files are created, edit them as follows before running MuSiC:

    1. See https://www.biostars.org/p/81073/
    2. Insert a transcript_name column after Transcript_ID
    3. Rename the column Codon_Change to c_position
    4. Rename the column Amino_Acid_Change to amino_acid_change
    5. Insert a strand column after Amino_Acid_Length

    Note that the column names added or renamed above should all be in lower case

  2. Prepare a bam list with all bam files

    1. Tab delimited file
    2. Format : sample name, normal bam, tumor bam

    Make sure all BAM index files are also present in their respective directories, and that there are no extra blank lines in the BAM list file

  3. The ROI file is one-based

  4. All ref files should be sorted
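Points 2 and 3 above are easy to get wrong, so here is a rough sketch of how they could be checked before submitting jobs. It assumes the BAM list format described above (sample name, normal BAM, tumor BAM, tab-delimited), that an index is named either `x.bam.bai` or `x.bai`, and that "one-based" means 1-based inclusive start and end coordinates. The function names are made up for illustration.

```python
import os


def has_index(bam_path):
    """True if a .bai index sits next to the BAM (x.bam.bai or x.bai)."""
    candidates = [bam_path + ".bai"]
    if bam_path.endswith(".bam"):
        candidates.append(bam_path[:-4] + ".bai")
    return any(os.path.exists(candidate) for candidate in candidates)


def check_bam_list(path):
    """Return a list of problems found in a tab-delimited BAM list file."""
    problems = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if not text.strip():
                problems.append("line %d: blank line" % number)
                continue
            fields = text.split("\t")
            if len(fields) != 3:
                problems.append(
                    "line %d: expected 3 tab-separated fields, found %d"
                    % (number, len(fields))
                )
                continue
            # fields: sample name, normal BAM, tumor BAM
            for bam in fields[1:]:
                if not os.path.exists(bam):
                    problems.append("line %d: missing BAM %s" % (number, bam))
                elif not has_index(bam):
                    problems.append(
                        "line %d: no index (.bai) found for %s" % (number, bam)
                    )
    return problems


def bed_to_one_based(bed_line):
    """Convert a BED line (0-based start, exclusive end) to 1-based inclusive.

    Only the start changes: a 0-based exclusive end equals the 1-based
    inclusive end.
    """
    fields = bed_line.rstrip("\r\n").split("\t")
    fields[1] = str(int(fields[1]) + 1)
    return "\t".join(fields)
```

An empty list from `check_bam_list` means none of these particular problems were found. It does not guarantee that calc-covg will accept the file.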

Hope this helps.


Thanks mkaushal !!

I am currently running "genome music smg" and hope it will run without further problems.

Thanks also for the "things to remember" info.

I also plan to upload a detailed tutorial on it.

Thanks!
