Create mutation catalogue function giving GRanges object errors
4.7 years ago
kittys • 0

Hi, I am trying to analyse indel mutations using the R package YAPSA. My data is in the required input format, with columns chrom, pos, ref, alt and patient_ID. I am trying to run the function create_mutation_catalogue_from_df on this data using the GRCh38 build of the genome. My command is:

indel_catalogue <- create_mutation_catalogue_from_df(indels_df, this_seqnames.field = "chrom", this_start.field = "pos", this_end.field = "pos", this_PID.field = "patient_ID", this_refGenome = BSgenome.Hsapiens.NCBI.GRCh38, this_verbose = 1)

The error I get is: Error in validObject(.Object) : invalid class "GRanges" object : 'seqnames(x)' contains missing values

I also tried adding: this_refGenome_Seqinfo = seq_info_GRCh38 where seq_info_GRCh38 <- SeqinfoForBSGenome(BSgenome.Hsapiens.NCBI.GRCh38)

I tried the translate_to_hg19 function before running the command with no success either.

I also tried this with the BSgenome.Hsapiens.UCSC.hg38 genome, which returns the error: Error in ans[] <- x : replacement has length zero

Any ideas on how to fix this? Any suggestions would be hugely appreciated!! Thank you :)

YAPSA GenomicRanges R indels • 1.1k views
2.5 years ago
Marc • 0

Hi, which version of YAPSA are you using? The current Bioconductor version of YAPSA creates mutational catalogues from INDELs with the function create_indel_mutation_catalogue_from_df. As far as I can see, the sequence context is collected from BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19 and cannot currently be changed; a separate feature request at https://support.bioconductor.org/tag/YAPSA/ could bring attention to this. I generally recommend renaming the columns to "CHROM", "POS", "REF", "ALT" and "PID", because some of the YAPSA functions expect these names. The following example illustrates the creation of a mutational catalogue from INDELs:

> suppressPackageStartupMessages(library("YAPSA"))
> data(sigs_pcawg)
> data(GenomeOfNl_raw) ## only required for example
> 
> INDEL_df <- translate_to_hg19( GenomeOfNl_raw[1:200,], "CHROM" )[,c("CHROM","POS","REF","ALT")]
> INDEL_df$PID <- c("PID1","PID2","PID3","PID4","PID5")
> head(INDEL_df)
   CHROM   POS   REF ALT  PID
32  chr2 12320   AAT   A PID1
53  chr2 13613    AG   A PID2
92  chr2 17012 TAAAG   T PID3
94  chr2 17128     T  TG PID4
95  chr2 17142     G  GA PID5
96  chr2 17151  TCAC   T PID1
> INDEL_mc <-
+   create_indel_mutation_catalogue_from_df(
+     in_dat = INDEL_df,
+     in_signature_df = PCAWG_SP_ID_sigs_df )
[1] "Indel sequence context attribution of total  200  indels. This could take a while..."
[1] "INDEL classification of total  200  INDELs This could take a while..."
> head(INDEL_mc)
           PID1 PID2 PID3 PID4 PID5
DEL_C_1_0     1    4    1    2    2
DEL_C_1_1     2    2    2    2    0
DEL_C_1_2     0    0    0    1    0
DEL_C_1_3     0    0    1    0    0
DEL_C_1_4     0    1    1    1    0
DEL_C_1_5+    0    0    0    1    0
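
For a data frame shaped like the one in the question, the column renaming recommended above could be sketched roughly as follows. This is a minimal base R sketch, not YAPSA code: the contents of indels_df are made up for illustration, and the check against chr1..chr22, chrX, chrY is only my assumption of what a UCSC-style hg19 input should look like. A mismatch in chromosome naming is one possible cause of the GRanges error, but I have not verified that it is the cause here.

```r
## Hypothetical stand-in for the data frame from the question
indels_df <- data.frame(
  chrom      = c("1", "2", "X", "chr7"),
  pos        = c(12320L, 13613L, 17012L, 17128L),
  ref        = c("AAT", "AG", "TAAAG", "T"),
  alt        = c("A", "A", "T", "TG"),
  patient_ID = c("PID1", "PID2", "PID1", "PID2"),
  stringsAsFactors = FALSE
)

## Map the original column names onto the names recommended above
new_names <- c(chrom = "CHROM", pos = "POS", ref = "REF",
               alt = "ALT", patient_ID = "PID")
INDEL_df <- indels_df[, names(new_names)]
colnames(INDEL_df) <- unname(new_names)

## Use UCSC-style chromosome names; add the "chr" prefix only where missing
has_prefix <- grepl("^chr", INDEL_df$CHROM)
INDEL_df$CHROM[!has_prefix] <- paste0("chr", INDEL_df$CHROM[!has_prefix])

## Warn about chromosome names outside the assumed standard set
expected_chroms <- paste0("chr", c(1:22, "X", "Y"))
unexpected <- setdiff(unique(INDEL_df$CHROM), expected_chroms)
if (length(unexpected) > 0) {
  warning("Chromosome names not in chr1..chr22, chrX, chrY: ",
          paste(unexpected, collapse = ", "))
}
```

Note that this only changes names. It does not convert GRCh38 coordinates to hg19, so positions called against GRCh38 would still need a separate coordinate conversion before the hg19 sequence context is meaningful.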

I encourage you to post any further errors, questions or requests at https://support.bioconductor.org/tag/YAPSA/.
