KaryotypeR for germline mutations
1
0
Entering edit mode
5.4 years ago
ma23 ▴ 40

Hi everyone, I am trying to use the R package KaryotypeR to produce rainfall plots in R for germline mutations. For this aim I use the code I found here: VCF input for KaryotypeR But I get an error:

Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : 
  solving row 1: range cannot be determined from the supplied arguments (too many NAs)
In addition: Warning message:
In toGRanges(tmp.vcf.data) : NAs introduced by coercion

What can be done to fix this ?

Thanks for your help.

R • 1.5k views
ADD COMMENT
1
Entering edit mode

Great, this works much better. Thank you for your help!

ADD REPLY
0
Entering edit mode

Hi @ma23

Can you show us a few lines of the VCF? and the exact code you are using?

ADD REPLY
0
Entering edit mode

Here is the exact code:

>library(karyoploteR) 

>tmp.vcf<-readLines("file.vcf")

>tmp.vcf.data<-read.table("file.vcf")

>tmp.vcf<-tmp.vcf[-(grep("#CHROM",tmp.vcf)+1):-(length(tmp.vcf))]

>vcf.names<-unlist(strsplit(tmp.vcf[length(tmp.vcf)],"\t"))

>names(tmp.vcf.data)<-vcf.names

>colnames(tmp.vcf.data)[1] <- "Chr"

>colnames(tmp.vcf.data)[2] <- "Start"

>tmp.vcf.data = cbind(tmp.vcf.data,End=rep(tmp.vcf.data$Start+1))

>tmp.vcf.data <- tmp.vcf.data[, c(1,2, 3:11)]

>toGRanges(tmp.vcf.data)

And here are three first lines:

CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  normal

chr10   75185   .       A       AT      70      PASS    AC=2;AF=1.00;AN=2;DP=5;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=55.11;QD=17.50;SOR=1.609 GT:AD:DP:GQ:PL  1/1:0,4:4:12:107,12,0

chr10   209747  .       TTCTTTA T       457.73  PASS    AC=2;AF=1.00;AN=2;DP=11;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=25.36;SOR=4.977        GT:AD:DP:GQ:PL  1/1:0,11:11:33:495,33,0

chr10   225362  .       G       GTGTT   98.25   PASS    AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=32.75;SOR=2.833 GT:AD:DP:GQ:PL  1/1:0,3:3:9:135,9,0
ADD REPLY
0
Entering edit mode

Please highlight code with the formatting bar:

enter image description here

ADD REPLY
1
Entering edit mode
5.4 years ago
bernatgel ★ 3.4k

Hi @ma23,

You have a few problems with your code, starting by the fact that you are reading the data twice and grepping for "#CHROM" and the header line starts with "CHROM" without hash, so it won't detect it.

Basically, if you have a headerless vcf and only want to create a GRanges to plot the variants in a rainfall plot you can use this much shorter and simpler code (note that we are repeating column 2 to have "start" and "end" as required by toGRanges)

> vcf.data <- read.table("file.vcf", header=TRUE)
> vars <- toGRanges(vcf.data[,c(1,2,2,4,5)])
> vars

GRanges object with 3 ranges and 2 metadata columns:
      seqnames    ranges strand |      REF      ALT
         <Rle> <IRanges>  <Rle> | <factor> <factor>
  [1]    chr10     75185      * |        A       AT
  [2]    chr10    209747      * |  TTCTTTA        T
  [3]    chr10    225362      * |        G    GTGTT
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

Another option would be to read the VCF file without removing the header using the functions in the VariantAnnotation package. If you need more information on that second option I can give you a few hints.

Hope this helps

Bernat

ADD COMMENT

Login before adding your answer.

Traffic: 2727 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6