"Incompatible genome" error when using locateVariants
10.1 years ago
dktarathym ▴ 40

I have a VCF file with the following seqlevels:

> seqlevels(D784G)

[1] "chrI"    "chrII"   "chrIII"  "chrIV"   "chrIX"   "chrM"    "chrV"    "chrVI"   "chrVII"  "chrVIII"
[11] "chrX"    "chrXI"   "chrXII"  "chrXIII" "chrXIV"  "chrXV"   "chrXVI" 

> seqlevels(TxDb.Scerevisiae.UCSC.sacCer3.sgdGene)

[1] "chrI"    "chrII"   "chrIII"  "chrIV"   "chrV"    "chrVI"   "chrVII"  "chrVIII" "chrIX"   "chrX"   
[11] "chrXI"   "chrXII"  "chrXIII" "chrXIV"  "chrXV"   "chrXVI"  "chrM"   

> loc <- locateVariants(D784G, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, AllVariants())

Error in mergeNamedAtomicVectors(genome(x), genome(y), what = c("sequence",  : 
  sequences chrI, chrII, chrIII, chrIV, chrIX, chrM, chrV, chrVI, chrVII, chrVIII, chrX, chrXI, chrXII, chrXIII, chrXIV, chrXV, chrXVI have incompatible genomes:
  - in 'x': TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene, TxDb.Scerevisiae.UCSC.sacCer3.sgdGene
  - in 'y': sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, sacCer3, 

How should I handle this?

Do I need to rearrange the seqlevels of my VCF file? If so, how?

Sincerely,
Deepak

> intersect(seqlevels(D784G), seqlevels(txdb))
 [1] "chrI"    "chrII"   "chrIII"  "chrIV"   "chrIX"   "chrM"    "chrV"    "chrVI"   "chrVII"  "chrVIII"
[11] "chrX"    "chrXI"   "chrXII"  "chrXIII" "chrXIV"  "chrXV"   "chrXVI" 
10.1 years ago

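Have you tried changing the genome with genome(D784G) <- "sacCer3"? The error is about the genome labels rather than the order of the seqlevels: one object is labelled TxDb.Scerevisiae.UCSC.sacCer3.sgdGene and the other sacCer3, so making the labels match might solve the problem.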
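
To make the suggestion concrete, here is a minimal sketch of inspecting and relabelling the genome. It assumes the GenomicRanges package is installed, and it uses a small toy GRanges named variants (a hypothetical stand-in, not the poster's actual D784G object) so that it runs on its own. The same genome() getter and setter should apply to an object read from a VCF file.

```r
# Sketch only: a toy GRanges stands in for the poster's VCF object (D784G).
library(GenomicRanges)  # provides GRanges(); genome() is available once it is attached

# Hypothetical stand-in for the variants; positions are illustrative only.
variants <- GRanges(
  seqnames = c("chrI", "chrIX"),
  ranges   = IRanges(start = c(18425, 1000), width = 1)
)

# Reproduce the mismatched label reported in the error message.
genome(variants) <- "TxDb.Scerevisiae.UCSC.sacCer3.sgdGene"
genome(variants)

# Relabel so it matches the genome reported for the TxDb ("sacCer3").
genome(variants) <- "sacCer3"
genome(variants)
```

Comparing genome(D784G) with genome(txdb) before calling locateVariants is a quick way to see whether the labels differ; the order of the seqlevels is a separate matter.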


Hi Devon,

Thank you so much.

It seems to work, but there are some warnings:

> genome(D784G) <- "sacCer3"
> loc <- locateVariants(D784G, txdb, AllVariants())
Warning messages:
1: In `start<-`(`*tmp*`, value = c(-1665L, -1462L, 480L, 8091L, 10046L,  :
  trimmed start values to be positive
2: In `end<-`(`*tmp*`, value = c(534L, 737L, 2679L, 10290L, 12245L,  :
  trimmed end values to be <= seqlengths
3: In `start<-`(`*tmp*`, value = c(-981575L, -802723L, -802656L, -802572L,  :
  trimmed start values to be positive
4: In `end<-`(`*tmp*`, value = c(1018425L, 1197277L, 1197344L, 1197428L,  :
  trimmed end values to be <= seqlengths
> loc
GRanges with 9269 ranges and 7 metadata columns:
                 seqnames           ranges strand   |   LOCATION   QUERYID      TXID
                    <Rle>        <IRanges>  <Rle>   |   <factor> <integer> <integer>
  chrI:18425_T/C     chrI [ 18425,  18425]      *   | intergenic         1      <NA>
                     chrI [ 25479,  25479]      -   |     coding         2        66
                     chrI [141030, 141030]      +   |     coding         3        38
                     chrI [141030, 141030]      +   |   promoter         3        39
                     chrI [141030, 141030]      -   |     coding         3       103
             ...      ...              ...    ... ...        ...       ...       ...
                   chrXVI [869062, 869062]      -   |   promoter      4301      6644
                   chrXVI [869062, 869062]      -   |     coding      4301      6645
                   chrXVI [890346, 890346]      +   |     coding      4302      6395
                   chrXVI [890346, 890346]      +   |   promoter      4302      6396
                   chrXVI [890346, 890346]      -   |   promoter      4302      6650
                     CDSID      GENEID                   PRECEDEID
                 <integer> <character>             <CharacterList>
  chrI:18425_T/C      <NA>        <NA> YAL001C,YAL002W,YAL003W,...
                        68     YAL063C                            
                        39     YAL004W                            
                      <NA>     YAL003W                            
                       105     YAL005C                            
             ...       ...         ...                         ...
                      <NA>     YPR162C                            
                      6972     YPR163C                            
                      6709     YPR175W                            
                      <NA>     YPR178W                            
                      <NA>     YPR174C                            
                                        FOLLOWID
                                 <CharacterList>
  chrI:18425_T/C YAL064C-A,YAL064W-B,YAL065C,...




             ...                             ...





  ---
  seqlengths:
      chrI   chrII  chrIII   chrIV   chrIX ...  chrXII chrXIII  chrXIV   chrXV  chrXVI
    230218  813184  316620 1531933  439888 ... 1078177  924431  784333 1091291  948066

Are the seqlengths of D784G and txdb the same? I've never actually seen warnings about negative coordinates. Presuming you don't actually have any negative coordinates in your data, I don't know offhand what's causing that.


The seqlengths were the same.


No clue then. I'd have to go through the source code or try it myself. Perhaps post this on the Bioconductor board.
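
One possible reading of the warnings, not confirmed anywhere in this thread: the first values in warnings 3 and 4 match a window of 1,000,000 bp on either side of the first variant (chrI:18425). If locateVariants builds flanking windows of that size when searching around a variant, those windows would run past both ends of chrI and be trimmed. The window size below is an assumption chosen because it fits the numbers. The arithmetic can be checked in base R:

```r
# First variant position shown in the output above (chrI:18425).
pos <- 18425L

# Assumed flank width; chosen because it reproduces the warning values.
flank <- 1000000L

window_start <- pos - flank  # first start value in warning 3
window_end   <- pos + flank  # first end value in warning 4

# chrI length as listed under seqlengths in the output above.
chrI_len <- 230218L

# Trimming clamps the window to the valid range of the chromosome.
trimmed_start <- max(1L, window_start)
trimmed_end   <- min(chrI_len, window_end)

c(window_start, window_end, trimmed_start, trimmed_end)
```

If this reading is right, the warnings only report that search windows were clipped at the chromosome ends, and the variant coordinates themselves are not negative.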

