Error related to vcfstats
3.4 years ago
rheab1230 ▴ 140

Hello everyone,

I am trying to use the vcfstats tool.

I am running the following command, but it shows an error:

vcfstats -h 

The error:

Traceback (most recent call last):
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py", line 16, in <module>
    from . import multiarray
ImportError: cannot import name 'multiarray' from partially initialized module 'numpy.core' (most likely due to a circular import) (/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anaconda3/bin/vcfstats", line 5, in <module>
    from vcfstats import main
  File "/home/anaconda3/lib/python3.8/site-packages/vcfstats/__init__.py", line 9, in <module>
    from cyvcf2 import VCF
  File "/home/anaconda3/lib/python3.8/site-packages/cyvcf2/__init__.py", line 1, in <module>
    from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
  File "cyvcf2/cyvcf2.pyx", line 1, in init cyvcf2.cyvcf2
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py", line 26, in <module>
    raise ImportError(msg)
ImportError:
Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control).  Otherwise reinstall numpy.

Original error was: 
cannot import name 'multiarray' from partially initialized module 'numpy.core' (most likely due to a circular import) (/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py)

I have installed the numpy package:

pip install numpy

The output of the above command:

Requirement already satisfied: numpy in /opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages (1.15.3) 

Can anyone please help me out with this?

Thank you.

numpy snp vcf vcfstats • 3.3k views

In my experience, mixing pip with conda hasn't always played out well. You may want to conda install numpy to make sure everyone knows where everyone else lives.
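
One thing the traceback itself suggests: vcfstats is launched from the Anaconda Python 3.8, but numpy is being loaded from a python3.4 site-packages directory under /opt/ohpc. So the interpreter is probably picking up a directory meant for another Python version, for example through PYTHONPATH or a loaded cluster module (that part is a guess). A minimal sketch to check this; the helper name foreign_site_dirs is made up for illustration:

```python
import os
import re
import sys

# Matches path components such as "python3.4" or "python3.8".
_PY_DIR = re.compile(r"python(\d+)\.(\d+)")


def foreign_site_dirs(paths, running=None):
    """Return the entries of `paths` that name a different Python version.

    `running` is a (major, minor) tuple; it defaults to the current interpreter.
    """
    if running is None:
        running = (sys.version_info.major, sys.version_info.minor)
    flagged = []
    for entry in paths:
        match = _PY_DIR.search(entry)
        if match and (int(match.group(1)), int(match.group(2))) != tuple(running):
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("PYTHONPATH :", os.environ.get("PYTHONPATH", "<not set>"))
    for entry in foreign_site_dirs(sys.path):
        print("suspicious sys.path entry:", entry)
```

Run it with the same Python that runs vcfstats. If a python3.4 directory is flagged, clearing PYTHONPATH in that shell before retrying is worth a try; reinstalling numpy will not help as long as the old directory is found first.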


I did the above, but I am still getting the error.

Traceback (most recent call last):
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py", line 16, in <module>
    from . import multiarray
ImportError: cannot import name 'multiarray' from partially initialized module 'numpy.core' (most likely due to a circular import) (/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kxj190026/anaconda3/bin/vcfstats", line 5, in <module>
    from vcfstats import main
  File "/home/kxj190026/anaconda3/lib/python3.8/site-packages/vcfstats/__init__.py", line 9, in <module>
    from cyvcf2 import VCF
  File "/home/kxj190026/anaconda3/lib/python3.8/site-packages/cyvcf2/__init__.py", line 1, in <module>
    from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
  File "cyvcf2/cyvcf2.pyx", line 1, in init cyvcf2.cyvcf2
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py", line 26, in <module>
    raise ImportError(msg)
ImportError:
Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control).  Otherwise reinstall numpy.

Original error was: cannot import name 'multiarray' from partially initialized module 'numpy.core' (most likely due to a circular import) (/opt/ohpc/pub/libs/gnu8/numpy/1.15.3/lib64/python3.4/site-packages/numpy/core/__init__.py)

Does vcfstats have a GitHub repo? You may want to browse through it and, if necessary, open an issue there.


Yes, it does. I will try to open an issue there.


Hello, I am trying to run vcfstats. The command:

vcfstats --vcf GEUVADIS.chr1.genotype.vcf.gz --outdir new --formula 'COUNT(1) ~ CONTIG' --title 'Number of variants on each chromosome' --config config.toml

The error:

[2021-08-04 09:37:30,895 INFO] 3003378 variants read.
[2021-08-04 09:37:30,896 INFO] [T] Summarizing aggregations ...
[2021-08-04 09:37:30,931 INFO] [T] Composing R code ...
[2021-08-04 09:37:30,956 INFO] [T] Running R code to plot ...
[2021-08-04 09:37:30,956 INFO] [T] Data will be saved to: new/T.txt
[2021-08-04 09:37:30,956 INFO] [T] Plot will be saved to: new/T.col.png

It's creating two new files, T.txt and T.plot.R, but T.col.png is not getting generated. Can you please help me out with this?


I don't see an error message in the output you posted above. Is that all the messages the command printed out?


There is no error. But the final file containing the plot is not getting generated.


There should be an error message. What is the program exit code? You will need to run the program again and immediately after it's done, run echo $?. Do this:

vcfstats --vcf GEUVADIS.chr1.genotype.vcf.gz --outdir new --formula 'COUNT(1) ~ CONTIG' --title 'Number of variants on each chromosome' --config config.toml; echo $?
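
If chaining echo $? is awkward, the same check can be scripted. A minimal sketch using only the standard library; run_and_report is a name I made up, and running the generated script directly with Rscript is my assumption about how to surface the R error, not something I have seen in the vcfstats documentation:

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run `cmd` (a list of arguments) and return (exit_code, stdout, stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command that exits with status 3. Replace it with the real
    # call, e.g. ["Rscript", "new/T.plot.R"] (assumes Rscript is on PATH).
    code, out, err = run_and_report(
        [sys.executable, "-c", "import sys; sys.exit(3)"]
    )
    print("exit code:", code)
    print("stderr:", err)
```

A non-zero exit code together with whatever lands in stderr is usually the missing error message.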

Also, what's the content of T.plot.R?


Content of T.plot.R

require('ggplot2')
set.seed(8525)
figtype = 'col'

plotdata = read.table(paste0('new/T', '.txt'),
                        header = TRUE, row.names = NULL, check.names = FALSE, sep = "       ")
cnames = make.unique(colnames(plotdata))
colnames(plotdata) = cnames

bQuote = function(s) paste0('`', s, '`')

png(paste0('new/T', '.', figtype, '.png'),
    height = 2000, width = 2000, res = 300)
if (length(cnames) > 2) {
    aes_for_geom = aes_string(fill = bQuote(cnames[3]))
    aes_for_geom_color = aes_string(color = bQuote(cnames[3]))
    plotdata[,3] = factor(plotdata[,3], levels = rev(unique(as.character(plotdata[,3]))))
} else {
    aes_for_geom = NULL
    aes_for_geom_color = NULL
}
p = ggplot(plotdata, aes_string(y = bQuote(cnames[1]), x = bQuote(cnames[2])))
xticks = theme(axis.text.x = element_text(angle = 60, hjust = 1))
if (figtype == 'scatter') {
    p = p + geom_point(aes_for_geom_color)
# } else if (figtype == 'line') {
#       p = p + geom_line(aes_for_geom)
} else if (figtype == 'bar') {
    p = ggplot(plotdata, aes_string(x = bQuote(cnames[2])))
    p = p + geom_bar(aes_string(fill = bQuote(cnames[1]))) + xticks
} else if (figtype == 'col') {
    p = p + geom_col(aes_for_geom) + xticks
} else if (figtype == 'pie') {
    library(ggrepel)
    if (length(cnames) > 2) {
        p = p + geom_col(aes_for_geom) + coord_polar("y", start=0) +
            geom_label_repel(
                aes_for_geom,
                y = cumsum(plotdata[,1]) - plotdata[,1]/2,
                label = paste0(unlist(round(plotdata[,1]/sum(plotdata[,1])*100,1)), '%'),
                show.legend = FALSE)
        plotdata[,1] = factor(plotdata[,1], levels = rev(unique(as.character(plotdata[,1]))))
        fills = rev(levels(plotdata[,1]))
        sums  = sapply(fills, function(f) sum(plotdata[,1] == f))
        p = ggplot(plotdata, aes_string(x = bQuote(cnames[2]))) +
                    geom_bar(aes_string(fill = bQuote(cnames[1]))) + coord_polar("y", start=0) +
                    geom_label_repel(
                        inherit.aes = FALSE,
                        data = data.frame(sums, fills),
                        x = 1,
                        y = cumsum(sums) - sums/2,
                        label = paste0(unlist(round(sums/sum(sums)*100,1)), '%'),
                        show.legend = FALSE)
            }
        p = p + theme_minimal() + theme(axis.title.x = element_blank(),
                axis.title.y = element_blank(),
                axis.text.y =element_blank())
} else if (figtype == 'violin') {
    p = p + geom_violin(aes_for_geom) + xticks
} else if (figtype == 'boxplot') {
    p = p + geom_boxplot(aes_for_geom) + xticks
} else if (figtype == 'histogram' || figtype == 'density') {
    plotdata[,2] = as.factor(plotdata[,2])
    p = ggplot(plotdata, aes_string(x = bQuote(cnames[1])))
    params = list(alpha = .6)
    if (cnames[2] != '1') {
        params$mapping = aes_string(fill = bQuote(cnames[2]))
    }
    p = p + do.call(paste0("geom_", figtype), params)
} else if (figtype == 'freqpoly') {
    plotdata[,2] = as.factor(plotdata[,2])
    p = ggplot(plotdata, aes_string(x = bQuote(cnames[1])))
    if (cnames[2] != '1') {
        params$mapping = aes_string(color = bQuote(cnames[2]))
    }
    p = p + do.call(paste0("geom_", figtype), params)
} else {
    stop(paste('Unknown plot type:', figtype))
}
p = p + scale_x_discrete(name ="Chromosome", \
    limits=c("1","2","3","4","5","6","7","8","9","10","X")) + \
    ylab("# Variants")ls
print(p)
dev.op()

This looks like code to create the plot. I do see a bunch of problems with the code (which could be copy-paste issues) - can you upload the file to a GitHub gist instead of copy-pasting it here? Make sure to upload the file as is, not copy-paste it to the gist.


Okay, I will do that.


Thanks. The ylab(...)ls part stands out to me. Are you using a config file or somehow supplying a --ggs value?

Plus, see the block in the gist under figtype=='pie' (lines 34-59, especially line 43) and how it's missing in your comment above, making the p = p+ line above it meaningless? It is probably a copy paste error - please be more careful in the future, as these errors can be insanely hard to detect when they're even a tiny bit less obvious.
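
Another thing in the pasted script: the last lines before print(p) end in a backslash. R has no backslash line continuation (an expression simply continues when a line ends with an operator such as +), so those backslashes would be a syntax error on their own. That might fit a shell-style multi-line value having been passed through to the script, but that is only a guess. A small sketch to spot such lines in a generated script; the helper name is hypothetical:

```python
def trailing_backslash_lines(r_source):
    """Return (line_number, text) pairs for lines that end with a backslash."""
    hits = []
    for number, line in enumerate(r_source.splitlines(), start=1):
        # Ignore trailing whitespace so "\  " at end of line is still caught.
        if line.rstrip().endswith("\\"):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # The three lines in question, copied from the script posted above.
    sample = (
        'p = p + scale_x_discrete(name ="Chromosome", \\\n'
        '    limits=c("1","2","3","4","5","6","7","8","9","10","X")) + \\\n'
        '    ylab("# Variants")ls\n'
    )
    for number, line in trailing_backslash_lines(sample):
        print(number, line)
```

To check the real file, read it with open("new/T.plot.R").read() and pass the text to the function.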


I am sorry for that. I will definitely be careful next time. Yes, I am providing a config file.


Do you mention any ggplot parameters in the config file?

passed = true

[[one]]
formula = 'DEPTHs{0} ~ CHROM'
title = 'Depth distribution on each chromosome'
ggs = 'theme_minimal()'
devpars = {width = 1000, height = 1000, res = 100}

[[one]]
formula = 'AAF ~ CHROM'
title = 'Allele frequency distribution on each chromosome'
ggs = 'theme_bw()'
devpars = {width = 2000, height = 2000, res = 300}

That's the exact content from their example in the manual, so it should work. This is getting to a point where the developer would be the best person to help out, so email them or open an issue on GitHub.


Yes, I have contacted the developer regarding this.


I also have one more question. I am trying to run this command:

 vcfstats --vcf allfiles1.vcf.gz --outdir new/ --formula 'COUNT(1, VARTYPE[snp]) ~ SUBST[A>T,A>G,A>C,T>A,T>G,T>C,G>A,G>T,G>C,C>A,C>T,C>G]' --title 'Number of substitutions of SNPs' --config config.toml

I am getting this error:

Traceback (most recent call last):
  File "/home1/08259/kjoshi/miniconda3/bin/vcfstats", line 8, in <module>
    sys.exit(main())
  File "/home1/08259/kjoshi/miniconda3/lib/python3.8/site-packages/vcfstats/cli.py", line 156, in main
    load_config(opts['config'], opts)
  File "/home1/08259/kjoshi/miniconda3/lib/python3.8/site-packages/vcfstats/cli.py", line 138, in load_config
    def_devpars = default_devpars.copy()
AttributeError: 'Namespace' object has no attribute 'copy'

I downloaded and updated the software to the latest version.
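
For what it's worth, the last line of that traceback is the generic Python error raised when .copy() is called on an object named Namespace that, unlike a dict, has no copy method. Whether that is exactly what happens inside vcfstats is an assumption (I have not read its cli.py), and argparse.Namespace is used here only as a stand-in, since vcfstats may use a different Namespace class. The sketch reproduces the error type and shows one way such a value can be copied; copy_devpars is a hypothetical helper, not part of vcfstats:

```python
from argparse import Namespace


def copy_devpars(devpars):
    """Return an independent dict copy of `devpars` (a dict or a Namespace)."""
    if isinstance(devpars, Namespace):
        # vars() exposes the attributes of a Namespace as a dict.
        return dict(vars(devpars))
    return dict(devpars)


if __name__ == "__main__":
    defaults = Namespace(width=2000, height=2000, res=300)
    try:
        defaults.copy()  # the same kind of call that fails in the traceback
    except AttributeError as exc:
        print("reproduced:", exc)
    print(copy_devpars(defaults))
```

This does not fix the installed tool; it is mainly useful context to include in an issue report to the developer.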
