6.1 years ago
AS
Hi, I am trying to use VCF files downloaded from a public database (NCBI SRA BioProject: PRJNA486011). I am using only R because I don't know any other language. I have put the .vcf.gz and the .vcf.gz.tbi files in the same folder.
> vcf.fl <- list.files("path/to/vcfs", full.names=TRUE, pattern="\\.vcf\\.gz$")
> tabx <- TabixFile("chr4_eva.vcf.gz", index=paste("chr4_eva.vcf.gz", "tbi", sep="."),yieldSize=10000)
> tabx
class: TabixFile
path: chr4_eva.vcf.gz
index: chr4_eva.vcf.gz.tbi
isOpen: FALSE
yieldSize: 10000
> names(tabx)
[1] ".self" ".->path" ".extptr" ".->yieldSize" ".refClassDef" ".->index" ".->.extptr" "index"
[9] "path" "yieldSize"
> vcf.rng <- readVcf(tabx)
Warning message:
In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
Probably one file uses notation like 1,2,3...X,Y and yours uses chr1,chr2... Check that.
I managed to inspect the data in the .vcf.gz, but I haven't found a way to look inside the .tbi. Could you please point me to a way to do it?
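For what it's worth, here is a minimal sketch of one way to see which sequence names a tabix index records, assuming the Bioconductor package Rsamtools is installed. `tabix_seqnames` and `shared_after_stripping_chr` are hypothetical helper names introduced only for illustration; `headerTabix()` is the Rsamtools call doing the work.

```r
# Sketch: list the sequence names recorded in a tabix index.
# Assumes Rsamtools is installed and the .tbi sits next to the .vcf.gz.
tabix_seqnames <- function(vcf_path) {
  tbx <- Rsamtools::TabixFile(vcf_path, index = paste0(vcf_path, ".tbi"))
  # headerTabix() returns a list; the 'seqnames' element holds the names
  Rsamtools::headerTabix(tbx)$seqnames
}

# Hypothetical helper: names shared by two sets once a leading "chr" is removed.
# A non-empty result for sets that otherwise do not overlap points to a
# "chr4" versus "4" style mismatch.
shared_after_stripping_chr <- function(a, b) {
  intersect(sub("^chr", "", a), sub("^chr", "", b))
}

# Example with made-up names
shared_after_stripping_chr(c("chr4", "chr8"), c("4", "8", "X"))
```

On a real file, `tabix_seqnames("chr4_eva.vcf.gz")` would give the names to compare against whatever the other object uses.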
You have to compare your VCF against the VCF you downloaded, not the index. Please show a subset of your VCF and of the downloaded VCFs.
Why don't you just index them yourself? It is very simple:
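The command from the reply is not preserved in this thread. Staying in R, one possible way is `indexTabix()` from Rsamtools; this is a sketch that assumes Rsamtools is installed and that the file is bgzip-compressed (a plain gzip file would probably need to be recompressed first, for example with `Rsamtools::bgzip()`). `index_vcf` is a hypothetical wrapper name.

```r
# Sketch: build a .tbi index for a bgzip-compressed VCF from within R.
# Assumes the Bioconductor package Rsamtools is installed.
index_vcf <- function(vcf_gz) {
  stopifnot(file.exists(vcf_gz))
  # format = "vcf" tells tabix which columns hold sequence name and position;
  # the index is written next to the input as <vcf_gz>.tbi
  Rsamtools::indexTabix(vcf_gz, format = "vcf")
}

# Usage (file name taken from the post above):
# index_vcf("chr4_eva.vcf.gz")
```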
Thanks for the answers.
I downloaded the VCF from: ftp://ftp.sra.ebi.ac.uk/vol1/ERZ696/ERZ696780/chr8_eva.vcf.gz. The size of the downloaded file matches the size indicated on the website.
I tried to look at the file and found it strange that the ID column seems to be full of NA and no samples are detected.
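A quick check in base R can show whether that is simply what the file contains. In the VCF format a missing ID is written as ".", and a file whose `#CHROM` header line stops at the INFO column has no sample columns at all. The sketch below is only an illustration: `vcf_header_summary` is a hypothetical helper, and the small file it is run on is made up, not taken from the download.

```r
# Sketch: summarise the column header and ID field of a (b)gzipped VCF
# using base R only. Reads at most 'max_lines' lines from the top of the file.
vcf_header_summary <- function(path, max_lines = 10000L) {
  con <- gzfile(path, "rt")
  on.exit(close(con))
  lines <- readLines(con, n = max_lines)

  col_line <- grep("^#CHROM", lines, value = TRUE)[1]
  if (is.na(col_line)) stop("no #CHROM line found in the lines read")
  cols <- strsplit(col_line, "\t", fixed = TRUE)[[1]]

  # Columns 1-8 are fixed, column 9 is FORMAT, samples start at column 10
  samples <- if (length(cols) > 9) cols[-(1:9)] else character(0)

  records <- lines[!startsWith(lines, "#")]
  ids <- vapply(strsplit(records, "\t", fixed = TRUE),
                function(f) f[3], character(1))

  list(columns = cols,
       samples = samples,
       n_records = length(records),
       n_missing_id = sum(ids == ".", na.rm = TRUE))
}

# Made-up example file with no sample columns and a missing ID
tmp <- tempfile(fileext = ".vcf.gz")
con <- gzfile(tmp, "wt")
writeLines(c("##fileformat=VCFv4.2",
             "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
             "chr8\t100\t.\tA\tG\t.\t.\t."), con)
close(con)

vcf_header_summary(tmp)
```

If `samples` comes back empty and `n_missing_id` equals `n_records` on the real file, the missing IDs and missing samples are in the file itself rather than a reading problem.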