10.7 years ago
mad.cichlids
I want to extract some information from the INFO column of my VCF file. When I load the file, NS (number of samples with data) is read correctly, but DP (depth) and AN (number of alleles) are shown as "NA". My VCF file clearly shows that neither DP nor AN is empty. Here is one example line from the file. Thanks!
gi|393925858|gb|AGTA02071966.1| 0000000739 . G A 121.20 PASS NS=74:AN=2:DP=8448 GT:DP:GQ:EC:SG 0/1:262:144:116:R
> vcf <- readVcf("z.vcf", "Genome")
> hdr <- exptData(vcf)[["header"]]
> info(hdr)
DataFrame with 3 rows and 3 columns
Number Type Description
<character> <character> <character>
NS 1 Integer Number of Samples With Data
DP 1 Integer Total Depth
AN 1 Integer Number of Alleles in Population
> info(vcf)
DataFrame with 2648 rows and 3 columns
NS DP AN
<integer> <integer> <integer>
gi|393925858|gb|AGTA02071966.1|:0000000739 74 NA NA
gi|393925858|gb|AGTA02071966.1|:0000000781 74 NA NA
gi|393925983|gb|AGTA02071903.1|:0000000957 74 NA NA
gi|393925983|gb|AGTA02071903.1|:0000000960 74 NA NA
gi|393925983|gb|AGTA02071903.1|:0000001007 73 NA NA
... ... ... ...
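One thing worth checking, assuming the example record above is copied verbatim from the file: the VCF specification separates INFO entries with semicolons, while this record uses colons (NS=74:AN=2:DP=8448). The base-R sketch below is not how VariantAnnotation parses the file; `parse_info` is a hypothetical helper and `info_semicolon` is a made-up, semicolon-separated version of the same record, used only to show what a semicolon-based split sees in each case.

```r
# INFO column as posted in the example record (colon-separated).
info_posted <- "NS=74:AN=2:DP=8448"

# Hypothetical semicolon-separated version of the same record, for comparison.
info_semicolon <- "NS=74;AN=2;DP=8448"

# Hypothetical helper (not part of VariantAnnotation): split an INFO string
# on ";" and then split each entry at its first "=".
parse_info <- function(info) {
  entries <- strsplit(info, ";", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", entries)        # text before the first "="
  values <- sub("^[^=]*=", "", entries)   # text after the first "="
  setNames(values, keys)
}

# With colons there is a single entry, keyed NS; DP and AN are not found.
parse_info(info_posted)

# With semicolons all three keys are found.
parse_info(info_semicolon)
```

If the whole file uses colons in the INFO column, a semicolon-based parser would find only the NS key, which would be consistent with DP and AN coming back as NA. Whether that is the actual cause here is an assumption that needs to be confirmed against the file.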
Can you be more specific? How are you reading the files? Bioconductor? Package name? Version?
Thanks. I used
vcf <- readVcf("z.vcf", "Genome")
where z.vcf is my VCF file and "Genome" is the folder containing my indexed reference genome. The package is VariantAnnotation from Bioconductor. Here is how I installed it: According to the Bioconductor archive, this should be version 1.8.13. Please let me know if there is any additional information I can provide; I really appreciate your comment. I am running R on Ubuntu.
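For questions like the one about package name and version, the details can be read from the running R session instead of from the archive page. A minimal sketch, assuming only base R; the check with `requireNamespace` is there so the snippet also runs where VariantAnnotation is not installed.

```r
# Report the R version, platform, and attached packages in one block.
si <- sessionInfo()
print(si)

# Report the installed version of one package, if it is installed.
pkg <- "VariantAnnotation"
if (requireNamespace(pkg, quietly = TRUE)) {
  print(packageVersion(pkg))
} else {
  message(pkg, " is not installed in this library")
}
```

Pasting the printed output into the thread gives the exact versions in use.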