SNP annotation using R package
7.7 years ago
adakoury • 0

I have a VCF file from my GBS data and I am trying to do SNP annotation with an R package. I have read my VCF file using vcf <- readVcf("my vcf file", "Brassica napus genome"),

but then I lost my way with the annotation. I am wondering if anybody has some guiding suggestions.

Thanks

R • 5.9k views

Which annotation do you want to add? Does it have to be R?


I want to know the distribution of SNPs within the genome (coding sequence, UTRs, introns, etc.). Yes, unfortunately it has to be R, because this is the only package I have installed. On the government computer it takes a very long time to get new software installed.


Actually, I am using this package and everything worked fine, but I could not use the "locateVariants" function. The manual I have is about the human genome, but I am working on a plant genome. I have the reference genome and the GFF and GFF3 files on my computer, but I do not know how to use them.


Why don't you say from the beginning which package you are using? How can we guess that?

but I could not use "locateVariants" function

Please be more specific. You are making this too hard. What was the error message? What happened? What didn't happen? We can't see your screen, you know.


Here is what I did and what I got:

> library("VariantAnnotation")
> vcf <- readVcf("all.filtered.recode.vcf", "Brassica_napus.annotation_v5_sorted_modified.gff")
> vcf
class: CollapsedVCF 
dim: 173864 189 
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 3 columns: NS, DP, AF
info(header(vcf)):
      Number Type    Description                
   NS 1      Integer Number of Samples With Data
   DP 1      Integer Total Depth                
   AF .      Float   Allele Frequency           
geno(vcf):
  SimpleList of length 5: GT, AD, DP, GQ, PL
geno(header(vcf)):
      Number Type    Description                                               
   GT 1      String  Genotype                                                  
   AD .      Integer Allelic depths for the reference and alternate alleles ...
   DP 1      Integer Read Depth (only filtered reads used for calling)         
   GQ 1      Float   Genotype Quality                                          
   PL 3      Float   Normalized, Phred-scaled likelihoods for AA,AB,BB genot...

What else do I need to do to eventually be able to run the "locateVariants" function? Again, I have the reference genome, GFF, GFF3 and VCF files on my computer, and remember that I am working on the Brassica napus genome.
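
For what it is worth, the usual route to locateVariants() for a non-model genome is to build a TxDb object from the GFF3 annotation and pass that as the subject. Note that the second argument of readVcf() is only a genome label, not an annotation file. The following is a minimal sketch, not a tested solution for this data set. It assumes that GenomicFeatures is installed next to VariantAnnotation (in recent Bioconductor releases makeTxDbFromGFF() lives in the txdbmaker package instead), that the annotation file is valid GFF3, and that chromosome names agree between the VCF and the GFF. The function name summarise_snp_locations and the label "Bnapus" are made up for illustration.

```r
library(VariantAnnotation)   # readVcf(), locateVariants(), AllVariants()
library(GenomicFeatures)     # makeTxDbFromGFF(); assumed to be installed

# Hypothetical helper: count SNPs per genomic region (coding, intron, UTR, ...).
summarise_snp_locations <- function(vcf_file, gff_file, genome_label = "Bnapus") {
  # Build a transcript database from the GFF3 annotation.
  txdb <- makeTxDbFromGFF(gff_file, format = "gff3")

  # The second argument is just a genome identifier (an arbitrary label here).
  vcf <- readVcf(vcf_file, genome = genome_label)
  snps <- rowRanges(vcf)

  # Keep only sequences present in both objects; mismatched chromosome
  # names are a common reason for locateVariants() to fail or return nothing.
  shared <- intersect(seqlevels(snps), seqlevels(txdb))
  if (length(shared) == 0L) {
    stop("No chromosome names are shared between the VCF and the GFF")
  }
  snps <- keepSeqlevels(snps, shared, pruning.mode = "coarse")

  # Locate every SNP relative to the gene models.
  loc <- locateVariants(snps, txdb, AllVariants())

  # A SNP is reported once per overlapping transcript, so count each
  # SNP/location pair only once before tabulating.
  hits <- unique(data.frame(snp = loc$QUERYID, location = loc$LOCATION))
  table(hits$location)
}
```

With the files from this thread, the call would be summarise_snp_locations("all.filtered.recode.vcf", "Brassica_napus.annotation_v5_sorted_modified.gff"). If the shared-chromosome check stops with an error, compare seqlevels() of the VCF ranges with those of the TxDb and rename one side before trying again.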
