How To Interpret/Display .Vcf Data File?
11.4 years ago
newDNASeqer ▴ 790

I am learning to do variant calling using pipelines such as BWA + Picard + GATK or BWA + Samtools/VarScan. I now have a few .vcf files from those pipelines. What would be the best way to interpret or display the data in these .vcf files? Thanks.

vcf variant calling • 12k views

You're going to have to provide much more information to get any sort of reasonable help. What kinds of samples do you have, and what kinds of questions are you trying to answer?

11.4 years ago
bioinfo ▴ 840
  1. You can use IGV to display VCF files. I normally use Artemis to visualise VCF/SAM/BAM files. There are other choices as well, and which one you pick is up to you, but IGV is the most popular.
  2. You can use vcftools to get lots of statistics from your VCF files, which might help you interpret the SNP data.
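
To give a feel for the kind of summary statistics meant in point 2, here is a minimal sketch in plain Python. It is not vcftools and does not reproduce its output. It only counts records per chromosome and tallies transitions versus transversions for simple single-base substitutions. The function name `summarize_vcf` and the example records are made up for illustration.

```python
from collections import Counter

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def summarize_vcf(lines):
    """Count records per chromosome and tally Ts/Tv for single-base substitutions."""
    per_chrom = Counter()
    ts = tv = 0
    for line in lines:
        # Skip meta-information lines (##), the #CHROM header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, ref, alt = fields[0], fields[3].upper(), fields[4].upper()
        per_chrom[chrom] += 1
        # Only biallelic single-base changes are classified here.
        if len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT":
            if (ref, alt) in TRANSITIONS:
                ts += 1
            else:
                tv += 1
    ratio = ts / tv if tv else None
    return {"per_chrom": dict(per_chrom), "ts": ts, "tv": tv, "ts_tv": ratio}


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tA\t40\tPASS\t.",
    "chr2\t300\t.\tT\tC\t60\tPASS\t.",
    "chr2\t400\t.\tGA\tG\t30\tPASS\t.",
]
summary = summarize_vcf(example)
print(summary)
```

For a real file, the same function can be given an open file handle instead of a list, since it only iterates over lines.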
11.4 years ago
Ming Tommy Tang ★ 4.5k

Do you mean visualizing the VCF file? IGV is a good tool for that purpose.

11.4 years ago

A VCF is a tab-delimited text file. Look at it with anything. Throwing it into Excel works fine.
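
Because the format is plain tab-delimited text, it can also be read with a few lines of code. The sketch below uses Python's standard `csv` module. The helper name `read_vcf_rows` and the one-record demo input are hypothetical. It skips the `##` meta-information lines, takes column names from the `#CHROM` header line, and yields one dictionary per record.

```python
import csv
import io


def read_vcf_rows(handle):
    """Yield one dict per variant record, keyed by the #CHROM header names."""
    header = None
    # QUOTE_NONE: VCF fields are not quoted, so treat quote characters literally.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row or row[0].startswith("##"):
            continue  # blank line or meta-information line
        if row[0].startswith("#"):
            header = [name.lstrip("#") for name in row]
            continue
        if header is None:
            raise ValueError("no #CHROM header line found before the first record")
        yield dict(zip(header, row))


# Hypothetical one-record file, held in memory for the demo.
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=20\n"
)
rows = list(read_vcf_rows(demo))
print(rows[0]["CHROM"], rows[0]["POS"], rows[0]["REF"], rows[0]["ALT"])
```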


That's exactly what I was trying to ask. More specifically, I was trying to figure out how to extract information (SNPs, indels, etc.) from the VCF files. For example, I want a list of mutations or insertions/deletions on each chromosome with position information, ideally in an Excel spreadsheet that is easy to read and understand. Are there any existing programs that can help with this? Thanks again.


Spend some time getting familiar with the command line and tools like less, cut, sort, et al. Some simple Perl scripting would also go a long way towards helping you slice and dice these files. As you'll quickly learn, even if you can open a 3-million-line VCF file in Excel, poking around by hand is not a particularly tractable way to extract information.
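
As a rough illustration of that kind of scripting (written in Python here rather than Perl), the sketch below turns VCF records into a small CSV table of chromosome, position, alleles, and variant type, which a spreadsheet can open. The type labels come from comparing REF and ALT lengths only, which is a simplification. The function names and sample records are hypothetical.

```python
import csv
import io


def classify(ref, alt):
    """Label one REF/ALT pair by comparing allele lengths (a simplification)."""
    # Symbolic alleles, breakends, missing and spanning-deletion alleles are not classified.
    if alt.startswith("<") or "[" in alt or "]" in alt or alt in (".", "*"):
        return "other"
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    return "insertion" if len(alt) > len(ref) else "deletion"


def vcf_to_csv(vcf_lines, out_handle):
    """Write one CSV row per ALT allele: chrom, pos, ref, alt, type."""
    writer = csv.writer(out_handle)
    writer.writerow(["chrom", "pos", "ref", "alt", "type"])
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # header, meta-information, or blank line
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic sites list ALT alleles separated by commas.
        for alt in alts.split(","):
            writer.writerow([chrom, pos, ref, alt, classify(ref, alt)])


# Hypothetical records, for illustration only.
sample = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t250\t.\tAT\tA\t45\tPASS\t.",
    "chr2\t900\t.\tC\tCGG,T\t60\tPASS\t.",
]
buffer = io.StringIO()
vcf_to_csv(sample, buffer)
table = buffer.getvalue()
print(table)
```

To run this on a real file, pass an open VCF file handle as `vcf_lines` and an output file opened with `newline=""` as `out_handle`.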


Oh, Excel. An old issue, but one that a lot of us continue to see.

And, apparently, there's a new blog just on this subject! :D

11.4 years ago
alaincoletta ▴ 170

We use IGV to visualize BAM files, VCFs and clinical annotations together. We also use the GENE-E heatmap visualizer with samples as columns and variants as rows. We use gene symbols as row descriptions, which allows us to collapse by gene. This can be useful if you are looking at many exome samples simultaneously. Some public examples are here: https://insilicodb.org

Hope this helps,
