5.5 years ago
felipead66
I have a VCF file annotated with snpEff. I want to extract information from it, so I am searching for synonymous (and non-synonymous) sites. But when I do
grep -c "SYNONYMOUS" snpEffoutput.vcf
I get 0 results.
Does this mean there is something wrong with my file?
Can you show the command line you used and part of your output file snpEffoutput.vcf?
Best
Try:
grep -c 'synonymous_variant' snpEffoutput.vcf
Yes, this gives me 81949 results. But again, when I do
grep -c 'non_synonymous' snpEffoutput.vcf
I still get 0 results.
You can add
-csvStats
when running snpEff:
java -jar snpEff.jar eff -csvStats snpEffoutput.csv snpEff_database snpEffinput.vcf > snpEffoutput.vcf
There will be a section counting each effect in the CSV.
Thank you very much, that is very helpful. But, still, how do I get the non_synonymous?
Furthermore, I assume that the synonymous_variant are in the coding region?
Hi felipead66,
Please check out the "Effect prediction details" section of the "Input & output files" page in the snpEff documentation. Starting from version 4.0, VCF output uses SO (Sequence Ontology) terms by default, so the classic "NON_SYNONYMOUS" effects such as "NON_SYNONYMOUS_CODING" are now reported as "missense_variant", "initiator_codon_variant", and "stop_retained_variant". If you add
-classic
when running snpEff, you can still count them by
grep -c 'NON_SYNONYMOUS' snpEffoutput.vcf
Hope it helps.
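As a rough sketch of counting with the three SO terms named in the answer above, the following builds a tiny made-up VCF and counts the records that carry any of them. The temporary file, its three records, and the `ANN=` layout are assumptions for illustration only (the layout is what I would expect from snpEff 4.1 or later), so swap in your own snpEffoutput.vcf. Keep in mind that `grep -c` counts matching lines, not individual annotations, so a record with several matching annotations is counted once.

```shell
# Hypothetical three-record VCF, for illustration only.
vcf="$(mktemp)"
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|GENE1\n'
  printf 'chr1\t200\t.\tC\tT\t50\tPASS\tANN=T|synonymous_variant|LOW|GENE1\n'
  printf 'chr1\t300\t.\tG\tA\t50\tPASS\tANN=A|stop_retained_variant|LOW|GENE2\n'
} > "$vcf"

# Drop header lines first so terms mentioned in the header are not counted,
# then count records matching at least one of the three SO terms.
nonsyn=$(grep -v '^#' "$vcf" | grep -cE 'missense_variant|initiator_codon_variant|stop_retained_variant')

# Same idea for synonymous sites.
syn=$(grep -v '^#' "$vcf" | grep -c 'synonymous_variant')

echo "non-synonymous lines: $nonsyn"
echo "synonymous lines: $syn"
```

Filtering out the `#` header lines matters on real files, because the header can itself contain annotation-related text.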
You mean the command line to create the snpEffoutput.vcf file?
Yes, but as @SMK said, you can do
grep -c 'synonymous_variant' snpEffoutput.vcf
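Since the root problem in this thread was guessing the wrong effect name, it may help to list which effect terms are actually present before grepping for one. The sketch below tallies the second `|`-separated subfield of every `ANN=` annotation. It assumes the `ANN` INFO layout (older `EFF=` output would need a different split), and the sample file and its records are invented for illustration.

```shell
# Hypothetical sample; replace "$vcf" with your own snpEffoutput.vcf.
vcf="$(mktemp)"
{
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\tDP=10;ANN=G|missense_variant|MODERATE|GENE1,G|upstream_gene_variant|MODIFIER|GENE2\n'
  printf 'chr1\t200\t.\tC\tT\t50\tPASS\tANN=T|synonymous_variant|LOW|GENE1\n'
  printf 'chr1\t300\t.\tG\tA\t50\tPASS\tANN=A|synonymous_variant|LOW|GENE2\n'
} > "$vcf"

tally=$(awk -F'\t' '
  /^#/ { next }                                  # skip header lines
  {
    n = split($8, info, ";")                     # INFO is column 8
    for (i = 1; i <= n; i++) {
      if (info[i] !~ /^ANN=/) continue
      m = split(substr(info[i], 5), anns, ",")   # one entry per annotation
      for (j = 1; j <= m; j++) {
        split(anns[j], f, "[|]")                 # subfield 2 holds the effect
        count[f[2]]++
      }
    }
  }
  END { for (e in count) print count[e], e }
' "$vcf" | sort -k1,1nr -k2,2)

echo "$tally"
```

Unlike `grep -c`, this counts annotations rather than lines. When one annotation carries several effects joined with `&`, the sketch counts the combined label as its own entry.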