4.4 years ago
kumari.indu31
We are trying to remove PE-PPE proteins from a filtered .vcf file using the following commands:
intersectBed -a R883_Filter.vcf -b pe_ppe.bed -header > output.vcf
vcftools --vcf R883.vcf --exclude-positions-overlap pe_ppe_pos.txt --recode --recode-INFO-all --out R883_pos.vcf
bedtools intersect -u -a R883.vcf -b pe_ppe.bed > R883_no_PE.vcf
The output file looks fine. However, we then run ANNOVAR to obtain the file for further analysis, using the command:
table_annovar.pl R883.vcf -buildver MTB -out R883_anno -remove -protocol refGene -operation g -nastring . --vcfinput
In the final annotated file, we are observing PE and PPE proteins. Can someone give a solution to this problem?
vcftools is deprecated. How about getting the complement of pe_ppe.bed with bedtools complement and then just use:
It is not very clear what the content of pe_ppe.bed is. In case you want to exclude variants that overlap regions in the pe_ppe.bed file, you can simply use intersectBed with the -v flag.
We have tried this command as well: bcftools view --targets-file not-pe-ppe.bed R883_Filter.vcf > R883_no_PE.vcf
However, when we annotate the file using ANNOVAR, we still find PE-PPE genes in the final plot.
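For reference, the exclusion discussed in this thread can be sketched in plain shell. The awk function below is only a rough stand-in for what `bedtools intersect -v -header` does; it assumes a tab-separated, 0-based, half-open BED file and sizes each variant by its REF allele. The helper name `drop_bed_overlaps` and the toy files are hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the VCF header plus every record that does NOT
# overlap an interval in the BED file.
# Usage: drop_bed_overlaps regions.bed input.vcf > filtered.vcf
drop_bed_overlaps() {
    awk -F'\t' '
        # First file (BED): remember every interval per chromosome.
        NR == FNR { n[$1]++; s[$1, n[$1]] = $2 + 0; e[$1, n[$1]] = $3 + 0; next }
        # VCF header lines are always kept.
        /^#/ { print; next }
        {
            vs = $2 - 1             # VCF POS is 1-based, BED is 0-based
            ve = vs + length($4)    # span covered by the REF allele
            for (i = 1; i <= n[$1]; i++)
                if (s[$1, i] < ve && e[$1, i] > vs) next   # overlap: drop record
            print
        }
    ' "$1" "$2"
}

# Toy data (made up for illustration): one region and three variants.
tmp=$(mktemp -d)
printf 'chr1\t100\t200\n' > "$tmp/toy.bed"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$tmp/toy.vcf"
printf 'chr1\t50\t.\tA\tG\t.\tPASS\t.\n' >> "$tmp/toy.vcf"
printf 'chr1\t150\t.\tC\tT\t.\tPASS\t.\n' >> "$tmp/toy.vcf"
printf 'chr1\t201\t.\tG\tA\t.\tPASS\t.\n' >> "$tmp/toy.vcf"

drop_bed_overlaps "$tmp/toy.bed" "$tmp/toy.vcf" > "$tmp/toy_filtered.vcf"
kept=$(grep -vc '^#' "$tmp/toy_filtered.vcf")
echo "records kept: $kept"
```

With the real files, the same idea would be `bedtools intersect -v -header -a R883_Filter.vcf -b pe_ppe.bed > R883_no_PE.vcf`. One thing worth checking afterwards is that table_annovar.pl is given the filtered file (for example R883_no_PE.vcf) as input, since the command quoted in the question annotates R883.vcf.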
Please show the entries from ANNOVAR. They could be referring to upstream and downstream of the gene. Removing variants from a VCF is as simple as bcftools view --exclude
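To illustrate the upstream/downstream point: if ANNOVAR labels variants that lie near a gene with that gene's name, a BED file holding only the gene bodies will leave those variants in place. A minimal sketch, assuming a flank of 1000 bp (an assumption; check the neighborhood distance your ANNOVAR run actually uses) and a hypothetical helper `pad_bed`, widens each interval before it is used for filtering:

```shell
#!/bin/sh
# Hypothetical helper: widen every BED interval by FLANK bp on both sides.
# Usage: pad_bed regions.bed 1000 > regions_padded.bed
pad_bed() {
    awk -F'\t' -v OFS='\t' -v flank="$2" '
        {
            $2 = $2 - flank
            if ($2 < 0) $2 = 0     # BED starts cannot be negative
            $3 = $3 + flank        # note: not clipped to the chromosome length
            print
        }
    ' "$1"
}

# Toy intervals (gene names are made up for illustration).
tmp=$(mktemp -d)
printf 'chr1\t500\t2000\tPE1\nchr1\t5000\t6000\tPPE2\n' > "$tmp/toy.bed"
pad_bed "$tmp/toy.bed" 1000 > "$tmp/toy_padded.bed"
cat "$tmp/toy_padded.bed"
```

The padded file can then replace pe_ppe.bed in the exclusion step. `bedtools slop` with a genome file is an alternative that also clips intervals at the chromosome ends.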
We did remove the proteins using bcftools view --exclude; however, after running ANNOVAR, we are facing the same problem.
Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to the original question.
You did not quite answer my question: