17 months ago
Can Abdullah
I have VCF files and I am using the bash script below to convert them to ANNOVAR's avinput format so that annotation runs faster. The strange thing is that once the conversion is complete, the number of variants has increased: the avinput file contains more variants than the original VCF file. Why might this happen, and what should I do to make the variant counts match?
The code:
# Paths are relative, so the script must be run from the directory that contains Desktop/
ANNOVAR_PATH="Desktop/THESIS_PROJECT/annovar/"
VCF_FOLDER="Desktop/THESIS_PROJECT/PIPELINE/Picard_Outputs/CAGI5/"
OUTPUT_FOLDER="Desktop/THESIS_PROJECT/PIPELINE/Avinputs/LIFTED/CAGI5/"
# Convert every VCF in the input folder to an avinput file with the same base name
for vcf_file in "$VCF_FOLDER"/*.vcf; do
    base_name=$(basename "$vcf_file" .vcf)
    "$ANNOVAR_PATH"/convert2annovar.pl -format vcf4 "$vcf_file" > "$OUTPUT_FOLDER"/"$base_name".avinput
done
How do you count the number of variants in each file?
I simply open the vcf file and avinput file to see.
"I open the file" doesn't mean anything. Please explain.
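To make the comparison reproducible, it may help to count lines on the command line instead of inspecting the files by eye. The following is only a rough sketch: the function names are hypothetical helpers, not part of ANNOVAR, and it assumes a plain-text, tab-delimited VCF with the ALT alleles in column 5. One possible explanation to check (an assumption, not something confirmed from your files) is that some VCF records list several comma-separated ALT alleles and end up as more than one line after conversion; counting ALT alleles separately from records would show whether that accounts for the difference.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions for comparing variant counts.
# Assumption: the VCF is uncompressed, tab-delimited, and ALT is column 5.

# Number of variant records in a VCF (non-empty lines that are not header lines).
count_vcf_records() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Total number of ALT alleles in a VCF: a record with ALT "T,G" counts as 2.
count_vcf_alt_alleles() {
    awk -F'\t' '!/^#/ && NF >= 5 { n += split($5, alts, ",") } END { print n + 0 }' "$1"
}

# Number of non-empty lines in an avinput file (one variant per line).
count_avinput_lines() {
    awk 'NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

For example, after sourcing the functions, `count_vcf_records sample.vcf`, `count_vcf_alt_alleles sample.vcf`, and `count_avinput_lines sample.avinput` (the file names are placeholders) print three numbers. If the avinput count is closer to the ALT allele count than to the record count, multi-allelic records are a likely cause; if not, the extra lines come from somewhere else and posting the three numbers would help narrow it down.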