bash loop to count variants using vcftools
9.0 years ago
bioguy24 ▴ 230

I can run the vcftools command below to count the individual variant types in a vcf.gz file:

zcat /home/cmccabe/Desktop/vcf/file1.vcf.gz | vcf-annotate --fill-type | grep -oP "TYPE=\w+" | sort | uniq -c > /home/cmccabe/Desktop/vcf/file1_variant_counts.bed

However, when I try it in a bash loop, the command does not run at all and I cannot figure out why. I think it looks right? Thank you :).

for f in /home/cmccabe/Desktop/vcf/*.vcf.gz ; do
     bname=`basename $f`
     pref=${bname%%.vcf.gz
     zcat /home/cmccabe/Desktop/vcf/$f | vcf-annotate --fill-type | grep -oP "TYPE=\w+" | sort | uniq -c > /home/cmccabe/Desktop/vcf/${pref}_variant_counts.bed
done

I think I see it now, I missed a closing brace in the pref=.... Thank you :).

vcftools bash • 3.1k views

Exactly. The closing brace was the problem. Nice use of %% BTW :)
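
For anyone unsure what %% does here, a minimal sketch of suffix removal with parameter expansion. The path is only an example built from the file name in the question. With a literal pattern such as .vcf.gz, % and %% give the same result; they differ only when the pattern contains a wildcard.

```shell
# Example path taken from the question; only the file name matters here.
f=/home/cmccabe/Desktop/vcf/file1.vcf.gz

bname=$(basename "$f")        # strips the directory part
pref=${bname%%.vcf.gz}        # removes the literal suffix .vcf.gz
echo "$pref"

# With a wildcard in the pattern, % and %% behave differently:
short=${bname%.*}             # shortest match: removes only ".gz"
long=${bname%%.*}             # longest match: removes ".vcf.gz"
echo "$short $long"
```

Note that the expansion needs its closing brace; without it bash cannot parse the loop body, which is why nothing ran.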


In your loop, f already holds the full path of each file (the glob /home/cmccabe/Desktop/vcf/*.vcf.gz expands to complete paths), so your zcat command tries to open

"/home/cmccabe/Desktop/vcf//home/cmccabe/Desktop/vcf/file1.vcf.gz"

It should be zcat "$f" (quoted, in case a path contains spaces).
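
Putting both fixes together (the closing brace and passing $f to zcat directly), the loop could look roughly like the sketch below. Wrapping it in a function named count_variant_types that takes the directory as an argument is my own choice for illustration, not something from the original post; the pipeline itself is unchanged and assumes vcf-annotate is on the PATH and that grep supports -P.

```shell
# Hypothetical wrapper function; the directory is passed as the first argument.
count_variant_types() {
    dir=$1
    for f in "$dir"/*.vcf.gz; do
        # Skip the unexpanded pattern when no files match.
        [ -e "$f" ] || continue
        bname=$(basename "$f")
        pref=${bname%%.vcf.gz}      # note the closing brace
        # $f is already a full path, so it is passed to zcat as-is.
        zcat "$f" | vcf-annotate --fill-type | grep -oP "TYPE=\w+" | sort | uniq -c \
            > "$dir/${pref}_variant_counts.bed"
    done
}
```

It would be called as count_variant_types /home/cmccabe/Desktop/vcf and should write one file1_variant_counts.bed style file per input.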

9.0 years ago
venu 7.1k
for f in /home/cmccabe/Desktop/vcf/*.vcf.gz
do
    zcat "$f" | vcf-annotate --fill-type | grep -oP "TYPE=\w+" | sort | uniq -c > "$f"_variant_counts.bed
done

Then rename all the output files with the rename command (this syntax is for the Perl version of rename).

Ex:

rename 's/\.vcf\.gz_variant_counts\.bed$/.bed/' *.bed
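
If the Perl rename is not installed (some distributions ship a different rename with another syntax), the same renaming can be sketched with plain mv and parameter expansion. The function name rename_counts and its directory argument are hypothetical, used only for this example.

```shell
# Hypothetical helper: renames X.vcf.gz_variant_counts.bed to X.bed in a directory.
rename_counts() {
    dir=$1
    for old in "$dir"/*.vcf.gz_variant_counts.bed; do
        # Skip the unexpanded pattern when no files match.
        [ -e "$old" ] || continue
        new=${old%.vcf.gz_variant_counts.bed}.bed
        mv -- "$old" "$new"
    done
}
```

Calling rename_counts /home/cmccabe/Desktop/vcf should give the same file names as the rename example.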

Thank you all :).

9.0 years ago

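Try GNU Parallel for this kind of task:

parallel --jobs <int> "zcat {} |  vcf-annotate --fill-type | grep -oP \"TYPE=\\w+\" | sort | uniq -c > {.}_variant_counts.bed" ::: /home/cmccabe/Desktop/vcf/*.vcf.gz

Note that {.} removes only the last extension (.gz), so the output files are named like file1.vcf_variant_counts.bed.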