Automate extraction of a VCF column value
2.4 years ago
jomagrax ▴ 40

I need to automate a command-line tool (pvacseq) so that, for each file, the 12th column of the row containing "CHROM" is passed as a parameter. This is what I have tried:

for i in *.vcf  ; do  pvacseq run \ 
$i \ 
awk'/CHROM/{print $12}' ${i}\ 
HLA-A*02:01,HLA-B*35:01,DRB1*11:01 \
MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \ 
"${i%%_vep*}_pvac-result"   ; done

The problem seems to be that pvacseq treats each space-separated word of the awk command as a new argument.

So I guess I need either a way to extract that column in a single command without spaces, or a way to make the program treat the awk command as a single unit even though it contains spaces.

As for reproducing the problem, I don't know how to approach it, since installing pvacseq can be complicated.

vcf bash • 766 views
2.4 years ago
jomagrax ▴ 40

I found the solution: store the awk output in a variable with command substitution, then pass that variable to pvacseq.

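for i in *.vcf
do
    # Capture the 12th column of the line containing "CHROM"
    x=$(awk '/CHROM/{print $12}' "$i")
    pvacseq run "$i" "$x" 'HLA-A*02:01,HLA-B*35:01,DRB1*11:01' ...
    #                ^^^^ value captured by the command substitution
done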
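
For anyone who wants to check the extraction step without installing pvacseq, here is a minimal dry-run sketch. The demo file and its sample names (NORMAL, OTHER, TUMOR) are invented for illustration, and anchoring the pattern as /^#CHROM/ with a tab separator is my own tweak to the /CHROM/ pattern above, assuming a standard tab-delimited VCF header.

```shell
# Build a tiny stand-in VCF header in a temporary directory (invented data).
tmpdir=$(mktemp -d)
vcf="$tmpdir/demo_vep.vcf"
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tOTHER\tTUMOR\n' >> "$vcf"

# Split on tabs only, match the column-header line, print field 12, then stop.
sample=$(awk -F'\t' '/^#CHROM/ {print $12; exit}' "$vcf")

# Dry run: print the start of the command instead of executing pvacseq.
echo "pvacseq run $vcf $sample 'HLA-A*02:01,HLA-B*35:01,DRB1*11:01' ..."

rm -r "$tmpdir"
```

Swapping the echo for the real pvacseq call, with "$vcf" and "$sample" double-quoted, should give the same behaviour as the loop above.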
2.4 years ago

I'm not sure I understand your problem, but instead of awk'/CHROM/{print $12}' ${i}\

maybe you just want:

$( bcftools query -l "${i}" | awk '(NR==3)')

using a command substitution https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html
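
To illustrate why the original attempt failed and what command substitution changes, here is a small sketch. count_args is a hypothetical helper standing in for pvacseq; it only reports how many arguments it received. The value 'sample one' is an invented example of output containing a space.

```shell
# Hypothetical stand-in for pvacseq: print the number of arguments received.
count_args() { echo "$#"; }

# Without $(...), the awk words are passed as literal arguments; awk never runs.
literal=$(count_args awk '/CHROM/{print $12}' file.vcf)

# Unquoted substitution: the output is split on whitespace into separate words.
unquoted=$(count_args $(printf 'sample one'))

# Quoted substitution: the whole output arrives as a single argument.
quoted=$(count_args "$(printf 'sample one')")

echo "literal=$literal unquoted=$unquoted quoted=$quoted"
```

Quoting the substitution is the safer default when the value is meant to be one argument.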
