How can I compare two columns of two VCF files
8.9 years ago
kinsioberoi ▴ 10

How can I compare two columns of a VCF file (i.e. chromosome and position) and return the entire row if both the chromosome and the position match in two different VCF files? The following awk command compares one column; what changes can I make so that it compares two columns together?

awk 'FNR==NR{a[$1]=$0;next}{if(b=a[$1]){print b}}' fileA fileB > out
next-gen SNP • 3.9k views

Can you please post an example of the input and desired output? Thanks.


So, for example, if files A and B are as follows, the code will match the 1st column (#CHROM) and the 2nd column (POS) and return the matching rows from A.

File A

#CHROM  POS     ID            REF  ALT
chr1    64      rs3883910     C    T
chr2    146     .             G    A
chr2    146     rs72619361    T    C

File B

#CHROM    POS     ID           REF  ALT
chr1      64      rs3883910    C    T
chr1      146     rs982818     C    T
chr2      146     rs72619361   T    C

OUTPUT

#CHROM    POS      ID            REF  ALT
chr1      64       rs3883910     C    T
chr2      146      rs72619361    T    C

return the entire row if both the position and chromosome number match in two different VCF files.

Return from which file? FileA or FileB?

Maybe this vcftools solution will work for you.

8.9 years ago
Vivek ★ 2.7k

Looks like you want to intersect two VCF files. There is already software to do this, and it's much cleaner to use it than to write a messy one-liner. Check out bedtools intersectBed.

http://bedtools.readthedocs.org/en/latest/content/tools/intersect.html
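
That said, for small files like the example posted earlier in this thread, the awk one-liner from the question can probably be extended to a two-column key. The following is only a sketch under a few assumptions: the files are whitespace-delimited as shown in the example, the rows to return come from file A, and the output names `out_pos` and `out_allele` are made up for illustration. It reads file B first to collect the CHROM+POS pairs, then prints every row of file A whose pair was seen.

```shell
# Sample inputs copied from the example earlier in this thread.
cat > fileA <<'EOF'
#CHROM  POS     ID            REF  ALT
chr1    64      rs3883910     C    T
chr2    146     .             G    A
chr2    146     rs72619361    T    C
EOF

cat > fileB <<'EOF'
#CHROM    POS     ID           REF  ALT
chr1      64      rs3883910    C    T
chr1      146     rs982818     C    T
chr2      146     rs72619361   T    C
EOF

# Pass 1 (fileB): remember every CHROM+POS pair as a composite array key.
# Pass 2 (fileA): print the whole row when its CHROM+POS pair was seen in fileB.
awk 'FNR==NR { seen[$1,$2]; next } ($1,$2) in seen' fileB fileA > out_pos

# Stricter variant (an assumption about what is wanted): also require
# REF ($4) and ALT ($5) to match, not just CHROM and POS.
awk 'FNR==NR { seen[$1,$2,$4,$5]; next } ($1,$2,$4,$5) in seen' fileB fileA > out_allele
```

Note that with the CHROM+POS key alone, both `chr2 146` rows of file A are returned, because file B has a record at that position. The OUTPUT shown in the example keeps only the `rs72619361` row, which is what the stricter variant produces. The header line is printed in both cases because `#CHROM POS` is present in both files. On real VCFs with spaces inside fields, adding `-F'\t'` to split on tabs only is likely safer.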

8.9 years ago

You can also use VCFtools (available here) to compare VCFs.
