pangenome - Create a Venn diagram
19 months ago
BATMAN • 0

Hello, I would like to know if you can help me. I want to make a Venn diagram from the gene presence/absence table (.Rtab) produced by Roary (example fragment below; the real list has about 8000 genes):

Gene    StrainA StrainB StrainC
group_633   1   0   1
group_644   1   0   1
group_669   1   0   1
ybeZ    1   1   1
maeB    1   1   1
smc_4   1   0   1
cas4-cas1   1   1   1
group_813   0   1   1
group_844   1   1   1
group_854   1   0   0
group_45    0   0   0
group_124   0   0   1
group_323   0   1   0

How can I delete the rows where the following pattern appears?

name 0 0 0

For Venn diagram:

Core genes:

name 1 1 1

Accessory a+b:

name 1 1 0

Accessory b+c:

name 0 1 1

Accessory a+c:

name 1 0 1

unique a:

name 1 0 0

unique b:

name 0 1 0

unique c:

name 0 0 1

Thanks

19 months ago
iraun 6.2k

As for your first question, how to delete rows in which every value is zero: assuming you have three value columns as in your example, you can use the following awk command:

awk -F'\t' '{z = 0; for (i = 2; i <= NF; i++) if ($i == 0) z++} z < 3' input.tsv > output.tsv

If you have more than three value columns, replace the 3 in z < 3 with the number of value columns (4, 5, etc.).
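
To avoid hard-coding the number of strains, one option is to sum the value columns instead of counting zeros. This is only a minimal sketch and not part of the command above: sample.tsv and filtered.tsv are hypothetical file names, and it assumes a tab-separated file with a single header line and 0/1 values from the second column onward.

```shell
# Hypothetical sample file (tab-separated), modelled on the example in the question.
printf '%s\t%s\t%s\t%s\n' \
  Gene StrainA StrainB StrainC \
  ybeZ 1 1 1 \
  group_854 1 0 0 \
  group_45 0 0 0 \
  group_813 0 1 1 > sample.tsv

# Print the header unchanged, then keep only rows whose value columns
# (field 2 to the last field) sum to more than zero.
awk -F'\t' 'NR == 1 {print; next}
            {s = 0; for (i = 2; i <= NF; i++) s += $i}
            s > 0' sample.tsv > filtered.tsv
```

Here filtered.tsv keeps the header and every gene present in at least one strain, so group_45 is dropped regardless of how many strain columns the table has.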

As for creating the Venn diagram, I would write the genes present in each strain to a separate file:

awk -F'\t' '$2 == 1 {print $1}' input.tsv > StrainA.tsv
awk -F'\t' '$3 == 1 {print $1}' input.tsv > StrainB.tsv
awk -F'\t' '$4 == 1 {print $1}' input.tsv > StrainC.tsv

Then load those three gene lists into the Venn diagram tool of your choice, for example Venny.
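
If you only need the number of genes in each region of the diagram, the presence pattern itself can serve as the region label (111 = core, 110 = a+b, 011 = b+c, 101 = a+c, 100/010/001 = unique). The following is a rough sketch under the same assumptions as above (tab-separated file, one header line, exactly three strain columns); sample.tsv and regions.tsv are hypothetical file names.

```shell
# Hypothetical sample file (tab-separated) holding the 13 example rows from the question.
printf '%s\t%s\t%s\t%s\n' \
  Gene StrainA StrainB StrainC \
  group_633 1 0 1 \
  group_644 1 0 1 \
  group_669 1 0 1 \
  ybeZ 1 1 1 \
  maeB 1 1 1 \
  smc_4 1 0 1 \
  cas4-cas1 1 1 1 \
  group_813 0 1 1 \
  group_844 1 1 1 \
  group_854 1 0 0 \
  group_45 0 0 0 \
  group_124 0 0 1 \
  group_323 0 1 0 > sample.tsv

# Skip the header, join the three 0/1 fields into a key such as "101",
# ignore the all-zero pattern, and count how many genes share each key.
awk -F'\t' 'NR > 1 {
              key = $2 $3 $4
              if (key != "000") n[key]++
            }
            END {for (k in n) print k "\t" n[k]}' sample.tsv | sort > regions.tsv

cat regions.tsv
```

For the example rows this reports 4 core genes (111), 4 genes shared only by StrainA and StrainC (101), and one gene each for 011, 100, 010 and 001. A pattern with no genes, such as 110 here, simply does not appear in the output.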
