Hi,
I am working on some whole-exome data that were annotated using the web-based ANNOVAR (wANNOVAR). I downloaded the .csv output. When I open the file in Excel, a) my laptop slows down quite a lot, and b) there are so many columns that I sometimes make errors. Below is a sample of the columns I'm focusing on:
Gene.refGene ExonicFunc.refGene AAChange.refGene 1000G_ALL ExAC_Freq CADD_phred gnomAD_exome_ALL gnomAD_genome_ALL
PRAMEF10 nonsynonymous_SNV PRAMEF10:NM_001039361:exon3:c.A298T:p.I100F 0.00005 0.0002 27.2 1.06E-10 .
AURKA nonsynonymous_SNV AURKA:NM_0010391941:exon12:c.G893T:p.A298S 0.52 0.31 19.8 0.38 0.302
I want to filter the rows using the columns from the second one onward, keeping only nonsynonymous mutations (ExonicFunc.refGene is nonsynonymous_SNV) for which 1000G_ALL, ExAC_Freq, gnomAD_exome_ALL, and gnomAD_genome_ALL are each either less than 0.01 or equal to "." and for which CADD_phred is >= 20.
Is there a way to do all of this on the terminal without having to open Excel?
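To make the filter concrete, here is a rough Python sketch of the logic described above, using only the standard library. It rests on several assumptions: the file is a comma-separated file with a header row containing exactly the column names shown in the sample, "." marks a missing value, and the functional annotation may be written either as nonsynonymous_SNV or nonsynonymous SNV. The function names and the file names in the usage line are hypothetical, and rows with a missing or non-numeric CADD_phred are dropped, which is a choice rather than something stated in the question.

```python
import csv
import sys

# Column names are taken from the sample above; adjust if the real header differs.
FREQ_COLUMNS = ["1000G_ALL", "ExAC_Freq", "gnomAD_exome_ALL", "gnomAD_genome_ALL"]
FUNC_COLUMN = "ExonicFunc.refGene"
CADD_COLUMN = "CADD_phred"
MAX_FREQ = 0.01
MIN_CADD = 20.0


def is_rare(value):
    """Return True if the frequency is missing ('.') or below MAX_FREQ."""
    value = (value or "").strip()
    if value == ".":
        return True
    try:
        return float(value) < MAX_FREQ
    except ValueError:
        return False  # unexpected text: treat as not passing


def passes_cadd(value):
    """Return True if the CADD score is numeric and at least MIN_CADD."""
    try:
        return float((value or "").strip()) >= MIN_CADD
    except ValueError:
        return False  # assumption: '.' or non-numeric scores are excluded


def keep(row):
    """Apply all three conditions to one row (a dict keyed by column name)."""
    # Assumption: accept both 'nonsynonymous_SNV' and 'nonsynonymous SNV'.
    func = (row.get(FUNC_COLUMN) or "").strip().replace("_", " ")
    return (
        func == "nonsynonymous SNV"
        and all(is_rare(row.get(col)) for col in FREQ_COLUMNS)
        and passes_cadd(row.get(CADD_COLUMN))
    )


def filter_file(in_path, out_path):
    """Stream the input CSV and write only the rows that pass the filter."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if keep(row):
                writer.writerow(row)


if __name__ == "__main__" and len(sys.argv) == 3:
    filter_file(sys.argv[1], sys.argv[2])
```

Saved as, say, filter_variants.py, this could be run from the terminal as `python filter_variants.py input.csv filtered.csv`. Because rows are read and written one at a time, the whole file never has to be loaded at once, and all original columns are preserved in the output.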
P.S. For ease of viewing, I have not included all the columns present in the file.
Thanks in advance.