filter gene expression matrix
5.8 years ago
Sam ▴ 150

Hi

I have an expression table in this format:

Geneid  1-FPKM  2-FPKM  3-FPKM  1-T-GB3-FPKM    1-T-GM58-FPKM   1-T-MB43-FPKM   34-O-GB3-FPKM   34-O-GM58-FPKM  34-O-MB43-FPKM  34-T-GB3-FPKM   34-T-GM58-FPKM  34-T-MB43-FPKM
A00218.v2.0 0   0   0   0   0   0.090523725 0   0   0   0   0   0
A00223.v2.0 0   1.09881251  0   0   0.578112669 0.337275538 0.625452544 1.409278785 0.204600602 0   0.051926075 0
A00226.v2.0 0   0   0   0   0   0   0   0   0   0   0   0

Desired output:

A00223.v2.0 0   1.09881251  0   0   0.578112669 0.337275538 0.625452544 1.409278785 0.204600602 0   0.051926075 0

How can I keep only the genes that have an FPKM over 1 in at least one sample?

Thanks

gene-expression filter • 1.5k views
Please show what you have tried.

5.8 years ago
$ awk -v FS="\t" 'NR>1 { for(i=2;i<=NF;i++) { if($i>=1) { print; break; }}}' input_file

From the second line onwards, awk iterates over all sample fields in the line. If any value is greater than or equal to 1, the line is printed and awk moves on to the next line. The header line is skipped (NR>1), so it does not appear in the output.

For readability, you can put the awk code in a separate file, e.g. filter.awk:

NR>1 { 
    for(i=2; i<=NF; i++) { 
        if($i>=1) { 
            print; 
            break;
        }
    }
}

Run awk like this:

$ awk -v FS="\t" -f filter.awk input_file
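The filter above uses >= 1 and drops the header, while the question asks for FPKM "over 1". A possible variant, sketched under assumptions, keeps the header line and uses a strict > 1 comparison. The file names sample_fpkm.tsv and filtered.tsv and the shortened three-sample table are hypothetical, made up only so the example can run on its own; the real input is assumed to be tab-separated like the table in the question.

```shell
# Hypothetical, shortened sample table (tab-separated), modelled on the question.
printf 'Geneid\tS1\tS2\tS3\n'                         >  sample_fpkm.tsv
printf 'A00218.v2.0\t0\t0.090523725\t0\n'             >> sample_fpkm.tsv
printf 'A00223.v2.0\t0\t1.09881251\t0.578112669\n'    >> sample_fpkm.tsv
printf 'A00226.v2.0\t0\t0\t0\n'                       >> sample_fpkm.tsv

# NR==1: print the header unchanged and move on.
# Other lines: print the row as soon as one sample value is strictly > 1.
# "$i+0" forces a numeric (not string) comparison.
awk -v FS="\t" '
NR==1 { print; next }
{
    for (i = 2; i <= NF; i++) {
        if ($i + 0 > 1) { print; next }
    }
}' sample_fpkm.tsv > filtered.tsv

cat filtered.tsv
```

If a value of exactly 1 should also pass, change the comparison back to >= 1 as in the original command.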

fin swimmer
