How to fillna() in Pandas pivoted tables
6.6 years ago
Eric Lim ★ 2.2k

I don't spend much time using pandas. I spent a couple of hours yesterday searching on Google but couldn't get it done. While I came up with a hacky workaround, I thought I'd ask the experts here for the pandas way.

The following is a contrived example but should show what I want to accomplish.

import pandas as pd

data = [{'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping', 'rep': 'A', 'age': 'E14', 'psj': 1, 'gene_expr': 10},
        {'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping', 'rep': 'B', 'age': 'E14', 'psj': 2, 'gene_expr': 10},
        {'gene': 'gene2', 'sj_coord': 'chr2:10-20', 'as_event': 'exon_inclusion', 'rep': 'A', 'age': 'E16', 'psj': 3, 'gene_expr': 30},
        {'gene': 'gene2', 'sj_coord': 'chr2:10-20', 'as_event': 'exon_inclusion', 'rep': 'B', 'age': 'E16', 'psj': 4, 'gene_expr': 30}]

df = pd.DataFrame(data) \
       .pivot_table(index=['gene', 'sj_coord', 'as_event', 'rep'], columns=['age'], values=['psj', 'gene_expr'])

After pivoting:

                                    gene_expr        psj     
age                                       E14   E16  E14  E16
gene  sj_coord   as_event       rep                          
gene1 chr1:1-2   exon_skipping  A        10.0   NaN  1.0  NaN
                                B        10.0   NaN  2.0  NaN
gene2 chr2:10-20 exon_inclusion A         NaN  30.0  NaN  3.0
                                B         NaN  30.0  NaN  4.0

I'd like to fill the NaN values under gene_expr with the actual data for each gene across the different rep and age combinations. That information exists, just not in the input data I pivoted from. I wonder how I can do it within the pandas ecosystem.

Thanks!


So you want to fill, for example, the first gene_expr NaN (age = E16, rep = A, gene = gene1) with information you don't have in data?

"That information exists, just not in the input data I pivoted from"

Where do you have this information?


Perhaps it's better to use the contrived example below:

data = [{'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping', 'rep': 'A', 'age': 'E14', 'psj': 1, 'gene_expr': 10},
        {'gene': 'gene1', 'sj_coord': 'chr1:10-20', 'as_event': 'exon_inclusion', 'rep': 'A', 'age': 'E16', 'psj': 2, 'gene_expr': 10}]


df = pd.DataFrame(data) \
       .pivot_table(index=['gene', 'sj_coord', 'as_event', 'rep'], columns=['age'], values=['psj', 'gene_expr'])

                                    gene_expr        psj     
age                                       E14   E16  E14  E16
gene  sj_coord   as_event       rep                          
gene1 chr1:1-2   exon_skipping  A        10.0   NaN  1.0  NaN
      chr1:10-20 exon_inclusion A         NaN  10.0  NaN  2.0

In this case, we identified two distinct splicing events (sj_coord) in the same gene: one found exclusively in E14 and the other exclusively in E16. We have gene_expr for gene1 at both ages, as shown in the pivoted table, but if an event is not identified in an age group, it's not reported in the input data. I would like to fill in those values, either from within the pivoted table or by supplying an additional data structure to it.
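For this second example, where the missing gene_expr values already appear in other rows of the same gene, one possible way to fill them from within the pivoted table is to broadcast the first non-null value per gene and age. This is only a sketch: it assumes gene_expr is constant for a given gene and age across events and replicates, and the names `pivoted`, `gene_expr` and `filled` are introduced here purely for illustration.

```python
import pandas as pd

data = [
    {'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping',
     'rep': 'A', 'age': 'E14', 'psj': 1, 'gene_expr': 10},
    {'gene': 'gene1', 'sj_coord': 'chr1:10-20', 'as_event': 'exon_inclusion',
     'rep': 'A', 'age': 'E16', 'psj': 2, 'gene_expr': 10},
]

pivoted = pd.DataFrame(data).pivot_table(
    index=['gene', 'sj_coord', 'as_event', 'rep'],
    columns=['age'],
    values=['psj', 'gene_expr'],
)

# Assumption: gene_expr depends only on (gene, age), so within each gene the
# first non-null value in an age column can be copied to every row of that gene.
gene_expr = pivoted['gene_expr'].groupby(level='gene').transform('first')

# Reassemble the two blocks; psj is left untouched, so its NaN values remain.
filled = pd.concat({'gene_expr': gene_expr, 'psj': pivoted['psj']}, axis=1)
print(filled)
```

Note that this only helps when some row of the same gene carries the value. In the first example, gene1 has no E16 expression anywhere in the table, so there is nothing to broadcast and the NaN would stay.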

Does this make more sense?


If you know what will be missing before pivoting the dataframe, why not fill in the dataframe with your new data and then pivot? I'm bothered by the fact that you want to fill the dataframe after the pivot.

"I would like to fill the values, either from within the pivoted table, or by supplying additional data structure to it."

What do you have in mind? Do you already have the supplementary data in another dataframe?
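If the supplementary data does live in another dataframe, filling before the pivot could look roughly like the sketch below. The `expr` table (one expression value per gene and age) is hypothetical and its values are made up for illustration; `events`, `index_cols` and `full` are likewise just names chosen here.

```python
import pandas as pd

data = [
    {'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping',
     'rep': 'A', 'age': 'E14', 'psj': 1, 'gene_expr': 10},
    {'gene': 'gene1', 'sj_coord': 'chr1:10-20', 'as_event': 'exon_inclusion',
     'rep': 'A', 'age': 'E16', 'psj': 2, 'gene_expr': 10},
]
events = pd.DataFrame(data)

# Hypothetical supplementary table: expression per gene and age (made-up values).
expr = pd.DataFrame({
    'gene': ['gene1', 'gene1'],
    'age': ['E14', 'E16'],
    'gene_expr': [10, 10],
})

index_cols = ['gene', 'sj_coord', 'as_event', 'rep']

# Pair every observed event/replicate with every age that has expression data.
full = events[index_cols].drop_duplicates().merge(expr, on='gene')

# Attach psj where the event was actually observed; the other rows keep NaN.
full = full.merge(events[index_cols + ['age', 'psj']],
                  on=index_cols + ['age'], how='left')

pivoted = full.pivot_table(index=index_cols, columns=['age'],
                           values=['psj', 'gene_expr'])
print(pivoted)
```

With this layout gene_expr is complete after the pivot, while psj stays NaN for ages where the event was not identified.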


"If you know what will be missing before pivoting the dataframe, why not fill in the dataframe with your new data and then pivot?"

That's what I ended up doing, which I think was rather unnecessary. I thought there had to be a simple way to fill values after pivoting.

Thank you for your help.
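For later readers: filling after the pivot also seems possible when the expression values are available in a separate table, since `DataFrame.fillna` accepts another DataFrame and aligns it on index and columns. The sketch below is not from the thread. It assumes a hypothetical per-gene, per-age table `expr` with made-up values, and `expr_wide`, `lookup` and `filled` are illustrative names.

```python
import pandas as pd

data = [
    {'gene': 'gene1', 'sj_coord': 'chr1:1-2', 'as_event': 'exon_skipping',
     'rep': 'A', 'age': 'E14', 'psj': 1, 'gene_expr': 10},
    {'gene': 'gene1', 'sj_coord': 'chr1:10-20', 'as_event': 'exon_inclusion',
     'rep': 'A', 'age': 'E16', 'psj': 2, 'gene_expr': 10},
]
pivoted = pd.DataFrame(data).pivot_table(
    index=['gene', 'sj_coord', 'as_event', 'rep'],
    columns=['age'],
    values=['psj', 'gene_expr'],
)

# Hypothetical supplementary table: expression per gene and age (made-up values).
expr = pd.DataFrame({
    'gene': ['gene1', 'gene1'],
    'age': ['E14', 'E16'],
    'gene_expr': [10, 10],
})
expr_wide = expr.pivot(index='gene', columns='age', values='gene_expr')

# Look up each pivoted row's gene in the wide table, then reuse the pivoted
# index so both frames align row for row.
lookup = expr_wide.reindex(pivoted.index.get_level_values('gene'))
lookup.index = pivoted.index

# Keep values that are already present and fill only the gaps.
gene_expr = pivoted['gene_expr'].fillna(lookup)
filled = pd.concat({'gene_expr': gene_expr, 'psj': pivoted['psj']}, axis=1)
print(filled)
```

Genes missing from `expr` would simply stay NaN, because the reindex step produces NaN rows for them.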
