groupby_fillna_for special columns

Well, there is no problem to merge data and fillna.

sometimes we want to fillna with groupby, more specifically, we need to fillna with different methods, some are ffill, while others are fillna(0).

the following code can deal with this situation.

# -*- coding: utf-8 -*-
“””
groupby_fillna_special columns
https://stackoverflow.com/questions/21284585
“””

import pandas as pd
from io import StringIO

# fillna with groupby
csvdata=”””
STK_ID RPT_Date sales opr_pft net_pft
002138 20130331 2.0703 0.3373 0.2829
002138 20130630 NaN NaN NaN
002138 20130930 7.4993 1.2248 1.1630
002138 20140122 NaN NaN NaN

600004 20130331 11.8429 3.0816 2.1637
600004 20130630 24.6232 6.2152 4.5135
600004 20130930 37.9673 9.2088 6.6463
600004 20140122 NaN NaN NaN

600809 20130331 27.9517 9.9426 7.5182
600809 20130630 40.6460 13.9414 9.8572
600809 20130930 53.0501 16.8081 11.8605
600809 20140122 NaN NaN NaN
“””
sio=StringIO(csvdata)

df=pd.read_csv(sio, sep=’\s+’,header=0,dtype={“STK_ID”:object,”RPT_Date”:object,
“sales”:float,’apr_pft’:float,
‘net_pft’:float})
df.set_index([‘STK_ID’,’RPT_Date’],inplace=True)

#ffill all columns with groupby
#DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)[source]
df.groupby(level=0).apply(lambda grp: grp.fillna(method=’ffill’))

#fillnow the lastrow with groupby
def f(g):
last = len(g.values)-1
g.iloc[last,:] = g.iloc[last-1,:]
return g
df.groupby(level=0).apply(f)

# fill certain columns
df[[‘sales’,’opr_pft’]]=df.groupby(level=0)[[‘sales’,’opr_pft’]].fillna(method=’ffill’)
#or you can write as following
df[[‘sales’,’opr_pft’]]=df.groupby(level=0)[[‘sales’,’opr_pft’]].apply(lambda grp: grp.fillna(method=’ffill’))

分享到: 微信 新浪微博 更多
Print Friendly, PDF & Email

发表评论

您的电子邮箱地址不会被公开。 必填项已用*标注