pandas 技巧

时间:2019-11-01 11:36:35   收藏:0   阅读:82

 

find index of top 3 largest values of each column:

 

df1.apply(lambda s: pd.Series(s.nlargest(3).index))

 

map dataframe column

df["ItemIdx"] = df["question"].map(lambda x: itemMap.get(x,np.NaN))

 

load a dictionay from a save pkl file

with open ("l.pkl","rb") as f:
    itemMap= pickle.lead(f)

 

find the startpoint of each session (after sorted):

offset = np.zeros(df["sessinId"].nunique()+1,dtype=np.int32)
offset[1:] = df.groupby(‘sessinId‘).size().cumsum()

 

原文:https://www.cnblogs.com/pocahontas/p/11775756.html

评论(0
© 2014 bubuko.com 版权所有 - 联系我们:wmxa8@hotmail.com
打开技术之扣,分享程序人生!