Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
109 views
in Technique[技术] by (71.8m points)

python - Pandas select top 3 and append from another table

Hi I would like to take the top 3 numbers for each person and and append the reason_comments into them. If there is a tie, I would like to just take the first one.

May i know how can i do this in python please?

Table 1:
      id    VarA    VarB    VarC    VarD    VarE
        1   5        4       3       2       1
        2   4        6      21       5       5
        3   3        8      6        9       0
        4   7        8      23      44       0

Table 2: 
    reason_code reason_comment
    VarA        A is high
    VarB        B is high
    VarC        C is high
    VarD        D is high
    VarE        E is high


Results:
id  reason 1    reason 2    reason 3
1   A is high   B is high   C is high
2   C is high   B is high   D is high
3   D is high   B is high   C is high
4   D is high   C is high   B is high
question from:https://stackoverflow.com/questions/66058216/pandas-select-top-3-and-append-from-another-table

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

There are possible ties, so is necessary remove them. So you can reshape DataFrame by DataFrame.melt, sorting by DataFrame.sort_values and remove duplicated by DataFrame.drop_duplicates.

df1 = (df.melt('id')
        .sort_values(['id','value'], ascending=[True, False])
        .drop_duplicates(['id','value']))

Then for filter top3 is used GroupBy.cumcount for possible reuse it for new column names in DataFrame.pivot:

df1['g'] = df1.groupby('id').cumcount().add(1) 
    
df1 = df1[df1['g'].le(3)]

Also use Series.map for data by another DataFrame:

s = df2.set_index('reason_code')['reason_comment']
df1['variable'] = df1['variable'].map(s)

df1 = df1.pivot('id','g','variable').add_prefix('reason')

print (df)

g     reason1    reason2    reason3
id                                 
1   A is high  B is high  C is high
2   C is high  B is high  D is high
3   D is high  B is high  C is high
4   D is high  C is high  B is high

For convert id to column and remove g use:

df1 = df1.reset_index().rename_axis(None, axis=1)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...