python - pandas category that includes the closest greater value

Question

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'id': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c'], 'cumsum': [1, 3, 6, 9, 10, 4, 9, 11, 13, 5, 8, 19]})


   id   cumsum
0   a   1
1   a   3
2   a   6
3   a   9
4   a   10
5   b   4
6   b   9
7   b   11
8   b   13
9   c   5
10  c   8
11  c   19

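I would like to add a new category column such that, for a given input value and for each id, every row up to and including the closest cumsum value greater than or equal to the input falls into the first category (0), and the rows after it fall into the next category (1).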

For example:

input = 8

desired output:

    id  cumsum  category
0   a   1   0
1   a   3   0
2   a   6   0
3   a   9   0
4   a   10  1
5   b   4   0
6   b   9   0
7   b   11  1
8   b   13  1
9   c   5   0
10  c   8   0
11  c   19  1

question from:https://stackoverflow.com/questions/65881093/pandas-category-that-includes-the-closest-greater-value

1 Answer

深蓝 · Answer 1 · 2021-10-06T18:54:41+0000

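You can get the first value greater than or equal to the input for each id by filtering with Series.ge and then taking GroupBy.first. Map those per-id values back onto the rows with Series.map on id, compare with Series.gt, and finally convert the boolean mask to integers. This relies on cumsum being in ascending order within each id, as it is here: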

val = 8

s = df[df['cumsum'].ge(val)].groupby('id')['cumsum'].first()

df['category'] = df['cumsum'].gt(df['id'].map(s)).astype(int)
print(df)
   id  cumsum  category
0   a       1         0
1   a       3         0
2   a       6         0
3   a       9         0
4   a      10         1
5   b       4         0
6   b       9         0
7   b      11         1
8   b      13         1
9   c       5         0
10  c       8         0
11  c      19         1
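
To see why this works, it can help to look at the intermediate objects one step at a time. The following is only a sketch that rebuilds the example frame from the question; the names threshold_per_id, mapped and category are illustrative and do not appear in the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c'],
    'cumsum': [1, 3, 6, 9, 10, 4, 9, 11, 13, 5, 8, 19],
})
val = 8

# Step 1: keep rows with cumsum >= val, then take the first such value per id.
# This assumes cumsum is sorted ascending within each id.
threshold_per_id = df[df['cumsum'].ge(val)].groupby('id')['cumsum'].first()

# Step 2: broadcast each id's threshold back onto that id's rows.
mapped = df['id'].map(threshold_per_id)

# Step 3: rows strictly above their id's threshold go into category 1.
category = df['cumsum'].gt(mapped).astype(int)

print(threshold_per_id)
print(category.tolist())
```

Here threshold_per_id holds 9 for a, 9 for b and 8 for c, so only the rows strictly above those values end up in category 1.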

Another idea is to use Series.where with GroupBy.transform:

val = 8

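# mask values below val as NaN, then take the smallest remaining value per id
s1 = df['cumsum'].where(df['cumsum'].ge(val)).groupby(df['id']).transform('min')
# alternative: 'first' gives the same result when cumsum is ascending within each id
s1 = df['cumsum'].where(df['cumsum'].ge(val)).groupby(df['id']).transform('first')

df['category'] = df['cumsum'].gt(s1).astype(int)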
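
If this is needed in more than one place, the transform('min') variant could be wrapped in a small helper. The sketch below is an assumption-laden illustration rather than part of the original answer: the function name add_category, its parameters and the demo frame are all hypothetical. It also shows two edge cases: values that are not sorted within an id, and an id with no value greater than or equal to the input.

```python
import pandas as pd

def add_category(df, val, id_col='id', value_col='cumsum'):
    """Return 0/1 categories; hypothetical helper, not from the original answer."""
    # Values below val become NaN so they cannot be picked as the threshold.
    candidates = df[value_col].where(df[value_col].ge(val))
    # Smallest value >= val per id, aligned with the original rows.
    threshold = candidates.groupby(df[id_col]).transform('min')
    # Comparing against NaN is False, so ids without a threshold stay at 0.
    return df[value_col].gt(threshold).astype(int)

# Made-up data: id 'x' is unsorted, id 'y' never reaches the input value.
demo = pd.DataFrame({'id': ['x', 'x', 'x', 'y', 'y'],
                     'cumsum': [10, 3, 9, 1, 2]})
demo['category'] = add_category(demo, 8)
print(demo)
```

Using 'min' rather than 'first' means the result does not depend on the row order within each id.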
