python - Groupby value counts on the dataframe pandas

Question

asked Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I have the following dataframe:

df = pd.DataFrame([
    (1, 1, 'term1'),
    (1, 2, 'term2'),
    (1, 1, 'term1'),
    (1, 1, 'term2'),
    (2, 2, 'term3'),
    (2, 3, 'term1'),
    (2, 2, 'term1')
], columns=['id', 'group', 'term'])

I want to group it by id and group and count how many times each term occurs for every (id, group) pair.

In the end I want one row per (id, group) pair, with one count column per term.

I was able to achieve what I want by looping over all the rows with df.iterrows() and creating a new dataframe, but this is clearly inefficient. (If it helps, I know the list of all terms beforehand and there are ~10 of them).

It looks like I have to group by and then count values, so I tried df.groupby(['id', 'group']).value_counts(). That does not work, because value_counts is defined on a grouped Series and not on a grouped DataFrame.

Is there any way I can achieve this without looping?

1 Answer

深蓝 · Answer 1 · 2021-10-16T21:20:10+0000

I use groupby and size, then unstack the term level into columns:

df.groupby(['id', 'group', 'term']).size().unstack(fill_value=0)
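
For reference, here is a minimal, self-contained sketch of that one-liner applied to the sample frame from the question. The variable names (counts, via_value_counts, flat) are only illustrative, and the printed layout may differ slightly between pandas versions. The second variant is an assumption about what the asker's value_counts attempt was aiming for: selecting the term column first, so that value_counts runs on a grouped Series.

```python
import pandas as pd

df = pd.DataFrame([
    (1, 1, 'term1'),
    (1, 2, 'term2'),
    (1, 1, 'term1'),
    (1, 1, 'term2'),
    (2, 2, 'term3'),
    (2, 3, 'term1'),
    (2, 2, 'term1')
], columns=['id', 'group', 'term'])

# Count rows per (id, group, term), then move the innermost level (term)
# into columns; combinations that never occur are filled with 0.
counts = df.groupby(['id', 'group', 'term']).size().unstack(fill_value=0)
print(counts)
# term      term1  term2  term3
# id group
# 1  1          2      1      0
#    2          0      1      0
# 2  2          1      0      1
#    3          1      0      0

# Same idea via value_counts: pick the 'term' column so the call runs on a
# grouped Series. The sort_index calls only normalize row/column order.
via_value_counts = (
    df.groupby(['id', 'group'])['term']
      .value_counts()
      .unstack(fill_value=0)
      .sort_index()
      .sort_index(axis=1)
)

# Optional: turn the (id, group) index back into ordinary columns.
flat = counts.reset_index()
```

If a flat table is what you need, flat has the columns id, group, term1, term2, term3.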

Timing

1,000,000 rows

import numpy as np
import pandas as pd

# Random benchmark frame: 100 ids, 20 groups, 10 terms
df = pd.DataFrame(dict(id=np.random.choice(100, 1000000),
                       group=np.random.choice(20, 1000000),
                       term=np.random.choice(10, 1000000)))
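
The measurements themselves are not reproduced here, so below is a hedged sketch of how one might time this on a frame of the same shape. Everything in it beyond the groupby/size/unstack call is an assumption for illustration: the helper names (make_frame, time_candidates, CANDIDATES), the fixed seed, the use of timeit with three runs, and the two alternatives it compares against (value_counts on the selected column, and pd.crosstab). Actual numbers depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n_rows, seed=0):
    """Random frame with 100 ids, 20 groups and 10 terms (seed is an assumption)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "id": rng.integers(0, 100, n_rows),
        "group": rng.integers(0, 20, n_rows),
        "term": rng.integers(0, 10, n_rows),
    })


def size_unstack(df):
    # The approach from the answer.
    return df.groupby(["id", "group", "term"]).size().unstack(fill_value=0)


def value_counts_unstack(df):
    # Alternative: value_counts on the grouped 'term' Series.
    return df.groupby(["id", "group"])["term"].value_counts().unstack(fill_value=0)


def crosstab(df):
    # Alternative: let pandas build the contingency table directly.
    return pd.crosstab([df["id"], df["group"]], df["term"])


CANDIDATES = {
    "size + unstack": size_unstack,
    "value_counts + unstack": value_counts_unstack,
    "crosstab": crosstab,
}


def time_candidates(df, number=3):
    """Return the average wall-clock seconds per call for each candidate."""
    return {
        name: timeit.timeit(lambda fn=fn: fn(df), number=number) / number
        for name, fn in CANDIDATES.items()
    }


big = make_frame(1_000_000)
for name, seconds in time_candidates(big).items():
    print(f"{name:>24}: {seconds:.3f} s per call")
```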
