Welcome To Ask or Share your Answers For Others

python - Pandas filter values in two columns and sum?

asked Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a dataframe as follows:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'key1' : ['a','a','b','b','a'], 'key2' : ['b', 'b', 'b', 'a', 'b'], 'val' : np.random.randint(10, size=5)})
>>> df
  key1 key2  val
0    a    b    9
1    a    b    8
2    b    b    2
3    b    a    2
4    a    b    1

I am trying to get the total sum of the val column where either key1=='a' or key2=='a'. Here is what I have:

>>> total = (df[(df['key1']=='a') | (df['key2']=='a')]).sum()
>>> total
key1    aaba
key2    bbab
val       20
dtype: object

I have two questions:

1. How can I get only the final value of the sum (here, 20)?
2. When several columns are involved, is there a more efficient way to do this?

Question from: https://stackoverflow.com/questions/65895279/pandas-filter-values-in-two-columns-and-sum

Whoever fights dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…

1 Answer

answered Oct 7, 2021 by 深蓝 (71.8m points)

Select only the column you want to sum, so that the result is a single number rather than one value per column:

df.loc[(df['key1']=='a') | (df['key2']=='a'), 'val'].sum()
# out
# 20
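
As a reproducible check, here is a minimal sketch that hard-codes the values printed in the question (the original frame uses random integers, so the fixed `val` list is an assumption taken from that output). It shows that selecting `val` inside `.loc` yields a Series, whose `.sum()` is a scalar.

```python
import pandas as pd

# Values copied from the frame printed in the question (assumed fixed here,
# since the original was built with np.random.randint).
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['b', 'b', 'b', 'a', 'b'],
    'val': [9, 8, 2, 2, 1],
})

# Boolean mask: True where either key column equals 'a'
mask = (df['key1'] == 'a') | (df['key2'] == 'a')

# Row filter and column selection in one step; the result is a Series,
# so .sum() returns a single number instead of one entry per column.
total = df.loc[mask, 'val'].sum()
print(total)  # 20
```

The filtered rows are 0, 1, 3 and 4, so the sum is 9 + 8 + 2 + 1 = 20.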

For several columns:

cols = ['key1','key2']

# pass axis by keyword; the positional form any(1) raises a TypeError in pandas 2.x
df.loc[df[cols].eq('a').any(axis=1), 'val'].sum()
# same out
# 20
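
If this pattern comes up often, it could be wrapped in a small helper. The sketch below is only an illustration: `sum_where_any` is a hypothetical name, not a pandas function, and the frame again hard-codes the values printed in the question.

```python
import pandas as pd

def sum_where_any(df, cols, value, target):
    """Sum `target` over the rows where any column in `cols` equals `value`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Compare every listed column to `value`, then reduce across columns
    mask = df[cols].eq(value).any(axis=1)
    return df.loc[mask, target].sum()

# Values copied from the frame printed in the question (assumed fixed here)
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['b', 'b', 'b', 'a', 'b'],
    'val': [9, 8, 2, 2, 1],
})

print(sum_where_any(df, ['key1', 'key2'], 'a', 'val'))  # 20
print(sum_where_any(df, ['key1', 'key2'], 'b', 'val'))  # 22
```

Adding more key columns only means extending the `cols` list; the mask is still built in one vectorized comparison rather than by chaining `|` conditions by hand.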
