Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas filter values in two columns and sum?

I have a dataframe as follows:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'key1' : ['a','a','b','b','a'], 'key2' : ['b', 'b', 'b', 'a', 'b'], 'val' : np.random.randint(10, size=5)})
>>> df
  key1 key2  val
0    a    b    9
1    a    b    8
2    b    b    2
3    b    a    2
4    a    b    1

I am trying to get the total sum of the val column where either key1=='a' or key2=='a'. Here is what I have:

>>> total = (df[(df['key1']=='a') | (df['key2']=='a')]).sum()
>>> total
key1    aaba
key2    bbab
val       20
dtype: object

I have two questions:

  1. How can I get only the final value of the sum (here, 20)?
  2. When several columns are involved, is there a more efficient way to do this operation?
Question from: https://stackoverflow.com/questions/65895279/pandas-filter-values-in-two-columns-and-sum

Fight the dragon too long and you become the dragon yourself; gaze into the abyss too long and the abyss gazes back…

1 Answer

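  1. Select only the column you want to sum, so that the string columns are never summed:
df.loc[(df['key1']=='a') | (df['key2']=='a'), 'val'].sum()
# out
# 20
  2. For several columns, compare them all at once and keep the rows where any column matches:
cols = ['key1','key2']

# axis must be passed by keyword; the positional form any(1) fails on pandas 2.0 and later
df.loc[df[cols].eq('a').any(axis=1), 'val'].sum()
# same out
# 20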
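
The two approaches above can be checked side by side with a small self-contained sketch. It assumes the fixed values printed in the question (the original frame was built with random integers, so they are hard-coded here), and the names mask_two, mask_many, total_two and total_many are illustrative only, not part of the original answer.

```python
import pandas as pd

# Values copied from the frame printed in the question; the original
# used np.random.randint, so they are hard-coded here for repeatability.
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['b', 'b', 'b', 'a', 'b'],
    'val': [9, 8, 2, 2, 1],
})

# 1. Build a boolean row mask, then select only 'val' before summing.
mask_two = (df['key1'] == 'a') | (df['key2'] == 'a')
total_two = df.loc[mask_two, 'val'].sum()

# 2. Build the same mask from a list of columns.
cols = ['key1', 'key2']
mask_many = df[cols].eq('a').any(axis=1)
total_many = df.loc[mask_many, 'val'].sum()

print(total_two, total_many)  # 20 20
```

The first attempt in the question returned 'aaba' and 'bbab' because calling sum() on the whole filtered frame also sums the string columns, which concatenates them. Selecting 'val' inside .loc avoids that and returns a single number.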
