I have a dataframe as follows:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'key1' : ['a','a','b','b','a'], 'key2' : ['b', 'b', 'b', 'a', 'b'], 'val' : np.random.randint(10, size=5)})
>>> df
  key1 key2  val
0    a    b    9
1    a    b    8
2    b    b    2
3    b    a    2
4    a    b    1
I am trying to get the total sum of the val column where either key1=='a' or key2=='a'. Here is what I have:
>>> total = (df[(df['key1']=='a') | (df['key2']=='a')]).sum()
>>> total
key1    aaba
key2    bbab
val       20
dtype: object
I have two questions:
- How can I get only the final value of the sum (here, 20)?
- For a case with several columns, is there a more efficient way to do this operation?
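One possible approach to both points, as a minimal sketch rather than a definitive answer: select only the val column for the matching rows before summing, so the result is a single number instead of a per-column Series. For several key columns, the comparison can be applied to all of them at once and combined with any(axis=1). The names mask, key_cols, mask_any and total_any are illustrative, and the val values are hard-coded to match the output shown above instead of being drawn at random.

```python
import pandas as pd

# Same frame as in the question, with 'val' fixed to the values printed above
# (an assumption made so the result is reproducible).
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['b', 'b', 'b', 'a', 'b'],
    'val': [9, 8, 2, 2, 1],
})

# Question 1: pick the 'val' column for the matching rows, then sum it.
# Summing a single column gives one number rather than a Series.
mask = (df['key1'] == 'a') | (df['key2'] == 'a')
total = df.loc[mask, 'val'].sum()

# Question 2: with several key columns, compare them all in one step.
# key_cols is a hypothetical list; extend it with any other columns to check.
key_cols = ['key1', 'key2']
mask_any = df[key_cols].eq('a').any(axis=1)
total_any = df.loc[mask_any, 'val'].sum()

print(total, total_any)
```

Both masks select the same rows here, so the two totals agree. Whether the second form is actually faster would need to be measured on the real data; its main benefit is that the list of columns can grow without rewriting the condition.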
question from:
https://stackoverflow.com/questions/65895279/pandas-filter-values-in-two-columns-and-sum