python - Pandas data frame - filter rows by values from another data frame

Question

Welcome To Ask or Share your Answers For Others

python - Pandas data frame - filter rows by values from another data frame

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - Pandas data frame - filter rows by values from another data frame

Assume I have two pandas data frames, that their relevant columns are:

stimuli data frame:

   stimuli_id    rank         
0     23          0  
1     27          1 
2     62          2 
3     88          2 
4     99          1

while 'stimuli_id' is a unique index, and 'rank' is a integer in range of [0,2]. Relevant columns from trials data frame is:

     stim1     stim2        
0     23         27
1     27         62   
2     62         99

While both stim1 and stim2 represent stimuli_id from stimuli data frame.

Now I want to filter all rows in trials data frame where the rank of the second stimuli is greater. So the example above after filtering should look like this:

       stim1     stim2        
0       62         99

So eventually only for this trial stim1 is greater than stim2, and the rest are not so we filter them.

I have tried the following:

trials.loc[stimuli.loc[stimuli["stimuli id"] == trials["stim1"]].iloc[0]["rank"] > stimuli.loc[stimuli["stimuli id"] == trials["stim2"]].iloc[0]["rank"]]

But a value error has been raised:

{ValueError}Can only compare identically-labeled Series objects

I have been searching for hours for any solution but found nothing helpful.

question from:https://stackoverflow.com/questions/66052516/pandas-data-frame-filter-rows-by-values-from-another-data-frame

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-06T03:10:14+0000

Since 'stimuli_id' is a unique key for that DataFrame, use the Series to map the different stim columns to ranks and check the comparison. (By "rank of the second stimuli is greater" I assume you mean a smaller number).

s = stimuli.set_index('stimuli_id')['rank']

trials[trials['stim2'].map(s) < trials['stim1'].map(s)]
#   stim1  stim2
#2     62     99

With mapping each column we are logically creating the mask with the following comparison:

#rank2      rank1
#    1   <      0   # False
#    2   <      1   # False
#    1   <      2   # True

Categories

python - Pandas data frame - filter rows by values from another data frame

python - Pandas data frame - filter rows by values from another data frame

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags