Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Check whether combination of values from multiple dataframe's columns follow given pattern (rules) - pandas python

I have a dataframe, df_pattern, containing the pattern (rules):

import pandas as pd

df_pattern = pd.DataFrame({'SiteId': [4, 5, 6, 7, 8],
                           'ZoneId': [1, 1, 1, 2, 2]})

That pattern must be followed by another dataframe:

df_checked = pd.DataFrame({'SiteId': [6, 5, 7, 4, 8, 7, 5, 8, 6],
                           'ZoneId': [1, 1, 2, 2, 2, 2, 1, 1, 1]})

SiteId values 4, 5 and 6 must only be associated with ZoneId 1, and SiteId values 7 and 8 only with ZoneId 2. I'd like to find the rows that don't follow this pattern. The result should look like this:

index SiteId  ZoneId
3     4       2
7     8       1

Thank you.

question from:https://stackoverflow.com/questions/66059658/check-whether-combination-of-values-from-multiple-dataframes-columns-follow-giv


1 Answer

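  1. Add a helper column, pattern, to df_pattern to tag every row that belongs to the pattern.
  2. Left-join df_checked with df_pattern on the shared SiteId and ZoneId columns.
  3. Rows of df_checked where pattern is NaN after the join had no match in df_pattern, so they are the rows that break the pattern.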
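# Tag every valid (SiteId, ZoneId) pair.
df_pattern['pattern'] = 1
# The left join keeps all rows of df_checked; unmatched rows get NaN in 'pattern'.
dfn = pd.merge(df_checked, df_pattern, on=['SiteId', 'ZoneId'], how='left')
print(dfn.loc[dfn['pattern'].isnull(), ['SiteId', 'ZoneId']])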

       SiteId  ZoneId
    3       4       2
    7       8       1

print(dfn)

       SiteId  ZoneId  pattern
    0       6       1      1.0
    1       5       1      1.0
    2       7       2      1.0
    3       4       2      NaN
    4       8       2      1.0
    5       7       2      1.0
    6       5       1      1.0
    7       8       1      NaN
    8       6       1      1.0
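
As a possible alternative to the helper column, the same left join can be done with merge's indicator=True option, which adds a _merge column marking unmatched rows as 'left_only'. This is only a sketch under the assumption that each (SiteId, ZoneId) pair appears at most once in df_pattern, so the merged index still lines up with df_checked; the names merged and violations are made up for illustration.

```python
import pandas as pd

df_pattern = pd.DataFrame({'SiteId': [4, 5, 6, 7, 8],
                           'ZoneId': [1, 1, 1, 2, 2]})
df_checked = pd.DataFrame({'SiteId': [6, 5, 7, 4, 8, 7, 5, 8, 6],
                           'ZoneId': [1, 1, 2, 2, 2, 2, 1, 1, 1]})

# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for rows of df_checked with no matching pattern row.
merged = df_checked.merge(df_pattern, on=['SiteId', 'ZoneId'],
                          how='left', indicator=True)

# Keep only the rows that have no counterpart in df_pattern.
violations = merged.loc[merged['_merge'] == 'left_only', ['SiteId', 'ZoneId']]
print(violations)
```

This avoids modifying df_pattern in place, which may matter if the pattern dataframe is reused elsewhere.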
