Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Create a new variable in a dataframe based on an existing variable

I have a dataframe and want to create a new string column called CLASS based on the RISK_RATING values. If RISK_RATING contains 'PEP', CLASS should be 'PEP'; otherwise CLASS should be 'SF'. If RISK_RATING is missing, CLASS should be 'missing'.

Here is a sample of my dataframe:
    BUSINESS    CUSTOMER_ID   RISK_RATING
0   PVB          1000033280     HR
1   PVB          1000166304     PEP (SR)
2   PVB          1004006928     PEP (SR)
3   PVB          1004006936     PEP (SR)
Question from: https://stackoverflow.com/questions/65924140/create-a-new-variable-in-a-dataframe-based-on-an-existing-variable


1 Answer

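I have simplified this as much as possible. Try the following two np.where calls: the first assigns 'PEP' or 'SF', and the second overwrites rows that have no RISK_RATING with 'missing':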

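import numpy as np
# Rows whose RISK_RATING contains 'PEP' get 'PEP'; all other rows get 'SF'
df['CLASS'] = np.where(df['RISK_RATING'].str.contains('PEP', na=False), 'PEP', 'SF')
# Rows with no RISK_RATING are relabelled 'missing'
df['CLASS'] = np.where(df['RISK_RATING'].isnull(), 'missing', df['CLASS'])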

For the sample data, this gives:

  BUSINESS  CUSTOMER_ID RISK_RATING CLASS
0      PVB   1000033280          HR    SF
1      PVB   1000166304    PEP (SR)   PEP
2      PVB   1004006928    PEP (SR)   PEP
3      PVB   1004006936    PEP (SR)   PEP
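
The sample above has no missing RISK_RATING, so the 'missing' branch is never exercised. As a rough sketch, the same three-way rule can also be written in one step with np.select, which takes the first matching condition for each row. This is an alternative to the two np.where calls, not part of the original answer; the fifth row (its CUSTOMER_ID and its empty RISK_RATING) is hypothetical and added only to show the 'missing' case.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, plus one HYPOTHETICAL row with no RISK_RATING
df = pd.DataFrame({
    "BUSINESS": ["PVB"] * 5,
    "CUSTOMER_ID": [1000033280, 1000166304, 1004006928, 1004006936, 1004006944],
    "RISK_RATING": ["HR", "PEP (SR)", "PEP (SR)", "PEP (SR)", None],
})

rating = df["RISK_RATING"]

# Conditions are checked in order; the first one that is True decides the label
conditions = [
    rating.isna(),                         # no rating at all
    rating.str.contains("PEP", na=False),  # rating mentions PEP
]
choices = ["missing", "PEP"]

# Rows matching neither condition fall back to the default, 'SF'
df["CLASS"] = np.select(conditions, choices, default="SF")

print(df)
```

Passing na=False to str.contains makes missing values count as "no match" rather than NaN, so the order of the conditions alone decides which label a row receives.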
