Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to categorize text in a pandas dataframe based on the number of positive and negative keywords

I have a pandas data frame which looks like this:

    Tweets                          negative_keywords  positive_keywords
0   Şanlıurfa'da DAEŞ ile           []                 []
1   Hacettepe üni. Araştırması      []                 []
2   Kadına şiddetin suç olduğu      [suç]              []
3   Suriyeli'lerin fal bakabilme    []                 []
4   Hastaneye git Suriyeli. PTT ye  []                 [kardeşi]

I want to add a new column which is called neutral_keywords.

If negative_keywords and positive_keywords are [] at the same time, neutral_keywords should take the value of [neutral].

Otherwise, it should take the value of [].

Then I want to add another column, keyword_category, that labels each row as positive, negative, or neutral based on these three columns.

So, my new pandas data frame should look like this:

    Tweets                          negative_keywords  positive_keywords  neutral_keywords  keyword_category
0   Şanlıurfa'da DAEŞ ile           []                 []                 [neutral]         neutral
1   Hacettepe üni. Araştırması      []                 []                 [neutral]         neutral
2   Kadına şiddetin suç olduğu      [suç]              []                 []                negative
3   Suriyeli'lerin fal bakabilme    []                 []                 [neutral]         neutral
4   Hastaneye git Suriyeli. PTT ye  []                 [kardeşi]          []                positive

How can I do that?

  asked by Reyhan, translated from SO

1 Answer

# df is the DataFrame from the question
# mask is True for rows where both keyword lists are empty
mask = df[['negative_keywords', 'positive_keywords']].apply(
    lambda row: all(not lst for lst in row), axis=1)
# build the column row by row so that every cell gets its own list object
# ([[]] * len(mask) would place the same list object in every row)
df['neutral_keywords'] = [['neutral'] if empty else [] for empty in mask]
df
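
The code above stops at neutral_keywords. The keyword_category column from the question could be derived by comparing how many keywords each list holds. What follows is only a sketch under stated assumptions: the tweet texts and the keywords 'bad' and 'good' are placeholder values, and labelling a tie between non-empty lists as 'neutral' is an assumption, since the question does not say what should happen in that case.

```python
import numpy as np
import pandas as pd

# Placeholder data shaped like the question's frame (values are hypothetical).
df = pd.DataFrame({
    "Tweets": ["tweet 0", "tweet 1", "tweet 2", "tweet 3", "tweet 4"],
    "negative_keywords": [[], [], ["bad"], [], []],
    "positive_keywords": [[], [], [], [], ["good"]],
})

# Series.str.len() also works on list elements: keyword count per row.
n_neg = df["negative_keywords"].str.len()
n_pos = df["positive_keywords"].str.len()

# neutral_keywords: ['neutral'] only when both lists are empty.
both_empty = (n_neg == 0) & (n_pos == 0)
df["neutral_keywords"] = [["neutral"] if empty else [] for empty in both_empty]

# keyword_category: the larger count wins; ties fall back to 'neutral'
# (assumption, the question only covers the one-sided cases).
df["keyword_category"] = np.select(
    [n_neg > n_pos, n_pos > n_neg],
    ["negative", "positive"],
    default="neutral",
)

print(df[["negative_keywords", "positive_keywords",
          "neutral_keywords", "keyword_category"]])
```

Comparing counts rather than checking only for emptiness matches the title of the question ("based on the number of positive and negative keywords"), and for the five rows shown it gives the same labels as the expected output.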
