Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


pandas - trying to delete some data in a column using Python

I'm trying to remove some of the data in my world_rank column. Below is what I did.

Here is a preview of the data:

     world_rank                        university_name  
0             1                     Harvard University   
1             2     California Institute of Technology   
2             3  Massachusetts Institute of Technology   
3             4                    Stanford University   
4             5                   Princeton University   
...         ...                                    ...   
2598    601-800                    Yeungnam University   
2599    601-800            Yıldız Technical University   
2600    601-800               Yokohama City University   
2601    601-800           Yokohama National University   
2602    601-800                     Yuan Ze University   

                       country  teaching international  research  citations  
0     United States of America      99.7          72.4      98.7       98.8   
1     United States of America      97.7          54.6      98.0       99.9   
2     United States of America      97.8          82.3      91.4       99.9   
3     United States of America      98.3          29.5      98.1       99.2   
4     United States of America      90.9          70.3      95.4       99.9   
...                        ...       ...           ...       ...        ...   
2598               South Korea      18.6          24.3      10.9       26.5   
2599                    Turkey      14.5          14.9       7.6       19.3   
2600                     Japan      24.0          16.1      10.2       36.4   
2601                     Japan      20.1          23.3      16.0       13.5   
2602                    Taiwan      16.2          17.7      18.3       28.6   

     income total_score  num_students  student_staff_ratio  
0      34.5        96.1       20152.0                  8.9   
1      83.7          96        2243.0                  6.9   
2      87.5        95.6       11074.0                  9.0   
3      64.3        94.3       15596.0                  7.8   
4         -        94.2        7929.0                  8.4   
...     ...         ...           ...                  ...   
2598   35.4           -       21958.0                 15.3   
2599     44           -       31268.0                 28.7   
2600   37.9           -        4122.0                  3.7   
2601   40.4           -       10117.0                 12.1   
2602   39.8           -        8663.0                 20.6   

     international_students female_male_ratio  year  
0                       25%               NaN  2011  
1                       27%      33 : 67 : 00  2011  
2                       33%           37 : 63  2011  
3                       22%          42:58:00  2011  
4                       27%          45:55:00  2011  
...                     ...               ...   ...  
2598                     3%          48:52:00  2016  
2599                     2%           36 : 64  2016  
2600                     3%               NaN  2016  
2601                     8%           28 : 72  2016  
2602                     4%          43:57:00  2016  

[2603 rows x 14 columns]

Then I tried to remove the rows with the following world_rank values:

data.drop(["201-225","226-250","251-275","276-300","350-400","301-350","351-400","401-500","501-600","601-800"], inplace = True ) 

But this error was raised:

KeyError: "['201-225' '226-250' '251-275' '276-300' '350-400' '301-350' '351-400'
 '401-500' '501-600' '601-800'] not found in axis"

Could someone help me with this? You can get the dataset here: https://drive.google.com/file/d/1ozF5tX-JAWyy3YQd6_MbgsrGYDmi_W5_/view?usp=sharing

P.S. I'm new to Python and I'm still a student.

Question from: https://stackoverflow.com/questions/65872917/try-to-delete-some-data-in-a-column-using-python


1 Answer


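By default, drop looks for the given labels in the index (axis 0), not in the values of a column, which is why the KeyError is raised. You can set world_rank as the index, and then the drop will work: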

# Use world_rank as the index so its values become row labels
data.set_index('world_rank', inplace=True)
data.drop(["201-225","226-250","251-275","276-300","350-400","301-350","351-400","401-500","501-600","601-800"], inplace=True)

# If you want, you can reset the index back to the default one:
data.reset_index(inplace=True)
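
To see why the original call failed and why setting the index helps, here is a minimal sketch on a made-up three-row frame (the toy values are an assumption, not the real dataset). It only illustrates that drop matches labels against the index, so a string rank is found only once world_rank is the index.

```python
import pandas as pd

# Toy stand-in for the real dataset (hypothetical values).
data = pd.DataFrame({
    "world_rank": ["1", "2", "601-800"],
    "university_name": [
        "Harvard University",
        "Stanford University",
        "Yuan Ze University",
    ],
})

# With the default index (0, 1, 2) there is no row labelled "601-800",
# so drop raises a KeyError, as in the question.
try:
    data.drop(["601-800"])
    raised = False
except KeyError:
    raised = True

# Once world_rank is the index, the same label can be dropped.
cleaned = data.set_index("world_rank").drop(["601-800"]).reset_index()
print(raised)
print(cleaned)
```

This variant chains the calls instead of using inplace=True, so the original data frame is left unchanged.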

Note: I checked the original dataset, and it contains two other range labels that you may want to exclude as well: ['201-250', '251-300']
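
As an alternative that leaves the index alone, the rows can be filtered with a boolean mask. The sketch below runs on toy data (hypothetical values, not the real file); the ranges list reuses the labels from the question plus the two extra segments from the note. Unlike drop, isin does not raise when a label never occurs in the column. The second variant assumes every banded rank contains a hyphen, which may not hold for other datasets.

```python
import pandas as pd

# Toy stand-in for the real dataset (hypothetical values).
data = pd.DataFrame({
    "world_rank": ["1", "2", "201-250", "251-300", "601-800"],
    "year": [2011, 2011, 2011, 2011, 2016],
})

# Labels from the question, plus the two extra segments from the note.
ranges = [
    "201-225", "226-250", "251-275", "276-300", "350-400",
    "301-350", "351-400", "401-500", "501-600", "601-800",
    "201-250", "251-300",
]

# Keep only the rows whose world_rank is NOT one of the range labels.
filtered = data[~data["world_rank"].isin(ranges)].reset_index(drop=True)

# Variant: drop every rank that contains a hyphen, without listing labels.
is_range = data["world_rank"].astype(str).str.contains("-", regex=False)
by_pattern = data[~is_range].reset_index(drop=True)

print(filtered)
print(by_pattern)
```

Listing the labels explicitly is safer when only specific bands should go; the hyphen test is shorter but removes every banded rank.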

