I have a DataFrame with the following structure:
colA colB colC specialCol
1 2 3 old
1 2 3 old
1 1 2 old
1 2 3 new
2 3 4 new
I want to drop duplicates based on colA, colB and colC, but specialCol must not be ignored: a duplicate should be matched across different specialCol values, not within the same one. My intended output is:
colA colB colC specialCol
1 2 3 old
1 1 2 old
2 3 4 new
Only one 1 2 3 row should be removed from each of old and new; two old rows (or two new rows) should not be compared with each other. How can I achieve this in Pandas?
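One possible approach, sketched here under the assumption that specialCol only ever holds the two values old and new: number the repeated rows inside each specialCol group with `groupby(...).cumcount()`, then drop every row whose key columns and occurrence number show up more than once. The helper column name `occ` is made up for this illustration.

```python
import pandas as pd

# The example frame from above.
df = pd.DataFrame({
    "colA": [1, 1, 1, 1, 2],
    "colB": [2, 2, 1, 2, 3],
    "colC": [3, 3, 2, 3, 4],
    "specialCol": ["old", "old", "old", "new", "new"],
})

keys = ["colA", "colB", "colC"]

# Occurrence number of each row within its (keys, specialCol) group: 0, 1, 2, ...
# "occ" is a hypothetical helper column, not something pandas provides.
occ = df.groupby(keys + ["specialCol"]).cumcount()

# The n-th old copy and the n-th new copy of a key share (keys, occ),
# so keep=False removes both; unmatched extra copies survive.
result = (
    df.assign(occ=occ)
      .drop_duplicates(subset=keys + ["occ"], keep=False)
      .drop(columns="occ")
      .reset_index(drop=True)
)
print(result)
```

Because the occurrence number is computed per specialCol value, two old rows never share it and so are never compared with each other. If specialCol could take a third value, this sketch would need to be rethought.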
EDIT:
Alternatively, the following would also work for me. I have two DataFrames that I am merging, and the merge should remove duplicates. However, for each duplicate row in frame 1, only one matching row in frame 2 should be removed, as shown below.
df1:
colA colB colC
1 2 3
1 5 4
1 1 2
df2:
colA colB colC
1 2 3
1 2 3
1 1 2
1 1 2
Result of df1 & df2 merge:
colA colB colC
1 2 3
1 5 4
1 1 2
Note that only one copy each of 1 2 3 and 1 1 2 is removed, because only one copy of each is available in df1. If there were two copies of 1 2 3 in df1, both copies of 1 2 3 would be removed.
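A minimal sketch of how this two-frame variant might be done, assuming a plain `pd.concat` followed by de-duplication is acceptable in place of a true `merge`: give each row an occurrence number within its own frame, concatenate, and drop every (row, occurrence) pair that appears in both frames. The function `with_occurrence` and the column `occ` are hypothetical names introduced for this example.

```python
import pandas as pd

df1 = pd.DataFrame({"colA": [1, 1, 1], "colB": [2, 5, 1], "colC": [3, 4, 2]})
df2 = pd.DataFrame({"colA": [1, 1, 1, 1], "colB": [2, 2, 1, 1], "colC": [3, 3, 2, 2]})

keys = ["colA", "colB", "colC"]

def with_occurrence(frame):
    """Add a hypothetical 'occ' column: 0 for the first copy of a row, 1 for the second, ..."""
    return frame.assign(occ=frame.groupby(keys).cumcount())

# The n-th copy of a row in df1 cancels the n-th copy of the same row in df2.
merged = (
    pd.concat([with_occurrence(df1), with_occurrence(df2)], ignore_index=True)
      .drop_duplicates(subset=keys + ["occ"], keep=False)
      .drop(columns="occ")
      .reset_index(drop=True)
)
print(merged)
```

This yields the same three rows as the expected result, though in concatenation order (1 5 4 first) rather than the order listed above; a `sort_values` call could be added if order matters.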