I am working on a retail dataset which has the following sample data:
I want to create an extra column that holds the total charge of two Ref_IDs, subject to these rules:
1. The two Ref_IDs are unique, picked at random, and have the same Month.
2. Once a Ref_ID has been picked, it cannot be reused in a later addition of charges (no duplicates allowed).
3. Month takes one of the values 202001, 202002, 202003, 202004, 202005, 202006.
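To make the three rules concrete, here is a minimal sketch of one way they could be read. It assumes the columns are named Ref_ID, Month and Charge, and that "no duplicates" means a Ref_ID is used at most once within a given month. The function name `pair_ref_ids`, the `seed` parameter and the sample rows are hypothetical, made up for illustration only.

```python
import numpy as np
import pandas as pd


def pair_ref_ids(data, seed=None):
    """Pair Ref_IDs at random within each Month, using each Ref_ID at most once per month."""
    rng = np.random.default_rng(seed)
    records = []
    for month, group in data.groupby("Month"):
        # Total charge per Ref_ID in this month (collapses any repeated rows).
        charges = group.groupby("Ref_ID")["Charge"].sum()
        # Shuffle the unique Ref_IDs, then take them two at a time.
        ids = rng.permutation(charges.index.to_numpy())
        # zip stops at the shorter slice, so an odd Ref_ID out is left unpaired.
        for a, b in zip(ids[0::2], ids[1::2]):
            records.append({
                "Month": month,
                "Ref_ID_1": a,
                "Ref_ID_2": b,
                "Total_bill": charges.loc[a] + charges.loc[b],
            })
    return pd.DataFrame(records, columns=["Month", "Ref_ID_1", "Ref_ID_2", "Total_bill"])


# Hypothetical sample rows; the real dataset is not reproduced here.
sample = pd.DataFrame({
    "Ref_ID": ["A", "B", "C", "D", "E", "A", "B"],
    "Month": [202001, 202001, 202001, 202001, 202001, 202002, 202002],
    "Charge": [10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 7.0],
})
pairs = pair_ref_ids(sample, seed=0)
print(pairs)
```

Shuffling once and then walking the shuffled list two at a time means no Ref_ID can be drawn twice, so there is no need to track which ones were already used.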
I started working on the code; right now it is in the following shape:
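import numpy as np

# `data` is the retail DataFrame described above
data = data.loc[data['service_availed'] == '2 to 3 Connections']
# pick two random row positions; randint excludes the upper bound
i = np.random.randint(0, data.shape[0])
j = np.random.randint(0, data.shape[0])
print(i, j)
print(data.Ref_ID.iloc[i])
print(data.Ref_ID.iloc[j])
print(data.Month.iloc[i])
print(data.Month.iloc[j])
for k in [202001, 202002, 202003, 202004, 202005, 202006]:
    # only add the charges when both rows fall in the same month
    if data.Month.iloc[i] == data.Month.iloc[j]:
        # this writes the same scalar sum into every row of Total_bill
        data['Total_bill'] = data.Charge.iloc[i] + data.Charge.iloc[j]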
I am stuck here. Could anyone help me, or give me a lead, on how to avoid duplicate Ref_IDs
and create a loop over all the Ref_IDs
that respects the points mentioned above? If this can be achieved by assigning a unique ID, please advise.
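For the unique ID idea, a rough sketch of how it might look: shuffle the rows inside each month, give every two consecutive rows a shared Pair_ID, and sum Charge per Pair_ID. This assumes one row per Ref_ID per Month and columns named Ref_ID, Month and Charge. The names `add_total_bill` and `Pair_ID`, the `seed` parameter and the sample rows are hypothetical.

```python
import numpy as np
import pandas as pd


def add_total_bill(data, seed=None):
    """Add Pair_ID and Total_bill columns by pairing rows at random within each Month."""
    rng = np.random.default_rng(seed)
    out = data.copy()
    # A random sort key shuffles the rows; sorting by Month and then by the key
    # gives a random order inside every month.
    out["_key"] = rng.random(len(out))
    out = out.sort_values(["Month", "_key"])
    # Position within the month (0, 1, 2, ...) floor-divided by 2 gives 0, 0, 1, 1, ...
    slot = out.groupby("Month").cumcount() // 2
    out["Pair_ID"] = out["Month"].astype(str) + "-" + slot.astype(str)
    # A month with an odd number of rows leaves one row without a partner.
    complete = out.groupby("Pair_ID")["Charge"].transform("size") == 2
    # Sum the two charges sharing a Pair_ID and broadcast the sum to both rows.
    out["Total_bill"] = out.groupby("Pair_ID")["Charge"].transform("sum").where(complete)
    out["Pair_ID"] = out["Pair_ID"].where(complete)
    # Restore the original row order and drop the helper column.
    return out.drop(columns="_key").sort_index()


# Hypothetical sample rows; the real dataset is not reproduced here.
sample = pd.DataFrame({
    "Ref_ID": ["A", "B", "C", "D", "E", "A", "B"],
    "Month": [202001, 202001, 202001, 202001, 202001, 202002, 202002],
    "Charge": [10.0, 20.0, 30.0, 40.0, 50.0, 5.0, 7.0],
})
result = add_total_bill(sample, seed=1)
print(result)
```

Because every row gets exactly one position in the shuffled order, each Ref_ID lands in at most one pair per month, and unpaired rows are left with missing Pair_ID and Total_bill rather than a made-up partner.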
question from:
https://stackoverflow.com/questions/65902866/mapping-values-in-a-dataframe-with-random-valuesnon-repetitive