Hi, I have an Excel sheet with player names and dates. For example:
Column A=[1000 1000 1001 1001 1001 1002 1002 1002 1002]
Column B=[03/12/2009 03/12/2009 04/01/2011 05/01/2010 08/02/2011 10/03/2012 05/12/2010 07/02/2011 09/03/2012 14/02/2013]
For each player name, I want to calculate the length of time between that player's earliest and latest date. I tried to do this by loading the data into a pandas DataFrame and then building a dictionary, but it does not seem to work. There must be an easier way to do this, but I can't find it. This is what I have tried so far:
import pandas as pd
from datetime import datetime
from itertools import count
from collections import defaultdict
Player_Dates = pd.read_excel(r'C:UsersPycharmProjectsProject1Data.xlsx', sheet_name='Sheet 1', header=0, na_values=['NA'], usecols="B:C")
Player_Dates_new=Player_Dates.iloc[5:len(Player_Dates)]
Player_Dates_new.columns = ['Player_ID','Dates']
counts = {k: count(0) for k in Player_Dates_new.Player_ID.unique()}
d = defaultdict(dict)
for k, *v in Player_Dates_new.values.tolist():
    d[k][next(counts[k])] = v
dict(d)
print(d, Player_Dates_new)
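For reference, here is a minimal sketch of the result I am after, using a pandas groupby instead of a dictionary. It is not my actual data: the DataFrame below is a hypothetical sample built from the first nine values of each column in the example above, and it assumes the dates are day-first (dd/mm/yyyy). The names `df`, `grouped`, `span` and `span_days` are only for illustration.

```python
import pandas as pd

# Hypothetical sample modelled on the example columns (first nine values of each).
df = pd.DataFrame({
    "Player_ID": [1000, 1000, 1001, 1001, 1001, 1002, 1002, 1002, 1002],
    "Dates": ["03/12/2009", "03/12/2009", "04/01/2011", "05/01/2010", "08/02/2011",
              "10/03/2012", "05/12/2010", "07/02/2011", "09/03/2012"],
})

# Assumption: the dates are written day-first (dd/mm/yyyy).
df["Dates"] = pd.to_datetime(df["Dates"], format="%d/%m/%Y")

# For each player, subtract the earliest date from the latest date.
grouped = df.groupby("Player_ID")["Dates"]
span = grouped.max() - grouped.min()

# Convert the Timedelta values to whole days for easier reading.
span_days = span.dt.days
print(span_days)
```

Here `span` is a Series of Timedelta values indexed by Player_ID, so something like `span_days.to_dict()` would give the dictionary form if that is still needed.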
question from:
https://stackoverflow.com/questions/65846843/calculate-time-corresponding-to-rows-with-same-name-from-pandas-dataframe-in-py