I have a data frame that looks like this:
group  date                value
g_1    1/2/2019 11:03:00   3
g_1    1/2/2019 11:04:00   5
g_1    1/2/2019 10:03:32   100
g_2    4/3/2019 09:11:09   46
I want to calculate the time difference, in seconds, between consecutive occurrences within each group.
Example output:
groups_time_diff = {'g_1': [23, 5666, 7878], 'g_2': [0.2, 56, 2343], ...}
This is my code:
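from collections import defaultdict
from tqdm import tqdm

groups_time_diff = defaultdict(list)
for group in tqdm(groups):
    group_df = unit_df[unit_df['group'] == group]
    dates = list(group_df['time'])  # timestamp column (shown as 'date' in the sample above)
    # Repeatedly pull out the earliest remaining timestamp; min() and remove()
    # each scan the whole list, so this is quadratic in the group size.
    while len(dates) != 0:
        min_date = min(dates)
        dates.remove(min_date)
        if len(dates) > 0:
            second_min_date = min(dates)
            date_diff = second_min_date - min_date
            # total_seconds() keeps gaps longer than a day; .seconds would drop the days part
            groups_time_diff[group].append(date_diff.total_seconds())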
This takes forever to run, and I am looking for a more time-efficient way to get the desired output.
Any ideas?
Asked by anat (translated from SO)
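One possible direction, as a minimal sketch rather than a definitive answer: sort the frame once and let pandas take the consecutive differences inside each group, instead of calling min() and remove() in a Python loop. The sketch rebuilds the four sample rows from the question and makes two assumptions that the question does not confirm: the timestamp column is named 'date' (as in the sample, although the code reads 'time'), and the dates are in month/day/year order. The helper column gap_s is a name introduced here for illustration only.

```python
import pandas as pd

# Sample rows copied from the question.
unit_df = pd.DataFrame({
    "group": ["g_1", "g_1", "g_1", "g_2"],
    "date": ["1/2/2019 11:03:00", "1/2/2019 11:04:00",
             "1/2/2019 10:03:32", "4/3/2019 09:11:09"],
    "value": [3, 5, 100, 46],
})

# Assumption: month/day/year order; adjust the format string if it is day/month.
unit_df["date"] = pd.to_datetime(unit_df["date"], format="%m/%d/%Y %H:%M:%S")

# Sort once, then diff consecutive timestamps within each group.
sorted_df = unit_df.sort_values(["group", "date"])
sorted_df["gap_s"] = (
    sorted_df.groupby("group")["date"].diff().dt.total_seconds()
)

# The first row of every group has no predecessor (NaN), so drop it,
# then collect the remaining gaps into one list per group.
groups_time_diff = (
    sorted_df.dropna(subset=["gap_s"])
             .groupby("group")["gap_s"]
             .apply(list)
             .to_dict()
)
```

For the sample rows, g_1 ends up with gaps of 3568 and 60 seconds. g_2 has a single row and therefore no gap, so it gets no key, which matches what the original loop does with its defaultdict. The sort dominates the cost, so this should scale roughly as n log n instead of quadratically per group.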