Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Calculate time difference between all dates in a column

I have a data frame that looks like that:

group        date            value
 g_1  1/2/2019 11:03:00        3
 g_1  1/2/2019 11:04:00        5
 g_1  1/2/2019 10:03:32        100
 g_2  4/3/2019 09:11:09        46

I want to calculate the time difference between occurrences (in seconds) per group.

Example output:

groups_time_diff = {'g_1': [23, 5666, 7878], 'g_2': [0.2, 56, 2343], ...}

This is my code:

from collections import defaultdict
from tqdm import tqdm

groups_time_diff = defaultdict(list)
for group in tqdm(groups):
    group_df = unit_df[unit_df['group'] == group]
    dates = list(group_df['time'])
    # Repeatedly take the smallest remaining date and compare it with the next smallest
    while len(dates) != 0:
        min_date = min(dates)
        dates.remove(min_date)
        if len(dates) > 0:
            second_min_date = min(dates)
            date_diff = second_min_date - min_date
            groups_time_diff[group].append(date_diff.seconds)

This takes forever to run, and I am looking for a more time-efficient way to get the desired output.

Any ideas?

Asked by anat; translated from Stack Overflow

1 Answer

First, sort your dates.

Then take the difference between each date and the one before it:

dates = dates.sort_values()
# diff() pairs rows by position; subtracting two slices would align them on the index instead
diffs = dates.diff().dropna().dt.total_seconds()
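To get the dictionary from the question, the same sort-then-subtract idea can be applied to every group at once. The following is only a sketch under some assumptions: the column names `group` and `date` are taken from the printed table (the question's code reads a `time` column instead), the dates are assumed to be month/day/year, and the helper name `time_diffs_per_group` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "group": ["g_1", "g_1", "g_1", "g_2"],
    "date": ["1/2/2019 11:03:00", "1/2/2019 11:04:00",
             "1/2/2019 10:03:32", "4/3/2019 09:11:09"],
    "value": [3, 5, 100, 46],
})
# Assumption: month/day/year order; adjust the format if the data is day-first.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S")


def time_diffs_per_group(frame):
    """Return {group: [seconds between consecutive dates, in sorted order]}."""
    ordered = frame.sort_values(["group", "date"])
    # diff() is computed inside each group; the first row of a group has no
    # predecessor and becomes NaN after the conversion to seconds.
    seconds = ordered.groupby("group")["date"].diff().dt.total_seconds()
    ordered = ordered.assign(diff_seconds=seconds).dropna(subset=["diff_seconds"])
    # Start with an empty list so groups with a single row still appear.
    result = {g: [] for g in frame["group"].unique()}
    for g, s in ordered.groupby("group")["diff_seconds"]:
        result[g] = s.tolist()
    return result


groups_time_diff = time_diffs_per_group(df)
```

With the sample rows, `g_1` should get two gaps (from 10:03:32 to 11:03:00, then to 11:04:00) and `g_2`, which has a single row, an empty list.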

Your loop calls min twice in each iteration, and each call scans every remaining date, so the work grows quadratically with the number of dates. Sorting once avoids that.
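As a rough check that sorting once gives the same gaps as repeatedly taking the minimum, here is a small sketch using only the standard library. The function names are hypothetical, and the sample dates assume the question's values are month/day/year. Note that this version uses `total_seconds()` where the question's loop uses `.seconds`.

```python
from datetime import datetime


def diffs_by_repeated_min(dates):
    """Mirror of the question's loop: every min() and remove() scans the list."""
    remaining = list(dates)
    diffs = []
    while remaining:
        smallest = min(remaining)
        remaining.remove(smallest)
        if remaining:
            diffs.append((min(remaining) - smallest).total_seconds())
    return diffs


def diffs_by_sorting(dates):
    """Sort once, then subtract each date from its successor."""
    ordered = sorted(dates)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


sample = [
    datetime(2019, 1, 2, 11, 3, 0),
    datetime(2019, 1, 2, 11, 4, 0),
    datetime(2019, 1, 2, 10, 3, 32),
]
loop_result = diffs_by_repeated_min(sample)
sorted_result = diffs_by_sorting(sample)
```

Both functions should return the same list for the same input; only the amount of work differs. One related detail: `date_diff.seconds` holds only the seconds part of a timedelta and drops whole days, so `total_seconds()` is usually the safer choice when gaps can exceed 24 hours.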

Hope this helps.

