Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to use numpy diff for non out[i] = a[i+1] - a[i] differences?

I have data that looks like this: a type column and a column of timestamps that I want to subtract from each other.

import numpy as np
import pandas as pd
data = [["a",12],["a",13],["a",15],["b",32],["b",34],["b",37]]
df = pd.DataFrame(data)
df.columns = ['type', 'time']
df["diff"] = df.groupby("type")["time"].diff()
df

  type  time  diff
0    a    12   NaN
1    a    13   1.0
2    a    15   2.0
3    b    32   NaN
4    b    34   2.0
5    b    37   3.0

Instead of this default behaviour, I want to compare every timestamp (rows 1, 2 and 4, 5) to the first timestamp of its type group, so the diff in rows 2 and 5 should be 3.0 and 5.0. How can I do this? Thanks!

Question from: https://stackoverflow.com/questions/65892501/how-to-use-numpy-diff-for-non-outi-ai1-ai-differences


1 Answer

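I think using .cumsum() would suffice: within each group, the running sum of the consecutive differences equals the difference from the group's first value.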

import numpy as np
import pandas as pd
data = [["a",12],["a",13],["a",15],["b",32],["b",34],["b",37]]
df = pd.DataFrame(data)
df.columns = ['type', 'time']
df["diff"] = df.groupby("type")["time"].diff().fillna(0)
df["diff"] = df.groupby("type")["diff"].cumsum()
print(df)
>>>
  type  time  diff
0    a    12   0.0
1    a    13   1.0
2    a    15   3.0
3    b    32   0.0
4    b    34   2.0
5    b    37   5.0
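
Why the cumulative sum gives the right answer can be checked on a single group with plain NumPy. This is only a small sketch: the array holds the type "a" timestamps from the example, and the variable names are made up for illustration.

```python
import numpy as np

# Timestamps of a single group (type "a" from the example above).
a = np.array([12, 13, 15])

# np.diff gives the consecutive differences a[i+1] - a[i].
steps = np.diff(a)

# Their running sum telescopes to a[i] - a[0]; prepend 0 for the first element.
from_first = np.concatenate(([0], np.cumsum(steps)))

# Same result as subtracting the first element directly.
direct = a - a[0]

print(from_first.tolist())  # [0, 1, 3]
print(direct.tolist())      # [0, 1, 3]
```

So for one series, np.diff followed by np.cumsum is just a roundabout way of writing a - a[0]; the groupby versions apply the same idea per type.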

BTW, this code also works:

import numpy as np
import pandas as pd
data = [["a",12],["a",13],["a",15],["b",32],["b",34],["b",37]]
df = pd.DataFrame(data)
df.columns = ['type', 'time']
df["diff"] = df['time'] - df.groupby("type")["time"].transform('first')
print(df)
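
To see that the two approaches agree, here is a small side-by-side check on the same data. It is a sketch for comparison only; the names step, via_cumsum and via_first are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["a", 12], ["a", 13], ["a", 15], ["b", 32], ["b", 34], ["b", 37]],
    columns=["type", "time"],
)

# Approach 1: per-group consecutive differences, then a per-group running sum.
step = df.groupby("type")["time"].diff().fillna(0)
via_cumsum = step.groupby(df["type"]).cumsum()

# Approach 2: subtract each group's first timestamp directly.
via_first = df["time"] - df.groupby("type")["time"].transform("first")

print(via_cumsum.tolist())  # [0.0, 1.0, 3.0, 0.0, 2.0, 5.0]
print(via_first.tolist())   # [0, 1, 3, 0, 2, 5]
```

The values match. One difference is the dtype: the diff/cumsum route produces floats because diff() introduces NaN, while the transform('first') route keeps the integer dtype of the time column.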
