If the static data isn't guaranteed to have sequential dates, etc. you can use a groupby
to the last record that matches the day of week and month. If there isn't any data matching that day of week, this will give a KeyError
.
Here is your initial df:
df = pd.DataFrame({'Date': {0: pd.to_datetime('2021-01-01'),
1: pd.to_datetime('2021-01-02'),
2: pd.to_datetime('2021-01-03'),
3: pd.to_datetime('2021-01-04'),
4: pd.to_datetime('2021-01-05'),
5: pd.to_datetime('2021-01-06'),
6: pd.to_datetime('2021-01-07'),
7: pd.to_datetime('2021-01-08')},
'Day': {0: 'Fri', 1: 'Sat', 2: 'Sun', 3: 'Mon',
4: 'Tue', 5: 'Wed', 6: 'Thu', 7: 'Fri'},
'Temp': {0: 20, 1: 17, 2: 17, 3: 16, 4: 20,
5: 16, 6: 21, 7: 19}})
If you want to roll off the first record, you can do that multiple ways... this uses shift()
, although that will make the df['Temp']
column a float because of the NaN, if that matters.
df = df.shift(-1).dropna()
The next part builds lookup tables from your static data.
- The first is indexed by month and day_of_week, and retains the most recent value of each pair -- I think that is what you wanted. If you really did want the oldest date, just use
first
instead of last
.
- The second one,
lookup2
, drops the month index and gives you data to fall back to to get the most recent matching day_of_week, ignoring the month.
Code:
lookup = df.groupby([df['Date'].dt.month, df['Date'].dt.day_of_week]).last()
lookup2 = lookup.droplevel(0)
lookup2 = lookup2[~lookup2.index.duplicated(keep='last')]
The code to insert records is below. I enclosed it in a loop that adds the next day each time so you can test it adding multiple days in a row:
ndays = 5
for i in range(ndays):
next_date = df.iloc[-1]['Date'] + pd.Timedelta("1D")
try:
temp = lookup.loc[(next_date.month, next_date.day_of_week), 'Temp']
except KeyError:
temp = lookup2.loc[next_date.day_of_week, 'Temp']
df = df.append( {'Date' : next_date, 'Day' : next_date.strftime("%a"), 'Temp' : temp}, ignore_index=True)
print(df)
Date Day Temp
0 2021-01-02 Sat 17.0
1 2021-01-03 Sun 17.0
2 2021-01-04 Mon 16.0
3 2021-01-05 Tue 20.0
4 2021-01-06 Wed 16.0
5 2021-01-07 Thu 21.0
6 2021-01-08 Fri 19.0
7 2021-01-09 Sat 17.0
8 2021-01-10 Sun 17.0
9 2021-01-11 Mon 16.0
10 2021-01-12 Tue 20.0
11 2021-01-13 Wed 16.0
In any case, hopefully this is at least on the right track for what you asked, and you can modify it to suit your needs.