Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

lubridate - Check if a date is within an interval in R

I have these three intervals defined:

YEAR_1  <- interval(ymd('2002-09-01'), ymd('2003-08-31'))
YEAR_2  <- interval(ymd('2003-09-01'), ymd('2004-08-31')) 
YEAR_3  <- interval(ymd('2004-09-01'), ymd('2005-08-31'))

(in real life, I have 50 of these)

I have a dataframe (called df) with a column full of lubridate formatted dates.

I'd like to append a new column on df which has the appropriate value YEAR_n, depending on which interval the date falls within.

Something like:

df$YR <- ifelse(df$DATE %within% YEAR_1, 1, NA)

but I'm not sure how to proceed. I think I need to use some kind of apply?

Here's the DATE column of my dataframe (a POSIXct vector):

structure(c(1055289600, 1092182400, 1086220800, 1074556800, 1109289600, 
1041897600, 1069200000, 1047427200, 1072656000, 1048636800, 1092873600, 
1090195200, 1051574400, 1052179200, 1130371200, 1242777600, 1140652800, 
1137974400, 1045526400, 1111104000, 1073952000, 1052870400, 1087948800, 
1053993600, 1039564800, 1141603200, 1074038400, 1105315200, 1060560000, 
1072051200, 1046217600, 1107129600, 1088553600, 1071619200, 1115596800, 
1050364800, 1147046400, 1083628800, 1056412800, 1159747200, 1087257600, 
1201478400, 1120521600, 1066176000, 1034553600, 1057622400, 1078876800, 
1010880000, 1133913600, 1098230400, 1170806400, 1037318400, 1070409600, 
1091577600, 1057708800, 1182556800, 1091059200, 1058227200, 1061337600, 
1034121600, 1067644800, 1039478400, 1022198400, 1063065600, 1096329600, 
1049760000, 1081728000, 1016150400, 1029801600, 1059350400, 1087257600, 
1181692800, 1310947200, 1125446400, 1057104000, NA, 1085529600, 
1037664000, 1091577600, 1080518400, 1110758400, 1092787200, 1094601600, 
1169424000, 1232582400, 1058918400, 1021420800, 1133136000, 1030320000, 
1060732800, 1035244800, 1090800000, 1129161600, 1055808000, 1060646400, 
1028678400, 1075852800, 1144627200, 1111363200, 1070236800), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")

1 Answer

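You can use walk from the purrr package for this. Initialise df$YR first; the <<- is needed because the assignment happens inside the anonymous function:

df$YR <- NA_integer_
purrr::walk(1:3, ~(df$YR[as.POSIXlt(df$DATE) %within% get(paste0("YEAR_", .))] <<- .))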

Or write a plain loop, which is easier to read (unless loops are taboo for you):

df$YR <- NA
for(i in 1:3){
  interval <- get(paste0("YEAR_", i))
  index <- which(as.POSIXlt(df$DATE) %within% interval)
  df$YR[index] <- i
}
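
With 50 intervals, keeping 50 separate YEAR_n variables and looking them up with get() gets unwieldy. If the intervals are contiguous and non-overlapping, as the three in the question are, one alternative is to keep only the start dates as sorted breakpoints and let base R's findInterval() do the lookup. This is a rough sketch, not the approach from the answer above: the small df and the breaks vector are made-up stand-ins, and it treats the whole last day of each year as inside the year, whereas the lubridate intervals in the question end at midnight on '08-31'.

```r
# Hypothetical sample data standing in for the df$DATE column from the question.
df <- data.frame(
  DATE = as.POSIXct(c("2002-10-15", "2003-09-01", "2005-08-31", "2010-01-01", NA),
                    tz = "UTC")
)

# Start of each year, plus one closing boundary (the day after YEAR_3 ends).
# Assumption: the years are contiguous, so each start is also the previous year's end.
breaks <- as.POSIXct(c("2002-09-01", "2003-09-01", "2004-09-01", "2005-09-01"),
                     tz = "UTC")

# findInterval() gives the index of the last break <= each date:
# 0 means before the first break, length(breaks) means on or after the last one.
yr <- findInterval(as.numeric(df$DATE), as.numeric(breaks))

# Dates outside every year get NA; NA dates are already NA.
yr[yr %in% c(0L, length(breaks))] <- NA
df$YR <- yr
```

This is vectorised over the whole column, so there is no loop or apply. If the real intervals have gaps or overlap, this shortcut does not apply and the loop over intervals is the safer choice.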
