data.table - R- How can I update column values only pertaining to rows where column contains beginning of a certain number?

Question

Welcome To Ask or Share your Answers For Others

data.table - R- How can I update column values only pertaining to rows where column contains beginning of a certain number?

asked Feb 5, 2021 in Technique[技术] by 深蓝 (71.8m points)

data.table - R- How can I update column values only pertaining to rows where column contains beginning of a certain number?

I have a dataset that looks as follows:

ex <- data.frame("id" =c(rep(1234,6)),
                 "case" = c(45,45,45,92,92,93),
                          "cpt" = c(20600,25263,8456,2345,9023,"C1245"),
                         "date"=c("2019-05-21", "2019-08-22","2019-04-12",
                                  "2019-03-01", "2019-02-18", "2019-03-12"),
                         "window"=c(10,20,NA,45,NA,NA),
                         "pay"=c(520,140,2200,230,600,700))

I essentially have two questions: How can I update the cpt column with the cpt value corresponding to the row with the highest pay grouped by case only focusing on any row containing a cpt value beginning with number 2 AND following this, how I can update the window column to replace any value that isn't the maximum with NA (also only focusing on rows where cpt value starts with 2)

I would like the final output to look like so:

ex1 <- data.frame("id" =c(rep(1234,6)),
                 "case" = c(45,45,45,92,92,93),
                 "cpt" = c(20600,20600,20600,2345,2345,"C1245"),
                 "date"=c("2019-05-21", "2019-08-22","2019-04-12",
                          "2019-03-01", "2019-02-18", "2019-03-12"),
                 "window"=c(10,NA,NA,45,NA,NA),
                 "pay"=c(520,140,2200,230,600,700))

I have two codes

ex[, cpt:=cpt[which.max(pay)], by=case]

ex %>% group_by(case) %>% mutate(window=replace(window, (pay < max(pay)), NA)) %>%
  setDT()

but both of these codes apply to the overall max value and will produce

ex2 <- data.frame("id" =c(rep(1234,6)),
                  "case" = c(45,45,45,92,92,93),
                  "cpt" = c(8456,8456,8456,9023,9023,"C1245"),
                  "date"=c("2019-05-21", "2019-08-22","2019-04-12",
                           "2019-03-01", "2019-02-18", "2019-03-12"),
                  "window"=c(NA,NA,NA,NA,NA,NA),
                  "pay"=c(520,140,2200,230,600,700))

Is there any way I could alter the codes I have to get the output I'm looking for? Or any other code that might be more efficient? Thanks in advance!

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-02-05T04:23:46+0000

We create a logical vector (i1) based on the first character of 'cpt' (substr - as the OP is interested in those elements that start with "2"), grouped by 'id', 'case', if there are any 'i1', then use that to subset the 'pay' get the index of the 'max' element with which.max and use that to subset the subset of 'cpt' (cpt[i1]) or else return the original column 'cpt', then we replace the 'window' where the 'cpt' is not the same as the duplicate column 'cpt1' (original column) to NA and ungroup

library(dplyr)
ex %>% 
  mutate(cpt1 = cpt) %>%
  group_by(id, case) %>% 
  mutate(i1 = substr(cpt, 1, 1) ==  "2", 
     cpt = if(any(i1)) cpt[i1][which.max(pay[i1])] else cpt,
     window = replace(window, cpt != cpt1, NA),
           i1 = NULL, cpt1 = NULL) %>%
    ungroup

-output

# A tibble: 6 x 6
#     id  case cpt   date       window   pay
#  <dbl> <dbl> <chr> <chr>       <dbl> <dbl>
#1  1234    45 20600 2019-05-21     10   520
#2  1234    45 20600 2019-08-22     NA   140
#3  1234    45 20600 2019-04-12     NA  2200
#4  1234    92 2345  2019-03-01     45   230
#5  1234    92 2345  2019-02-18     NA   600
#6  1234    93 C1245 2019-03-12     NA   700

Categories

data.table - R- How can I update column values only pertaining to rows where column contains beginning of a certain number?

data.table - R- How can I update column values only pertaining to rows where column contains beginning of a certain number?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags