I'm trying to use dynamic variables in a for loop to access a table name. Other questions on SO seem to be about using dynamic variables to access a column name. I'm using R v4.0.3 and dplyr v1.0.2.
Basically, I'm importing from a .sav (SPSS) file and trying to split its 400+ columns into smaller data frames, one per survey question. That part works, but I then want to do things like add a mean column to each new data frame. I'm currently trying to do that inside the for loop that does the splitting, but I can't get it to work. (I'd also be happy to do it separately for each new data frame, in another for loop or with a list or something, but I can't see how that would work either if I can't get this one to work!)
Simplifying somewhat, the columns in the original file are named QX.Y_Z, where Z is the item within question Y of block X.
Some dummy data, setting up a (sav-type) dataframe of 2 questions, each with two items:
library(tidyverse)  # tibble(), select(), mutate(), separate(), %>%
library(labelled)   # var_label(), val_labels()

mydata <- tibble(
  Q6.1_1_1 = as.numeric(c(2, 1, 3, 1, 2, 3, 1, 3, 2, 1, 1, 1, 2, 2)),
  Q6.1_1_2 = as.numeric(c(1, 3, 1, 1, 1, 2, 3, 3, 1, 3, 1, 1, 1, 2)),
  Q7.1_1_1 = as.numeric(c(1, 2, 1, 2, 1, 3, 3, 1, 2, 3, 2, 1, 3, 2)),
  Q7.1_1_2 = as.numeric(c(3, 1, 3, 1, 2, 1, 3, 2, 3, 1, 3, 1, 1, 3))
)
var_label(mydata$Q6.1_1_1)<-"Rate your effort - before."
var_label(mydata$Q6.1_1_2)<-"Rate your effort - before."
var_label(mydata$Q7.1_1_1)<-"Rate your enthusiasm - before."
var_label(mydata$Q7.1_1_2)<-"Rate your enthusiasm - after."
val_labels(mydata$Q6.1_1_1)<-c(Low=1, Medium=2, High=3)
val_labels(mydata$Q6.1_1_2)<-c(Low=1, Medium=2, High=3)
val_labels(mydata$Q7.1_1_1)<-c(Low=1, Medium=2, High=3)
val_labels(mydata$Q7.1_1_2)<-c(Low=1, Medium=2, High=3)
mydata
# A tibble: 14 x 4
   Q6.1_1_1   Q6.1_1_2   Q7.1_1_1   Q7.1_1_2
   <dbl+lbl>  <dbl+lbl>  <dbl+lbl>  <dbl+lbl>
 1 2 [Medium] 1 [Low]    1 [Low]    3 [High]
 2 1 [Low]    3 [High]   2 [Medium] 1 [Low]
 3 3 [High]   1 [Low]    1 [Low]    3 [High]
 4 1 [Low]    1 [Low]    2 [Medium] 1 [Low]
 5 2 [Medium] 1 [Low]    1 [Low]    2 [Medium]
 6 3 [High]   2 [Medium] 3 [High]   1 [Low]
 7 1 [Low]    3 [High]   3 [High]   3 [High]
 8 3 [High]   3 [High]   1 [Low]    2 [Medium]
 9 2 [Medium] 1 [Low]    2 [Medium] 3 [High]
10 1 [Low]    3 [High]   3 [High]   1 [Low]
11 1 [Low]    1 [Low]    2 [Medium] 3 [High]
12 1 [Low]    1 [Low]    1 [Low]    1 [Low]
13 2 [Medium] 1 [Low]    3 [High]   1 [Low]
14 2 [Medium] 2 [Medium] 2 [Medium] 3 [High]
Remove the item number from the question string:
varlist <- mydata %>%
  colnames() %>%
  as_tibble() %>%
  separate(value, "qno", sep = "_", extra = "drop", fill = "right") %>%
  unique() %>%
  pull()
> varlist
[1] "Q6.1" "Q7.1"
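For comparison, the same stems can probably be pulled out with base R alone. This is only a sketch under the assumption that every column name has the form stem, underscore, item suffix; the small `mydata` here is a hypothetical stand-in for the real import, with plain numeric columns and no labels.

```r
# Hypothetical stand-in for the imported data (no SPSS labels, fewer rows).
mydata <- data.frame(
  Q6.1_1_1 = c(2, 1, 3),
  Q6.1_1_2 = c(1, 3, 1),
  Q7.1_1_1 = c(1, 2, 1),
  Q7.1_1_2 = c(3, 1, 3)
)

# Drop everything from the first underscore onwards, then de-duplicate.
varlist <- unique(sub("_.*$", "", colnames(mydata)))
varlist
```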
Generate the subtables:
for (v in varlist) {
  assign(paste0("table", v), select(mydata, matches(v)))
}
This gives me subtables called tableQ6.1 and tableQ7.1. So far, so good.
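As a point of comparison, the same split could be kept in a single named list rather than in separately assigned objects. The sketch below is an illustration, not a tested recipe for the real file: `mydata` is a cut-down hypothetical stand-in without labels, `tables` is a name made up here, and `starts_with()` is swapped in for `matches()` because the "." in a stem like "Q6.1" acts as a regex wildcard inside `matches()`.

```r
library(dplyr)

# Hypothetical stand-in for the imported data (no SPSS labels, fewer rows).
mydata <- tibble(
  Q6.1_1_1 = c(2, 1, 3),
  Q6.1_1_2 = c(1, 3, 1),
  Q7.1_1_1 = c(1, 2, 1),
  Q7.1_1_2 = c(3, 1, 3)
)
varlist <- c("Q6.1", "Q7.1")

# One list element per question, holding only that question's item columns.
tables <- lapply(varlist, function(v) {
  select(mydata, starts_with(paste0(v, "_")))
})
names(tables) <- paste0("table", varlist)

# Each subtable is then reachable by its string name.
tables[["tableQ6.1"]]
```

With a list, the name of a subtable is just a string index, so later steps can loop over `names(tables)` or use `lapply()` without any dynamic object lookup.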
However, when I try to add a mean column (giving the mean of each row) for each of these subtables as they are generated, I can't find a way to tell mutate() to use the dynamic name of the table. These are a couple of the options I've tried, but all I get (with these and a LOT more) is errors, so I must be missing something obvious:
for (v in varlist) {
  assign(paste0("table", v), select(mydata, matches(v)))
  tabname<-sym(paste0("table", v))
  mutate({{tabname}}, mean=rowMeans(across(where(is.numeric)), na.rm = FALSE))
}

for (v in varlist) {
  assign(paste0("table", v), select(mydata, matches(v)))
  tabname<-"table{v}" %>%
  mutate("mean{v}":=rowMeans(across(where(is.numeric)), na.rm = FALSE))
}
Any guidance (including wider comments on whether this is the best approach) would be welcome!
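One possible direction, sketched under assumptions rather than verified against the real .sav data: `{{ }}` and glue-style names such as "mean{v}" are meant for column names inside dplyr verbs, whereas the table passed to `mutate()` is an ordinary R object. That suggests building each subtable under a fixed temporary name, adding the mean, and only then calling `assign()`. Below, `mydata` is a hypothetical unlabelled stand-in, `tab` and `tabname` are illustrative names, and `starts_with()` replaces `matches()` to avoid treating "." as a wildcard.

```r
library(dplyr)

# Hypothetical stand-in for the imported data (no SPSS labels, fewer rows).
mydata <- tibble(
  Q6.1_1_1 = c(2, 1, 3),
  Q6.1_1_2 = c(1, 3, 1),
  Q7.1_1_1 = c(1, 2, 1),
  Q7.1_1_2 = c(3, 1, 3)
)
varlist <- c("Q6.1", "Q7.1")

for (v in varlist) {
  tabname <- paste0("table", v)

  # Work on the subtable under a fixed name first...
  tab <- select(mydata, starts_with(paste0(v, "_")))
  tab <- mutate(tab, mean = rowMeans(tab, na.rm = FALSE))

  # ...and only bind it to the dynamic name at the end.
  assign(tabname, tab)
}

# A table that already exists can be fetched by its string name with get().
get("tableQ6.1")
```

If the subtables are created in one loop and modified in another, the second loop can fetch each one with `get(tabname)`, change it, and write it back with `assign(tabname, ...)`.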
question from:
https://stackoverflow.com/questions/65644014/r-dplyr-using-dynamic-variables-to-access-a-table-name