Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Transforming numerical variables in a dataframe with a function

Background

This is an attempt to improve a previous question. The idea is to create a function to which I pass a dataframe and, optionally, a vector of variable names. The function then iterates over the variables in the dataframe and transforms those that are numeric. If the vector of names is also passed, only the variables in that vector are transformed.

Tools used

  • In order to create an "optional" argument I used the missing() function. Source.

  • The syntax for iterating over the vector was inspired by this discussion here.
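
A minimal sketch of the `missing()` pattern mentioned above. The function name `describe_arg` and its arguments are hypothetical, made up only for this illustration; `missing()` itself is base R and reports whether the caller supplied a given argument.

```r
# `describe_arg` is a hypothetical helper, not part of the code in question.
describe_arg <- function(x, vars) {
  if (missing(vars)) {
    # The caller did not pass `vars` at all.
    "vars not supplied"
  } else {
    # The caller passed `vars`; show what was received.
    paste("vars supplied:", paste(vars, collapse = ", "))
  }
}

describe_arg(1)
describe_arg(1, c("a", "b"))
```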

Code & where I am stuck:

transformDivideThousand <- function(data_frame, listofvars){
    if (missing(listofvars)) {
        data_frame[, sapply(data_frame, is.numeric)] =
        data_frame[, sapply(data_frame, is.numeric)]/1000
    } else {
        for (i in names(data_frame)) {
            for (i in listofvars) {
                data_frame[[i]]<-data_frame[[i]]/1000
            }
        }
    }
    return(data_frame)
}

The call would look like:

test <- transformDivideThousand(cases, c("col2", "col3", "col15"))

Question

  • What am I getting wrong in this code? I managed to make the optional argument work, but when I test the function the variables from the list are converted to zeros.

Cautionary suggestion

  • If you down-vote the question, at the very least explain why!

1 Answer

You may do:

# data
 head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

The function:

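foo_divide <- function(x, y){
  # divide a column by 1000 if it is numeric, otherwise return it unchanged
  foo <- function(col) if (is.numeric(col)) col / 1000 else col
  if (missing(y)) y <- seq_along(x) # default to all columns when y is not given
  # x[y] (not x[, y]) keeps a data frame even when y selects a single column
  x[y] <- lapply(x[y], foo)
  x # return the modified data frame
}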

Without listofvars:

head(foo_divide(iris))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1       0.0051      0.0035       0.0014       2e-04  setosa
2       0.0049      0.0030       0.0014       2e-04  setosa
3       0.0047      0.0032       0.0013       2e-04  setosa
4       0.0046      0.0031       0.0015       2e-04  setosa
5       0.0050      0.0036       0.0014       2e-04  setosa
6       0.0054      0.0039       0.0017       4e-04  setosa

With listofvars:

 head(foo_divide(iris, c("Petal.Length", "Petal.Width", "Species")))
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5       0.0014       2e-04  setosa
2          4.9         3.0       0.0014       2e-04  setosa
3          4.7         3.2       0.0013       2e-04  setosa
4          4.6         3.1       0.0015       2e-04  setosa
5          5.0         3.6       0.0014       2e-04  setosa
6          5.4         3.9       0.0017       4e-04  setosa

You can also use a numeric vector to specify the columns:

foo_divide(iris, 1:3)
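
As for why the function in the question produces zeros: the outer loop `for (i in names(data_frame))` runs once per column, and the inner loop `for (i in listofvars)` runs in full on every one of those passes. Each listed column is therefore divided by 1000 as many times as the data frame has columns, so with 15 or more columns the values shrink toward zero. The following is a sketch of one possible fix that keeps the original structure and simply drops the outer loop; the `is.numeric` check in the loop and the use of `iris` as test data are additions for illustration, not part of the original code.

```r
# Sketch: the question's function with the redundant outer loop removed.
transformDivideThousand <- function(data_frame, listofvars) {
  if (missing(listofvars)) {
    # No names given: scale every numeric column once.
    num_cols <- vapply(data_frame, is.numeric, logical(1))
    data_frame[num_cols] <- lapply(data_frame[num_cols],
                                   function(col) col / 1000)
  } else {
    # Names given: scale each listed column exactly once,
    # skipping any listed column that is not numeric (added assumption).
    for (v in listofvars) {
      if (is.numeric(data_frame[[v]])) {
        data_frame[[v]] <- data_frame[[v]] / 1000
      }
    }
  }
  data_frame
}

# Example call on the built-in iris data; Species is a factor and is left as is.
res <- transformDivideThousand(iris, c("Petal.Length", "Species"))
head(res)
```

Using a different loop variable name (`v` here) also avoids the confusion of reusing `i` in two nested loops.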
