I have a dataset in which the values are collapsed, so each row holds multiple comma-separated values in a single column.
For example:
Gene Score1
Gene1 NA, NA, NA, 0.03, -0.3
Gene2 NA, 0.2, 0.1, ., .
I want to create two new columns holding the minimum and maximum of the values in each row of that column. In reality I have 70 such columns, so I wrote code to compute all the min and max columns at once:
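library(data.table)
library(stringr)

get_range <- function(x) {
  # Split each string on commas into a character matrix, then convert types
  x <- type.convert(str_split(x, ",\\s+", simplify = TRUE), na.strings = ".")
  # For each row, drop NAs and take the range (an NA pair if nothing is left)
  x <- t(apply(x, 1L, function(i) {
    i <- i[!is.na(i)]
    if (length(i) < 1L) c(NA_real_, NA_real_) else range(i)
  }))
  dimnames(x)[[2L]] <- c("min", "max")
  x
}
dt <- dt[, c(Gene = .(Gene), lapply(.SD, get_range)), .SDcols = -"Gene"]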
However, the min and max columns this code produces look like this:
Gene Score1.min Score1.max
Gene1 1 5
Gene2 3 5
The output I expect is:
Gene Score1.min Score1.max
Gene1 -0.3 0.03
Gene2 0.1 0.2
These values are nothing like the ones I started with, and I have no idea how my code produces them. Is something in my code causing the values to no longer be treated as the numbers they originally were?
Input data:
structure(list(Gene = c("Gene1", "Gene2"), Score1 = c("NA, NA, NA, 0.03, -0.3",
"NA, 0.2, 0.1, ., .")), row.names = c(NA, -2L), class = c("data.table",
"data.frame"))
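One possible explanation, sketched in base R: this assumes the cause is that na.strings = "." replaces the default "NA" marker, so the literal "NA" strings survive as text and the whole column is converted to a factor. The names parts, as_factor, values and ranges are introduced here for illustration only, and base strsplit stands in for str_split. It is a hedged diagnosis, not a confirmed one.

```r
# Sample strings copied from the Score1 column above
scores <- c("NA, NA, NA, 0.03, -0.3", "NA, 0.2, 0.1, ., .")
parts <- do.call(rbind, strsplit(scores, ",\\s+"))  # 2 x 5 character matrix

# With na.strings = ".", only "." becomes NA; "NA" stays a plain string,
# so the data are not all numeric and type.convert() builds a factor.
as_factor <- type.convert(as.vector(parts), na.strings = ".", as.is = FALSE)
is.factor(as_factor)      # a factor, not a numeric vector
levels(as_factor)         # includes "NA" as an ordinary level
as.integer(as_factor)     # integer level codes, which range() would then see

# Listing both missing-value markers and setting as.is = TRUE keeps numbers.
values <- type.convert(as.vector(parts), na.strings = c("NA", "."), as.is = TRUE)
dim(values) <- dim(parts)  # restore the row/column layout
ranges <- t(apply(values, 1L, range, na.rm = TRUE))
colnames(ranges) <- c("min", "max")
ranges
```

If this assumption holds, the small integers in the output would be factor level codes rather than the original scores. Rows containing no numeric value at all would still need the length check from get_range, since range() with na.rm = TRUE on an empty vector warns and returns infinite values.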
question from:
https://stackoverflow.com/questions/65940361/how-to-get-the-min-and-max-values-of-a-column